Dataflow job fails to run when the templateLocation parameter is set (Google Cloud Platform)
When I pass the staging, temp, and output GCS bucket locations as parameters, the Dataflow job fails with the exception below. Java code:
final String[] used = Arrays.copyOf(args, args.length + 1);
used[used.length - 1] = "--project=OVERWRITTEN";
final T options = PipelineOptionsFactory.fromArgs(used).withValidation().as(clazz);
options.setProject(PROJECT_ID);
options.setStagingLocation("gs://abc/staging/");
options.setTempLocation("gs://abc/temp");
options.setRunner(DataflowRunner.class);
options.setGcpTempLocation("gs://abc");
Error:
INFO: Staging pipeline description to gs://ups-heat-dev-tmp/mniazstaging_ingest_validation/staging/
May 10, 2018 11:56:35 AM org.apache.beam.runners.dataflow.util.PackageUtil tryStagePackage
INFO: Uploading <42088 bytes, hash E7urYrjAOjwy6_5H-UoUxA> to gs://ups-heat-dev-tmp/mniazstaging_ingest_validation/staging/pipeline-E7urYrjAOjwy6_5H-UoUxA.pb
Dataflow SDK version: 2.4.0
May 10, 2018 11:56:38 AM org.apache.beam.runners.dataflow.DataflowRunner run
INFO: Printed job specification to gs://ups-heat-dev-tmp/mniazstaging_ingest_validation/templates/DataValidationPipeline
May 10, 2018 11:56:40 AM org.apache.beam.runners.dataflow.DataflowRunner run
INFO: Template successfully created.
Exception in thread "main" java.lang.NullPointerException
at org.apache.beam.runners.dataflow.DataflowPipelineJob.getJobWithRetries(DataflowPipelineJob.java:501)
at org.apache.beam.runners.dataflow.DataflowPipelineJob.getStateWithRetries(DataflowPipelineJob.java:477)
at org.apache.beam.runners.dataflow.DataflowPipelineJob.waitUntilFinish(DataflowPipelineJob.java:312)
at org.apache.beam.runners.dataflow.DataflowPipelineJob.waitUntilFinish(DataflowPipelineJob.java:248)
at org.apache.beam.runners.dataflow.DataflowPipelineJob.waitUntilFinish(DataflowPipelineJob.java:202)
at org.apache.beam.runners.dataflow.DataflowPipelineJob.waitUntilFinish(DataflowPipelineJob.java:195)
at com.example.DataValidationPipeline.main(DataValidationPipeline.java:66)
I was facing the same issue; the error was thrown at p.run().waitUntilFinish(). I first tried the following code:
PipelineResult result = p.run();
System.out.println(result.getState().hasReplacementJob());
result.waitUntilFinish();
This throws the following exception:
java.lang.UnsupportedOperationException: The result of template creation should not be used.
at org.apache.beam.runners.dataflow.util.DataflowTemplateJob.getState(DataflowTemplateJob.java:67)
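The exception message suggests guarding the wait with a type check rather than catching the exception. A minimal sketch of that guard pattern, using stand-in classes instead of the real Beam types (in actual code the check would be against org.apache.beam.runners.dataflow.util.DataflowTemplateJob, and PipelineResult would come from the Beam SDK):

```java
// Stand-in for org.apache.beam.sdk.PipelineResult (hypothetical, for illustration).
interface PipelineResult {
    void waitUntilFinish();
}

// Stand-in mirroring Beam's DataflowTemplateJob: when --templateLocation is set,
// run() only stages a template, and the returned job object refuses to be used.
class DataflowTemplateJob implements PipelineResult {
    public void waitUntilFinish() {
        throw new UnsupportedOperationException(
            "The result of template creation should not be used.");
    }
}

// Stand-in for a normally launched Dataflow job.
class RunningJob implements PipelineResult {
    public void waitUntilFinish() {
        // A real job would block here until the Dataflow job completes.
    }
}

public class TemplateGuard {
    static void finishIfRealJob(PipelineResult result) {
        // Only wait when run() actually launched a job, not when it
        // merely wrote a template to the templateLocation bucket.
        if (!(result instanceof DataflowTemplateJob)) {
            result.waitUntilFinish();
        }
    }

    public static void main(String[] args) {
        finishIfRealJob(new DataflowTemplateJob()); // skipped, no exception
        finishIfRealJob(new RunningJob());          // waits as usual
    }
}
```

With the real Beam classes the same one-line instanceof check avoids both the NullPointerException from the question and the UnsupportedOperationException above.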
Then, to work around this, I used the following code:
PipelineResult result = pipeline.run();
try {
result.getState();
result.waitUntilFinish();
} catch (UnsupportedOperationException e) {
// do nothing
} catch (Exception e) {
e.printStackTrace();
}
I was running into java.lang.UnsupportedOperationException: The result of template creation should not be used.
Ran into this today as well; I tried to fix it by first checking whether the job is of type DataflowTemplateJob:
val (sc, args) = ContextAndArgs(cmdlineArgs)
// ...
val result = sc.run()
if (!result.isInstanceOf[DataflowTemplateJob]) result.waitUntilFinish()
I think this should work for bare Java jobs, but if you use Scio the result is some anonymous type, so in the end I also had to fall back to the try-catch version:
try {
  val result = sc.run().waitUntilFinish()
} catch {
  case _: UnsupportedOperationException => // this happens during template creation
}
Would you mind updating this with the full command line you used?
Running from Eclipse, with the parameters set in code: final String[] used = Arrays.copyOf(args, args.length + 1); used[used.length - 1] = "--project=OVERWRITTEN"; final T options = PipelineOptionsFactory.fromArgs(used).withValidation().as(clazz); options.setProject(PROJECT_ID); options.setStagingLocation("gs://abc/staging/"); options.setTempLocation("gs://abc/temp"); options.setRunner(DataflowRunner.class); options.setGcpTempLocation("gs://abc");
Hi, would you mind providing more context? Could you share the full code and the pom.xml file so we can see the versions of the dependencies you use?
@MohammedNiaz - Hi, was the problem solved? If so, could you share the solution?