Warning: file_get_contents(/data/phpspider/zhask/data//catemap/7/user-interface/2.json): failed to open stream: No such file or directory in /data/phpspider/zhask/libs/function.php on line 167

Warning: Invalid argument supplied for foreach() in /data/phpspider/zhask/libs/tag.function.php on line 1116

Notice: Undefined index: in /data/phpspider/zhask/libs/function.php on line 180

Warning: array_chunk() expects parameter 1 to be array, null given in /data/phpspider/zhask/libs/function.php on line 181
Google cloud dataflow 当使用InProcessPipelineRunner执行时,PubsubReader因NullPointerException而失败_Google Cloud Dataflow - Fatal编程技术网

Google cloud dataflow 当使用InProcessPipelineRunner执行时,PubsubReader因NullPointerException而失败

Google cloud dataflow 当使用InProcessPipelineRunner执行时,PubsubReader因NullPointerException而失败,google-cloud-dataflow,Google Cloud Dataflow,我有一个简单的管道,它只执行读取PubsubIO.Read.subscription。在消耗大约200个元素后,每次运行都会失败,但以下情况除外: [error] (run-main-0) java.lang.RuntimeException: java.lang.NullPointerException java.lang.RuntimeException: java.lang.NullPointerException at com.google.cloud.dataflow.sdk

我有一个简单的管道,它只执行读取PubsubIO.Read.subscription。在消耗大约200个元素后,每次运行都会失败,但以下情况除外:

[error] (run-main-0) java.lang.RuntimeException: java.lang.NullPointerException
java.lang.RuntimeException: java.lang.NullPointerException
    at  com.google.cloud.dataflow.sdk.runners.inprocess.InProcessPipelineRunner.run(InProcessPipelineRunner.java:281)
    at com.google.cloud.dataflow.sdk.runners.inprocess.InProcessPipelineRunner.run(InProcessPipelineRunner.java:69)
    at com.google.cloud.dataflow.sdk.Pipeline.run(Pipeline.java:181)
    at com.sandbox.WriteLogsToBQ.main(WriteLogsToBQ.java:296)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
Caused by: java.lang.NullPointerException
    at com.google.cloud.dataflow.sdk.io.PubsubUnboundedSource$PubsubReader.ackBatch(PubsubUnboundedSource.java:612)
    at com.google.cloud.dataflow.sdk.io.PubsubUnboundedSource$PubsubCheckpoint.finalizeCheckpoint(PubsubUnboundedSource.java:297)
    at com.google.cloud.dataflow.sdk.runners.inprocess.UnboundedReadEvaluatorFactory$UnboundedReadEvaluator.startReader(UnboundedReadEvaluatorFactory.java:203)
    at com.google.cloud.dataflow.sdk.runners.inprocess.UnboundedReadEvaluatorFactory$UnboundedReadEvaluator.finishBundle(UnboundedReadEvaluatorFactory.java:172)
    at com.google.cloud.dataflow.sdk.runners.inprocess.TransformExecutor.finishBundle(TransformExecutor.java:163)
    at com.google.cloud.dataflow.sdk.runners.inprocess.TransformExecutor.run(TransformExecutor.java:119)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
我使用的是SDK版本1.9.0。 1.6.1中没有出现这种情况(成功获取了>10k个元素)。 有人知道变通方法吗?我还注意到1.9.0的抓取速度比1.6.1快得多。对于1.6.1,似乎使用了10个元素的批次


通过阅读表格主题进行测试:

p.apply(PubsubIO.Read.named("reading_topic_test").topic("projects/***/topics/SL_LogLogin"))
已自动生成订阅,但管道失败,原因是:

Jan 05, 2017 2:06:33 PM com.google.cloud.dataflow.sdk.io.PubsubUnboundedSource apply
WARNING: Created subscription null to topic NestedValueProvider{value=NestedValueProvider{value=StaticValueProvider{value=projects/***/topics/SL_LogLogin}}}. 
Note this subscription WILL NOT be deleted when the pipeline terminates
[error] (run-main-0) com.google.cloud.dataflow.sdk.transforms.display.DisplayData$InternalBuilder$PopulateDisplayDataException: 
Error while populating display data for component:com.google.cloud.dataflow.sdk.io.PubsubUnboundedSource$StatsFn
com.google.cloud.dataflow.sdk.transforms.display.DisplayData$InternalBuilder$PopulateDisplayDataException: 
Error while populating display data for component: com.google.cloud.dataflow.sdk.io.PubsubUnboundedSource$StatsFn
    at    com.google.cloud.dataflow.sdk.transforms.display.DisplayData$InternalBuilder.include(DisplayData.java:664)
    at com.google.cloud.dataflow.sdk.transforms.display.DisplayData$InternalBuilder.include(DisplayData.java:643)
    at com.google.cloud.dataflow.sdk.transforms.display.DisplayData$InternalBuilder.include(DisplayData.java:637)
    at com.google.cloud.dataflow.sdk.transforms.ParDo.populateDisplayData(ParDo.java:1266)
    at com.google.cloud.dataflow.sdk.transforms.ParDo.access$200(ParDo.java:457)
    at com.google.cloud.dataflow.sdk.transforms.ParDo$Bound.populateDisplayData(ParDo.java:816)
    at com.google.cloud.dataflow.sdk.transforms.display.DisplayData$InternalBuilder.include(DisplayData.java:657)
    at com.google.cloud.dataflow.sdk.transforms.display.DisplayData$InternalBuilder.include(DisplayData.java:643)
    at com.google.cloud.dataflow.sdk.transforms.display.DisplayData$InternalBuilder.include(DisplayData.java:637)
    at com.google.cloud.dataflow.sdk.transforms.display.DisplayData$InternalBuilder.forRoot(DisplayData.java:630)
    at com.google.cloud.dataflow.sdk.transforms.display.DisplayData$InternalBuilder.access$000(DisplayData.java:617)
    at com.google.cloud.dataflow.sdk.transforms.display.DisplayData.from(DisplayData.java:76)
    at com.google.cloud.dataflow.sdk.runners.inprocess.DisplayDataValidator.evaluateDisplayData(DisplayDataValidator.java:47)
    at com.google.cloud.dataflow.sdk.runners.inprocess.DisplayDataValidator.access$100(DisplayDataValidator.java:29)
    at com.google.cloud.dataflow.sdk.runners.inprocess.DisplayDataValidator$Visitor.visitTransform(DisplayDataValidator.java:62)
    at com.google.cloud.dataflow.sdk.runners.TransformTreeNode.visit(TransformTreeNode.java:221)
    at com.google.cloud.dataflow.sdk.runners.TransformTreeNode.visit(TransformTreeNode.java:217)
    at com.google.cloud.dataflow.sdk.runners.TransformTreeNode.visit(TransformTreeNode.java:217)
    at com.google.cloud.dataflow.sdk.runners.TransformTreeNode.visit(TransformTreeNode.java:217)
    at com.google.cloud.dataflow.sdk.runners.TransformHierarchy.visit(TransformHierarchy.java:103)
    at com.google.cloud.dataflow.sdk.Pipeline.traverseTopologically(Pipeline.java:260)
    at com.google.cloud.dataflow.sdk.runners.inprocess.DisplayDataValidator.validateTransforms(DisplayDataValidator.java:43)
    at com.google.cloud.dataflow.sdk.runners.inprocess.DisplayDataValidator.validatePipeline(DisplayDataValidator.java:35)
    at com.google.cloud.dataflow.sdk.runners.inprocess.InProcessPipelineRunner.run(InProcessPipelineRunner.java:245)
    at com.google.cloud.dataflow.sdk.runners.inprocess.InProcessPipelineRunner.run(InProcessPipelineRunner.java:69)
    at com.google.cloud.dataflow.sdk.Pipeline.run(Pipeline.java:181)
    at com.sandbox.WriteLogsToBQ.main(WriteLogsToBQ.java:303)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
Caused by: java.lang.NullPointerException: Input display value cannot be null
    at com.google.cloud.dataflow.sdk.repackaged.com.google.common.base.Preconditions.checkNotNull(Preconditions.java:228)
    at com.google.cloud.dataflow.sdk.transforms.display.DisplayData$InternalBuilder.addItemIf(DisplayData.java:707)
    at com.google.cloud.dataflow.sdk.transforms.display.DisplayData$InternalBuilder.add(DisplayData.java:685)
    at com.google.cloud.dataflow.sdk.io.PubsubUnboundedSource$StatsFn.populateDisplayData(PubsubUnboundedSource.java:1147)
    at com.google.cloud.dataflow.sdk.transforms.display.DisplayData$InternalBuilder.include(DisplayData.java:657)
    at com.google.cloud.dataflow.sdk.transforms.display.DisplayData$InternalBuilder.include(DisplayData.java:643)
    at com.google.cloud.dataflow.sdk.transforms.display.DisplayData$InternalBuilder.include(DisplayData.java:637)
    at com.google.cloud.dataflow.sdk.transforms.ParDo.populateDisplayData(ParDo.java:1266)
    at com.google.cloud.dataflow.sdk.transforms.ParDo.access$200(ParDo.java:457)
    at com.google.cloud.dataflow.sdk.transforms.ParDo$Bound.populateDisplayData(ParDo.java:816)
    at com.google.cloud.dataflow.sdk.transforms.display.DisplayData$InternalBuilder.include(DisplayData.java:657)
    at com.google.cloud.dataflow.sdk.transforms.display.DisplayData$InternalBuilder.include(DisplayData.java:643)
    at com.google.cloud.dataflow.sdk.transforms.display.DisplayData$InternalBuilder.include(DisplayData.java:637)
    at com.google.cloud.dataflow.sdk.transforms.display.DisplayData$InternalBuilder.forRoot(DisplayData.java:630)
    at com.google.cloud.dataflow.sdk.transforms.display.DisplayData$InternalBuilder.access$000(DisplayData.java:617)
    at com.google.cloud.dataflow.sdk.transforms.display.DisplayData.from(DisplayData.java:76)
    at com.google.cloud.dataflow.sdk.runners.inprocess.DisplayDataValidator.evaluateDisplayData(DisplayDataValidator.java:47)
    at com.google.cloud.dataflow.sdk.runners.inprocess.DisplayDataValidator.access$100(DisplayDataValidator.java:29)
    at com.google.cloud.dataflow.sdk.runners.inprocess.DisplayDataValidator$Visitor.visitTransform(DisplayDataValidator.java:62)
    at com.google.cloud.dataflow.sdk.runners.TransformTreeNode.visit(TransformTreeNode.java:221)
    at com.google.cloud.dataflow.sdk.runners.TransformTreeNode.visit(TransformTreeNode.java:217)
    at com.google.cloud.dataflow.sdk.runners.TransformTreeNode.visit(TransformTreeNode.java:217)
    at com.google.cloud.dataflow.sdk.runners.TransformTreeNode.visit(TransformTreeNode.java:217)
    at com.google.cloud.dataflow.sdk.runners.TransformHierarchy.visit(TransformHierarchy.java:103)
    at com.google.cloud.dataflow.sdk.Pipeline.traverseTopologically(Pipeline.java:260)
    at com.google.cloud.dataflow.sdk.runners.inprocess.DisplayDataValidator.validateTransforms(DisplayDataValidator.java:43)
    at com.google.cloud.dataflow.sdk.runners.inprocess.DisplayDataValidator.validatePipeline(DisplayDataValidator.java:35)
    at com.google.cloud.dataflow.sdk.runners.inprocess.InProcessPipelineRunner.run(InProcessPipelineRunner.java:245)
    at com.google.cloud.dataflow.sdk.runners.inprocess.InProcessPipelineRunner.run(InProcessPipelineRunner.java:69)
    at com.google.cloud.dataflow.sdk.Pipeline.run(Pipeline.java:181)
    at com.sandbox.WriteLogsToBQ.main(WriteLogsToBQ.java:303)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
这是通过以下方式解决的。抱歉给你添麻烦了


根本原因是孵化器beam和GoogleDataflowSDK之间的后移植错误,未能初始化在每个作业订阅的情况下返回的订阅。

要确认,您明确提供的是管道订阅,而不是主题?正确,我更新了上面关于主题的文章。请注意,到目前为止,最新的一个可能是由延迟相关的错误引起的。这似乎在Beam中得到了修复。我们正在寻找数据流的修复方法。