Google Cloud Dataflow: Apache Beam PubsubReader exception
I am running a pipeline with a PubSub source, and I am getting some strange exceptions that crash the pipeline. I can process a few elements (3-10) without problems, and then one of the following two errors is suddenly thrown. Neither tells me what I might be doing wrong, so I removed all the transforms and kept only the source, and the problem persists. I am only publishing some test strings to PubSub. Thanks for your help.
Exception 1:
[WARNING]
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:293)
at java.lang.Thread.run(Thread.java:724)
Caused by: java.lang.NullPointerException
at org.apache.beam.sdk.io.PubsubUnboundedSource$PubsubReader.ackBatch(PubsubUnboundedSource.java:640)
at org.apache.beam.sdk.io.PubsubUnboundedSource$PubsubCheckpoint.finalizeCheckpoint(PubsubUnboundedSource.java:313)
at org.apache.beam.runners.direct.UnboundedReadEvaluatorFactory$UnboundedReadEvaluator.getReader(UnboundedReadEvaluatorFactory.java:174)
at org.apache.beam.runners.direct.UnboundedReadEvaluatorFactory$UnboundedReadEvaluator.processElement(UnboundedReadEvaluatorFactory.java:127)
at org.apache.beam.runners.direct.TransformExecutor.processElements(TransformExecutor.java:139)
at org.apache.beam.runners.direct.TransformExecutor.run(TransformExecutor.java:107)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
Exception 2:
[WARNING]
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:293)
at java.lang.Thread.run(Thread.java:724)
Caused by: java.lang.IllegalStateException: Cannot finalize a restored checkpoint
at org.apache.beam.sdk.repackaged.com.google.common.base.Preconditions.checkState(Preconditions.java:444)
at org.apache.beam.sdk.io.PubsubUnboundedSource$PubsubCheckpoint.finalizeCheckpoint(PubsubUnboundedSource.java:293)
at org.apache.beam.runners.direct.UnboundedReadEvaluatorFactory$UnboundedReadEvaluator.finishRead(UnboundedReadEvaluatorFactory.java:205)
at org.apache.beam.runners.direct.UnboundedReadEvaluatorFactory$UnboundedReadEvaluator.processElement(UnboundedReadEvaluatorFactory.java:142)
at org.apache.beam.runners.direct.TransformExecutor.processElements(TransformExecutor.java:139)
at org.apache.beam.runners.direct.TransformExecutor.run(TransformExecutor.java:107)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
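Basic code:
// Create default options and enable streaming mode for the unbounded PubSub source.
PipelineOptions options = PipelineOptionsFactory.create();
PubsubOptions dataflowOptions = options.as(PubsubOptions.class);
dataflowOptions.setStreaming(true);
Pipeline p = Pipeline.create(options);
// Read UTF-8 strings from the subscription; no further transforms are applied.
p.apply(PubsubIO.<String>read().subscription("my-subscription")
        .withCoder(StringUtf8Coder.of()));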
This problem exists because of a bug in the DirectRunner and a precondition in PubsubCheckpoint.
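The answer contains more information about the bug and how to resolve it.
Thanks for your answer. I updated to the latest snapshot and was hopeful at first because the error did not occur immediately, but after a while the NullPointerException above still occurs.
There is a bug in PubSub that should be resolved soon. I apologize for the inconvenience.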
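To make the "Cannot finalize a restored checkpoint" message in Exception 2 easier to picture, here is a minimal sketch of how a precondition like that can trip. It assumes that a checkpoint holds a reference to its reader and that the reference is lost when the checkpoint is rebuilt from serialized state; that mechanism is an assumption for illustration and is not confirmed in this thread. SketchReader, SketchCheckpoint and CheckpointSketch are hypothetical names and are not Beam classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a reader that acknowledges messages; not Beam's PubsubReader. */
class SketchReader {
    final List<String> acked = new ArrayList<>();

    void ackBatch(List<String> ackIds) {
        acked.addAll(ackIds);
    }
}

/** Hypothetical stand-in for a checkpoint; not Beam's PubsubCheckpoint. */
class SketchCheckpoint {
    // Assumption: null when the checkpoint was rebuilt from its serialized form.
    private final SketchReader reader;
    private final List<String> safeToAckIds;

    private SketchCheckpoint(SketchReader reader, List<String> safeToAckIds) {
        this.reader = reader;
        this.safeToAckIds = safeToAckIds;
    }

    /** A checkpoint taken directly from a running reader. */
    static SketchCheckpoint live(SketchReader reader, List<String> ids) {
        return new SketchCheckpoint(reader, ids);
    }

    /** Models decoding: the reader is not part of the serialized state. */
    static SketchCheckpoint restored(List<String> ids) {
        return new SketchCheckpoint(null, ids);
    }

    void finalizeCheckpoint() {
        // Without this guard, the call below would fail with a NullPointerException instead.
        if (reader == null) {
            throw new IllegalStateException("Cannot finalize a restored checkpoint");
        }
        reader.ackBatch(safeToAckIds);
    }
}

class CheckpointSketch {
    /** Returns "ok" if finalizing succeeded, otherwise the exception message. */
    static String tryFinalize(SketchCheckpoint checkpoint) {
        try {
            checkpoint.finalizeCheckpoint();
            return "ok";
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        SketchReader reader = new SketchReader();
        List<String> ids = Arrays.asList("ack-1", "ack-2");
        System.out.println(tryFinalize(SketchCheckpoint.live(reader, ids)));
        System.out.println(tryFinalize(SketchCheckpoint.restored(ids)));
    }
}
```

In this sketch the live checkpoint finalizes normally, while the restored one is rejected by the guard. If a runner ever calls finalize on a restored copy, the failure comes from the runner's handling of the checkpoint and not from the user's pipeline code, which would be consistent with the error persisting after all transforms were removed.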
mvn compile exec:java -Dexec.mainClass=my.package.SalesTransactions -Dexec.args="--runner=BlockingDataflowRunner --project=my-project --tempLocation=gs://my-project/tmp"