Apache flink Apache Beam/Flink例外InChainedStubException

Apache flink Apache Beam/Flink例外InChainedStubException,apache-flink,apache-beam,Apache Flink,Apache Beam,我使用的是相同版本的ApacheBeam2.0.0和FlinkRunner(scala 2.10)。我正在测试一个进程中的Flink主机(默认配置),FlinkRunner依赖项显然在运行时引入了Flink 1.2.1(查看MVN依赖树) 当存在“用户异常”时,找出实际错误的最佳方法是什么?这不是关于我这次做错了什么的问题;而是如何告诉-一般来说-如何从Beam或Flink中获得更多信息。这里是stacktrace: Exception in thread "main" java.lang.Ru

我使用的是相同版本的ApacheBeam2.0.0和FlinkRunner(scala 2.10)。我正在测试一个进程中的Flink主机(默认配置),FlinkRunner依赖项显然在运行时引入了Flink 1.2.1(查看MVN依赖树)

当存在“用户异常”时,找出实际错误的最佳方法是什么?这不是关于我这次做错了什么的问题;而是如何告诉-一般来说-如何从Beam或Flink中获得更多信息。这里是stacktrace:

Exception in thread "main" java.lang.RuntimeException: Pipeline execution failed
at org.apache.beam.runners.flink.FlinkRunner.run(FlinkRunner.java:122)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:295)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:281)
at com.mapfit.flow.data.environment.MFEnvironment.run(MFEnvironment.java:70)
at com.mapfit.flow.main.Scratch.main(Scratch.java:35)
Caused by: org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply$mcV$sp(JobManager.scala:910)
at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:853)
at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:853)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: org.apache.beam.sdk.util.UserCodeException: org.apache.flink.runtime.operators.chaining.ExceptionInChainedStubException
at org.apache.beam.sdk.util.UserCodeException.wrap(UserCodeException.java:36)
at org.apache.beam.sdk.transforms.MapElements$1$auxiliary$PCieS8xh.invokeProcessElement(Unknown Source)
at org.apache.beam.runners.core.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:197)
at org.apache.beam.runners.core.SimpleDoFnRunner.processElement(SimpleDoFnRunner.java:158)
at org.apache.beam.runners.flink.metrics.DoFnRunnerWithMetricsUpdate.processElement(DoFnRunnerWithMetricsUpdate.java:65)
at org.apache.beam.runners.flink.translation.functions.FlinkDoFnFunction.mapPartition(FlinkDoFnFunction.java:118)
at org.apache.flink.runtime.operators.MapPartitionDriver.run(MapPartitionDriver.java:103)
at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:490)
at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:355)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:665)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.flink.runtime.operators.chaining.ExceptionInChainedStubException
at org.apache.flink.runtime.operators.chaining.ChainedFlatMapDriver.collect(ChainedFlatMapDriver.java:82)
at org.apache.flink.runtime.operators.util.metrics.CountingCollector.collect(CountingCollector.java:35)
at org.apache.beam.runners.flink.translation.functions.FlinkDoFnFunction$MultiDoFnOutputManager.output(FlinkDoFnFunction.java:165)
at org.apache.beam.runners.core.SimpleDoFnRunner$DoFnContext.outputWindowedValue(SimpleDoFnRunner.java:355)
at org.apache.beam.runners.core.SimpleDoFnRunner$DoFnProcessContext.output(SimpleDoFnRunner.java:629)
at org.apache.beam.sdk.transforms.MapElements$1.processElement(MapElements.java:122)
请注意,除了调用pipeline.run(),完全没有与我编写的代码相关的内容

我下载了每个链接JAR的源代码,进入了
ChainedFlatMapDriver
,它在第82行抛出了一个异常,最后看到了Java对象序列化中调用生成的EOFEException(我的值使用默认编码器)。我想我想知道一些事情,但似乎EOFException的原因是在
SimpleCollectingOutputView
第79行,它被抛出了很多次,并且经常被作为Flink的例行执行而吞没

知道如何让弗林克披露失败信息的人有什么建议吗

调试后找到更多信息:

Just found more info after walking through more Flink code in the debugger: java.lang.InterruptedException
at java.lang.Object.wait(Native Method)
at org.apache.flink.runtime.io.network.buffer.LocalBufferPool.requestBuffer(LocalBufferPool.java:168)
at org.apache.flink.runtime.io.network.buffer.LocalBufferPool.requestBufferBlocking(LocalBufferPool.java:138)
at org.apache.flink.runtime.io.network.api.writer.RecordWriter.sendToTarget(RecordWriter.java:131)
at org.apache.flink.runtime.io.network.api.writer.RecordWriter.emit(RecordWriter.java:88)
at org.apache.flink.runtime.operators.shipping.OutputCollector.collect(OutputCollector.java:65)
at org.apache.flink.runtime.operators.util.metrics.CountingCollector.collect(CountingCollector.java:35)
at org.apache.beam.runners.flink.translation.functions.FlinkMultiOutputPruningFunction.flatMap(FlinkMultiOutputPruningFunction.java:46)
at org.apache.beam.runners.flink.translation.functions.FlinkMultiOutputPruningFunction.flatMap(FlinkMultiOutputPruningFunction.java:30)
at org.apache.flink.runtime.operators.chaining.ChainedFlatMapDriver.collect(ChainedFlatMapDriver.java:80)
at org.apache.flink.runtime.operators.util.metrics.CountingCollector.collect(CountingCollector.java:35)
at org.apache.beam.runners.flink.translation.functions.FlinkDoFnFunction$MultiDoFnOutputManager.output(FlinkDoFnFunction.java:165)
at org.apache.beam.runners.core.SimpleDoFnRunner$DoFnContext.outputWindowedValue(SimpleDoFnRunner.java:355)
at org.apache.beam.runners.core.SimpleDoFnRunner$DoFnProcessContext.output(SimpleDoFnRunner.java:629)
at org.apache.beam.sdk.transforms.MapElements$1.processElement(MapElements.java:122)
at org.apache.beam.sdk.transforms.MapElements$1$auxiliary$vuuNRtio.invokeProcessElement(Unknown Source)
at org.apache.beam.runners.core.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:197)
at org.apache.beam.runners.core.SimpleDoFnRunner.processElement(SimpleDoFnRunner.java:158)
at org.apache.beam.runners.flink.metrics.DoFnRunnerWithMetricsUpdate.processElement(DoFnRunnerWithMetricsUpdate.java:65)
at org.apache.beam.runners.flink.translation.functions.FlinkDoFnFunction.mapPartition(FlinkDoFnFunction.java:118)
at org.apache.flink.runtime.operators.MapPartitionDriver.run(MapPartitionDriver.java:103)
at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:490)
at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:355)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:665)
at java.lang.Thread.run(Thread.java:745)

请看以下两个链接:

我曾经看到类似的例外,当运行在弗林克伦纳纱梁。问题页面中建议的编码员帮助了我们


除此之外,我建议广泛使用记录器,直到管道平稳运行。可以使用纱线日志命令检索纱线日志。不知道你的情况(正在处理的Flink master),但我认为你应该能够写一些日志

我会将此标记为已接受的答案,因为我打开了这个案例(在您提到的链接存在之前的半个月),像往常一样,没有人能帮助我;然后以与链接中提到的基本相同的方式解决问题,然后忽略了这张罚单:)