Why am I getting java.lang.IllegalStateException on Google Dataflow?


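I have upgraded to the new Google Dataflow SDK version 1.6, and when I test on my local machine I get a java.lang.IllegalStateException at the end of the pipeline. I did not have this problem with version 1.5.1.

This does not happen in the live environment, only locally. Is this a bug in the new version? Do I need to change my code to avoid the error?

I have attached the part of my pipeline where I suspect the problem lies: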

private static void getTableRowAndWrite(final PCollection<KV<Integer, Iterable<byte[]>>> groupedTransactions, final String tableName) {
    // Get the tableRow element from the PCollection
    groupedTransactions
            .apply(ParDo
                    .of(((tableName.equals("avail")) ? new GetTableRowAvail() : new GetTableRowReservation())) //Get a TableRow
                    .named("Get " + tableName + " TableRows"))
            .apply(BigQueryIO
                    .Write
                    .named("Write to BigQuery " + tableName) //Write to BigQuery
                    .withSchema(createTableSchema())
                    .to((SerializableFunction<BoundedWindow, String>) window -> {
                        String date = window.toString();
                        String date2 = date.substring(1, 5) + date.substring(6, 8) + date.substring(9, 11);
                        return "travelinsights-1056:hotel." + tableName + "_full_" + (TEST ? "test_" : "") + date2;
                    })
                    .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED)
                    .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND)
            );
}
You have found a bug.

It has been filed as an issue, and a fix is under review. Once reviewed, it will be ported to the Dataflow Java SDK right away.


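Without seeing the code that sets up your windowing, triggering, and allowed lateness, I can't be sure how this affects you. However, if you have non-global windows and a very large allowed lateness, chosen so that windows do not expire until the "end of time", there is a simple workaround. In that case you can update your job with an allowed lateness that is merely very large (say, a few hundred years) rather than effectively infinite.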
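To illustrate why a merely very large allowed lateness helps, here is a minimal sketch of the arithmetic behind the check. It uses only the JDK (`java.time` rather than the Joda types the SDK uses), and it assumes that the SDK's end-of-time is `Long.MAX_VALUE` microseconds expressed in milliseconds. The class name, method name, and window timestamp are hypothetical and are not part of the SDK.

```java
import java.time.Duration;
import java.util.concurrent.TimeUnit;

class AllowedLatenessHeadroom {
    // Assumption: "end of time" is Long.MAX_VALUE microseconds, expressed in milliseconds.
    static final long END_OF_TIME_MILLIS = TimeUnit.MICROSECONDS.toMillis(Long.MAX_VALUE);

    /** Returns true if window max timestamp + allowed lateness would land past end-of-time. */
    static boolean cleanupBeyondEndOfTime(long windowMaxTimestampMillis, Duration allowedLateness) {
        long headroomMillis = END_OF_TIME_MILLIS - windowMaxTimestampMillis;
        // Compare against the remaining headroom instead of adding,
        // so the check itself cannot overflow a long.
        return allowedLateness.compareTo(Duration.ofMillis(headroomMillis)) > 0;
    }

    public static void main(String[] args) {
        // Hypothetical window whose max timestamp falls in 2016.
        long windowMaxTimestampMillis = 1_464_739_199_999L;

        Duration fewHundredYears = Duration.ofDays(365L * 300);
        Duration effectivelyInfinite = Duration.ofMillis(Long.MAX_VALUE);

        System.out.println(cleanupBeyondEndOfTime(windowMaxTimestampMillis, fewHundredYears));     // false
        System.out.println(cleanupBeyondEndOfTime(windowMaxTimestampMillis, effectivelyInfinite)); // true
    }
}
```

Under these assumptions, a lateness of a few hundred years leaves the cleanup time far below the limit, while an effectively infinite lateness pushes it past the limit, which is the condition the exception message reports.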

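I believe the problem is elsewhere in your pipeline. The stack trace means you have a window whose end, plus the allowed lateness, exceeds the maximum timestamp allowed in Dataflow. Would you be willing to share the part of the pipeline that assigns timestamps to elements and places them into windows?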
Exception in thread "main" java.lang.IllegalStateException: Cleanup time 294293-06-23T12:00:54.774Z is beyond end-of-time
at com.google.cloud.dataflow.sdk.repackaged.com.google.common.base.Preconditions.checkState(Preconditions.java:199)
at com.google.cloud.dataflow.sdk.util.ReduceFnRunner.onTimer(ReduceFnRunner.java:642)
at com.google.cloud.dataflow.sdk.util.BatchTimerInternals.advance(BatchTimerInternals.java:134)
at com.google.cloud.dataflow.sdk.util.BatchTimerInternals.advanceInputWatermark(BatchTimerInternals.java:110)
at com.google.cloud.dataflow.sdk.util.GroupAlsoByWindowsViaOutputBufferDoFn.processElement(GroupAlsoByWindowsViaOutputBufferDoFn.java:91)
at com.google.cloud.dataflow.sdk.util.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:49)
at com.google.cloud.dataflow.sdk.util.DoFnRunnerBase.processElement(DoFnRunnerBase.java:138)
at com.google.cloud.dataflow.sdk.transforms.ParDo.evaluateHelper(ParDo.java:1229)
at com.google.cloud.dataflow.sdk.transforms.ParDo.evaluateSingleHelper(ParDo.java:1098)
at com.google.cloud.dataflow.sdk.transforms.ParDo.access$300(ParDo.java:457)
at com.google.cloud.dataflow.sdk.transforms.ParDo$1.evaluate(ParDo.java:1084)
at com.google.cloud.dataflow.sdk.transforms.ParDo$1.evaluate(ParDo.java:1079)
at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner$Evaluator.visitTransform(DirectPipelineRunner.java:858)
at com.google.cloud.dataflow.sdk.runners.TransformTreeNode.visit(TransformTreeNode.java:219)
at com.google.cloud.dataflow.sdk.runners.TransformTreeNode.visit(TransformTreeNode.java:215)
at com.google.cloud.dataflow.sdk.runners.TransformTreeNode.visit(TransformTreeNode.java:215)
at com.google.cloud.dataflow.sdk.runners.TransformTreeNode.visit(TransformTreeNode.java:215)
at com.google.cloud.dataflow.sdk.runners.TransformTreeNode.visit(TransformTreeNode.java:215)
at com.google.cloud.dataflow.sdk.runners.TransformTreeNode.visit(TransformTreeNode.java:215)
at com.google.cloud.dataflow.sdk.runners.TransformTreeNode.visit(TransformTreeNode.java:215)
at com.google.cloud.dataflow.sdk.runners.TransformHierarchy.visit(TransformHierarchy.java:102)
at com.google.cloud.dataflow.sdk.Pipeline.traverseTopologically(Pipeline.java:259)
at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner$Evaluator.run(DirectPipelineRunner.java:814)
at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner.run(DirectPipelineRunner.java:526)
at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner.run(DirectPipelineRunner.java:96)
at com.google.cloud.dataflow.sdk.Pipeline.run(Pipeline.java:180)