Google Cloud Storage / DirectPipelineRunner: does it support standard glob patterns?

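Executing the pipeline in the cloud works fine. But when it is run with the DirectPipelineRunner (i.e. locally), it fails and complains about the supplied file pattern. The file pattern uses a glob.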

Is this the expected behavior when running locally?

[..]
TextIO.Read.from("gs://cdf-testing/NetworkClicks_123456_2015010[1-2]*")
[..]

Feb 18, 2015 4:19:09 PM com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner run
INFO: Executing pipeline using the DirectPipelineRunner.
Feb 18, 2015 4:19:10 PM com.google.cloud.dataflow.sdk.util.GcsUtil expand
INFO: matching files in bucket cdf-testing, prefix NetworkClicks_123456_2015010[1-2] against pattern NetworkClicks_123456_2015010[1-2][^/]*
Exception in thread "main" java.lang.RuntimeException: Failed to read from source: com.google.cloud.dataflow.sdk.runners.worker.TextReader@55dbc59b
    at com.google.cloud.dataflow.sdk.util.ReaderUtils.readElemsFromReader(ReaderUtils.java:40)
    at com.google.cloud.dataflow.sdk.io.TextIO.evaluateReadHelper(TextIO.java:702)
    at com.google.cloud.dataflow.sdk.io.TextIO.access$000(TextIO.java:98)
    at com.google.cloud.dataflow.sdk.io.TextIO$Read$Bound$1.evaluate(TextIO.java:310)
    at com.google.cloud.dataflow.sdk.io.TextIO$Read$Bound$1.evaluate(TextIO.java:306)
    at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner$Evaluator.visitTransform(DirectPipelineRunner.java:611)
    at com.google.cloud.dataflow.sdk.runners.TransformTreeNode.visit(TransformTreeNode.java:200)
    at com.google.cloud.dataflow.sdk.runners.TransformTreeNode.visit(TransformTreeNode.java:196)
    at com.google.cloud.dataflow.sdk.runners.TransformHierarchy.visit(TransformHierarchy.java:109)
    at com.google.cloud.dataflow.sdk.Pipeline.traverseTopologically(Pipeline.java:204)
    at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner$Evaluator.run(DirectPipelineRunner.java:584)
    at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner.run(DirectPipelineRunner.java:328)
    at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner.run(DirectPipelineRunner.java:70)
    at com.google.cloud.dataflow.sdk.Pipeline.run(Pipeline.java:145)
    at com.shinetech.tpc.engine.CDFEngine.loadClicks(CDFEngine.java:88)
    at com.shinetech.tpc.engine.CDFEngine.doMagic(CDFEngine.java:75)
    at com.shinetech.tpc.Main.main(Main.java:15)
Caused by: java.io.IOException: No match for file pattern 'gs://cdf-testing/NetworkClicks_123456_2015010[1-2]*'
    at com.google.cloud.dataflow.sdk.runners.worker.FileBasedReader.iterator(FileBasedReader.java:101)
    at com.google.cloud.dataflow.sdk.util.ReaderUtils.readElemsFromReader(ReaderUtils.java:35)
    ... 16 more

No, the two runners should behave the same. This sounds like a bug in the DirectRunner. Thanks for the report; we will reply here once a fix has been released.

Just to follow up: the fix has been available on GitHub since February 23 and will be published to Maven in the middle of next month.
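
For readers trying to make sense of the GcsUtil log line in the question: it reports a listing prefix that still contains the literal "[1-2]", while the match pattern is "NetworkClicks_123456_2015010[1-2][^/]*". One plausible reading, which is an assumption and not something the answer confirms, is that the prefix was cut only at the first "*", so the bucket listing looked for object names that literally contain "[1-2]". The sketch below is not the SDK's code. It is a minimal, self-contained illustration of how a glob could be split into a literal prefix and a regex; the class GlobSketch, its two methods, and the object names in main are all hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical illustration only; this is NOT the Dataflow SDK implementation.
class GlobSketch {

    // Returns the part of the glob before the first wildcard character
    // ('*', '?' or '['). Only this part is safe to use as a listing prefix.
    static String literalPrefix(String glob) {
        for (int i = 0; i < glob.length(); i++) {
            char c = glob.charAt(i);
            if (c == '*' || c == '?' || c == '[') {
                return glob.substring(0, i);
            }
        }
        return glob;
    }

    // Translates a simple glob into a regex:
    //   '*'     -> "[^/]*" (any run of characters except '/')
    //   '?'     -> "[^/]"  (one character except '/')
    //   "[...]" -> copied verbatim as a regex character class
    // Negated classes such as "[!a]" are not handled in this sketch.
    static String globToRegex(String glob) {
        StringBuilder sb = new StringBuilder();
        int i = 0;
        while (i < glob.length()) {
            char c = glob.charAt(i);
            int close = (c == '[') ? glob.indexOf(']', i + 1) : -1;
            if (c == '*') {
                sb.append("[^/]*");
                i++;
            } else if (c == '?') {
                sb.append("[^/]");
                i++;
            } else if (close > 0) {
                sb.append(glob, i, close + 1);
                i = close + 1;
            } else {
                // Escape characters that are special in a regex.
                if ("\\.^$|+(){}[]".indexOf(c) >= 0) {
                    sb.append('\\');
                }
                sb.append(c);
                i++;
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String glob = "NetworkClicks_123456_2015010[1-2]*";
        Pattern pattern = Pattern.compile(globToRegex(glob));
        System.out.println("prefix: " + literalPrefix(glob));
        System.out.println("regex:  " + pattern.pattern());
        // Hypothetical object names, used only to exercise the pattern.
        System.out.println(pattern.matcher("NetworkClicks_123456_20150101.csv").matches());
        System.out.println(pattern.matcher("NetworkClicks_123456_20150103.csv").matches());
    }
}
```

In this sketch the prefix stops before "[", so a listing by prefix would still return the candidate objects and the regex would then do the actual filtering.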