Apache Beam Go SDK with Dataflow
I have been using the Go Beam SDK (v2.13.0) and cannot get the example working on GCP Dataflow. It enters a crash loop while trying to start org.apache.beam.runners.dataflow.worker.DataflowRunnerHarness. When run locally with the Direct runner, the example executes correctly.
The example has not been modified in any way from the original example.
The stack trace is:
org.apache.beam.vendor.grpc.v1p13p1.com.google.protobuf.InvalidProtocolBufferException: Protocol message had invalid UTF-8.
at org.apache.beam.vendor.grpc.v1p13p1.com.google.protobuf.InvalidProtocolBufferException.invalidUtf8(InvalidProtocolBufferException.java:148)
at org.apache.beam.vendor.grpc.v1p13p1.com.google.protobuf.CodedInputStream$StreamDecoder.readStringRequireUtf8(CodedInputStream.java:2353)
at org.apache.beam.model.pipeline.v1.RunnerApi$FunctionSpec.<init>(RunnerApi.java:59611)
at org.apache.beam.model.pipeline.v1.RunnerApi$FunctionSpec.<init>(RunnerApi.java:59572)
at org.apache.beam.model.pipeline.v1.RunnerApi$FunctionSpec$1.parsePartialFrom(RunnerApi.java:60241)
at org.apache.beam.model.pipeline.v1.RunnerApi$FunctionSpec$1.parsePartialFrom(RunnerApi.java:60235)
at org.apache.beam.vendor.grpc.v1p13p1.com.google.protobuf.CodedInputStream$StreamDecoder.readMessage(CodedInputStream.java:2424)
at org.apache.beam.model.pipeline.v1.RunnerApi$Coder.<init>(RunnerApi.java:27531)
at org.apache.beam.model.pipeline.v1.RunnerApi$Coder.<init>(RunnerApi.java:27489)
at org.apache.beam.model.pipeline.v1.RunnerApi$Coder$1.parsePartialFrom(RunnerApi.java:28410)
at org.apache.beam.model.pipeline.v1.RunnerApi$Coder$1.parsePartialFrom(RunnerApi.java:28404)
at org.apache.beam.model.pipeline.v1.RunnerApi$Coder$Builder.mergeFrom(RunnerApi.java:28028)
at org.apache.beam.model.pipeline.v1.RunnerApi$Coder$Builder.mergeFrom(RunnerApi.java:27868)
at org.apache.beam.vendor.grpc.v1p13p1.com.google.protobuf.CodedInputStream$StreamDecoder.readMessage(CodedInputStream.java:2408)
at org.apache.beam.vendor.grpc.v1p13p1.com.google.protobuf.MapEntryLite.parseField(MapEntryLite.java:128)
at org.apache.beam.vendor.grpc.v1p13p1.com.google.protobuf.MapEntryLite.parseEntry(MapEntryLite.java:184)
at org.apache.beam.vendor.grpc.v1p13p1.com.google.protobuf.MapEntry.<init>(MapEntry.java:106)
at org.apache.beam.vendor.grpc.v1p13p1.com.google.protobuf.MapEntry.<init>(MapEntry.java:50)
at org.apache.beam.vendor.grpc.v1p13p1.com.google.protobuf.MapEntry$Metadata$1.parsePartialFrom(MapEntry.java:70)
at org.apache.beam.vendor.grpc.v1p13p1.com.google.protobuf.MapEntry$Metadata$1.parsePartialFrom(MapEntry.java:64)
at org.apache.beam.vendor.grpc.v1p13p1.com.google.protobuf.CodedInputStream$StreamDecoder.readMessage(CodedInputStream.java:2424)
at org.apache.beam.model.pipeline.v1.RunnerApi$Components.<init>(RunnerApi.java:930)
at org.apache.beam.model.pipeline.v1.RunnerApi$Components.<init>(RunnerApi.java:848)
at org.apache.beam.model.pipeline.v1.RunnerApi$Components$1.parsePartialFrom(RunnerApi.java:2714)
at org.apache.beam.model.pipeline.v1.RunnerApi$Components$1.parsePartialFrom(RunnerApi.java:2708)
at org.apache.beam.vendor.grpc.v1p13p1.com.google.protobuf.CodedInputStream$StreamDecoder.readMessage(CodedInputStream.java:2424)
at org.apache.beam.model.pipeline.v1.RunnerApi$Pipeline.<init>(RunnerApi.java:2892)
at org.apache.beam.model.pipeline.v1.RunnerApi$Pipeline.<init>(RunnerApi.java:2850)
at org.apache.beam.model.pipeline.v1.RunnerApi$Pipeline$1.parsePartialFrom(RunnerApi.java:3981)
at org.apache.beam.model.pipeline.v1.RunnerApi$Pipeline$1.parsePartialFrom(RunnerApi.java:3975)
at org.apache.beam.vendor.grpc.v1p13p1.com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:221)
at org.apache.beam.vendor.grpc.v1p13p1.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:239)
at org.apache.beam.vendor.grpc.v1p13p1.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:244)
at org.apache.beam.vendor.grpc.v1p13p1.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:49)
at org.apache.beam.vendor.grpc.v1p13p1.com.google.protobuf.GeneratedMessageV3.parseWithIOException(GeneratedMessageV3.java:311)
at org.apache.beam.model.pipeline.v1.RunnerApi$Pipeline.parseFrom(RunnerApi.java:3222)
at org.apache.beam.runners.dataflow.worker.DataflowWorkerHarnessHelper.getPipelineFromEnv(DataflowWorkerHarnessHelper.java:131)
at org.apache.beam.runners.dataflow.worker.DataflowRunnerHarness.main(DataflowRunnerHarness.java:59)
I have tried both the Docker image provided in the getting started guide and one built from v2.13.0.
The go.mod for the example is:
module example.org/wordcount
go 1.12
require (
cloud.google.com/go v0.41.0 // indirect
github.com/apache/beam v2.13.0+incompatible
github.com/pkg/errors v0.8.1 // indirect
golang.org/x/net v0.0.0-20190628185345-da137c7871d7 // indirect
google.golang.org/grpc v1.22.0 // indirect
)
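What is causing this?
Dataflow does not officially support the Apache Beam Go SDK. However, some users have been able to use it. I suspect this particular version may have a problem. You could try a different version.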
You can ask other users which versions work for them (even though it is unsupported).
I had the same problem. Bumping the Beam version to 2.19 worked for me.
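For reference, the version bump described above would amount to changing one line of the go.mod from the question. This is only a sketch: it assumes the 2.19 release is published as v2.19.0 and resolves with the same +incompatible suffix as v2.13.0. The indirect dependencies would likely be updated by the go tool as well.

```
module example.org/wordcount
go 1.12
require (
	// Assumed version string for the 2.19 release; only this line changes.
	github.com/apache/beam v2.19.0+incompatible
)
```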
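It looks like they rearranged some of the Beam Fn API protos in a backward-incompatible way, so you need to make sure you are using the latest version of the Beam Go SDK.
The stack dump shows that the error comes from trying to parse the pipeline graph protobuf, which contains invalid UTF-8. Did you make any modifications to the wordcount example (such as adding step names)? Could you also share how you are launching the job?
@Cubez Updated to include the command used and to state that the source code has not been modified from the example at all.
I have tried multiple versions so far. I would love to know which combination of versions works. I could probably replicate a working setup, but I am unlikely to guess the magic combination of Go release, protoc, and Beam SDK.
Since this is not officially supported, another Beam user may be able to help you pick versions that work for them, and might share their dependency files. You can find Beam's user mailing list here.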
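As a rough illustration of what "Protocol message had invalid UTF-8" means: a protobuf parser that enforces UTF-8 on string fields (as readStringRequireUtf8 in the Java trace does) rejects any string field whose bytes are not valid UTF-8. The sketch below uses only Go's standard unicode/utf8 package. The helper name validProtoString and the sample byte values are made up for illustration; they are not taken from the Beam SDK or from the failing job.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// validProtoString is a hypothetical helper: it reports whether b would pass
// a UTF-8 check of the kind a strict protobuf parser applies to string fields.
func validProtoString(b []byte) bool {
	return utf8.Valid(b)
}

func main() {
	// A plain ASCII identifier is always valid UTF-8.
	urn := []byte("beam:coder:bytes:v1")

	// Arbitrary binary data (illustrative bytes) usually is not valid UTF-8.
	// If two proto versions disagree about whether a field holds a string or
	// raw bytes, data like this can end up being checked as a string.
	binary := []byte{0x0a, 0x03, 0xff, 0xfe, 0x80}

	fmt.Println(validProtoString(urn))
	fmt.Println(validProtoString(binary))
}
```

A check like this only demonstrates the failure mode. It does not identify which field in the serialized pipeline triggered the error in the question.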