Google BigQuery Dataflow pipeline with "update" flag fails with "missing steps Reshuffle/GroupByKey" error


My current code reads from Pub/Sub, applies a filter to the messages, and writes to a BigQuery table. The code is below:

public class BeaconAnomalyDetectionPipeline {

    public static void main(String[] args) {

        BeaconAnomalyDetectionOptions options =
                PipelineOptionsFactory.fromArgs(args)
                        .withValidation()
                        .as(BeaconAnomalyDetectionOptions.class);

        options.setJobName("test-name");

        run(options);
    }

    public static PipelineResult run(BeaconAnomalyDetectionOptions options) {

        Pipeline p = Pipeline.create(options);

        p.getCoderRegistry().registerCoderForType(
                TypeDescriptor.of(String.class), StringUtf8Coder.of());

        // Read from Pub/Sub, window, and convert each message to an IngestionRequest.
        // The conversion step is assumed here (it is the same custom transform used in
        // the updated pipeline below); without it the assignment to
        // PCollection<IngestionRequest> would not compile.
        PCollection<IngestionRequest> ingestionRequests = p
                .apply("ReadPubSubSubscription",
                        PubsubIO.readMessages()
                                .fromSubscription(options.getSubscriberId()))
                .apply(Window.into(
                        FixedWindows.of(Duration.standardMinutes(options.getWindowSize()))))
                .apply("PubSubMessagesToTableRows", new PubsubProtoToIngestionRequest());

        // Keep only the requests whose compression type value is odd.
        PCollection<IngestionRequest> anomalies =
                ingestionRequests.apply(
                        "filter by Signature",
                        Filter.by(ingestionRequest ->
                                ingestionRequest.getCompressionTypeValue() % 2 != 0));

        anomalies.apply(
                "WriteAnomalyToBQ",
                BQWriteTransform.newBuilder()
                        .setTableSpec(options.getTableSpec())
                        .setMethod(BigQueryIO.Write.Method.STREAMING_INSERTS)
                        .build());

        return p.run();
    }
}
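BQWriteTransform is a custom transform not shown in the question; it presumably wraps BigQueryIO. For reference, a direct BigQueryIO equivalent of that write step might look like the sketch below, where toTableRow is a hypothetical converter from IngestionRequest to TableRow:

        anomalies
                .apply("IngestionRequestToTableRow",
                        MapElements.into(TypeDescriptor.of(TableRow.class))
                                .via(req -> toTableRow(req))) // toTableRow is hypothetical
                .apply(BigQueryIO.writeTableRows()
                        .to(options.getTableSpec())
                        .withMethod(BigQueryIO.Write.Method.STREAMING_INSERTS)
                        .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND)
                        .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_NEVER));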
I have since updated my code to register a coder and added Reshuffle and GroupByKey steps, but I still see the same issue.

The updated code is below:

public class BeaconAnomalyDetectionPipeline {

    public static void main(String[] args) {

        BeaconAnomalyDetectionOptions options =
                PipelineOptionsFactory.fromArgs(args)
                        .withValidation()
                        .as(BeaconAnomalyDetectionOptions.class);

        options.setJobName("test-name");

        run(options);
    }

    public static PipelineResult run(BeaconAnomalyDetectionOptions options) {

        Pipeline p = Pipeline.create(options);

        p.getCoderRegistry().registerCoderForType(
                TypeDescriptor.of(String.class), StringUtf8Coder.of());

        PCollection<IngestionRequest> ingestionRequests = p
                .apply("ReadPubSubSubscription",
                        PubsubIO.readMessages()
                                .fromSubscription(options.getSubscriberId()))
                .apply(Window.into(
                        FixedWindows.of(Duration.standardMinutes(options.getWindowSize()))))
                // Key every message with the constant 1 so it can be reshuffled and grouped.
                .apply(WithKeys.of(input -> 1))
                .setCoder(KvCoder.of(VarIntCoder.of(), PubsubMessageWithAttributesCoder.of()))
                .apply(Reshuffle.of())
                .apply(GroupByKey.<Integer, PubsubMessage>create())
                // Combiner is a custom DoFn that turns each KV<Integer, Iterable<PubsubMessage>>
                // back into individual KV<Integer, PubsubMessage> elements.
                .apply(ParDo.of(new Combiner()))
                .apply("filter by compression type new",
                        MapElements.via(new SimpleFunction<KV<Integer, PubsubMessage>, PubsubMessage>() {
                            public PubsubMessage apply(KV<Integer, PubsubMessage> input) {
                                if (input.getKey() % 2 != 0) {
                                    return input.getValue();
                                } else {
                                    return null;
                                }
                            }
                        }))
                .apply("PubSubMessagesToTableRows",
                        new PubsubProtoToIngestionRequest());

        ingestionRequests.apply(
                "WriteAnomalyToBQ",
                BQWriteTransform.newBuilder()
                        .setTableSpec(options.getTableSpec())
                        .setMethod(BigQueryIO.Write.Method.STREAMING_INSERTS)
                        .build());

        return p.run();
    }
}
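One thing to note in this version: returning null from the SimpleFunction inside MapElements will fail at runtime, because Beam coders cannot encode null elements. A minimal sketch of the same step using Filter instead (same names and types as in the pipeline above):

                .apply("filter by compression type new",
                        Filter.by(kv -> kv.getKey() % 2 != 0))
                // Drop the integer keys, keeping just the PubsubMessage values.
                .apply(Values.create())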
I specified transformNameMapping in my update script:

--update \
--transformNameMapping='{\"Reshuffle/GroupBykey\":\"\",\"filter by compression type/MapElements\":\"\",\"\":\"filter by Signature\"}' \
--jobName=test-name
The update still fails with the following error:

The new job is missing steps GroupByKey, Reshuffle/GroupByKey. If these steps have been renamed or deleted, please specify them with the update command.
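For reference, the documented shape of transformNameMapping maps old step names to new step names, with an empty string as the value (not the key) to mark a step that was deleted; note also that the mapping above says Reshuffle/GroupBykey while the error message says Reshuffle/GroupByKey, with a capital K. A mapping in the documented shape (the step names here are placeholders, not a verified fix) would look like:

--update \
--transformNameMapping='{"oldStepName":"newStepName","deletedStepName":""}' \
--jobName=test-name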
Can anyone help me find a working solution? Thanks in advance.