Apache Kafka: Kafka producer in Apache Beam

Tags: apache-kafka, google-cloud-dataflow, apache-beam, apache-beam-io

How to get the records that received an acknowledgement in Apache Beam KafkaIO

Basically, I want all the records that did not get any acknowledgement to go to a BigQuery table so that they can be retried later. I used the following code snippet from the documentation:


    .apply(KafkaIO.<Long, String>read()
       .withBootstrapServers("broker_1:9092,broker_2:9092")
       .withTopic("my_topic")  // use withTopics(List<String>) to read from multiple topics.
       .withKeyDeserializer(LongDeserializer.class)
       .withValueDeserializer(StringDeserializer.class)

       // Above four are required configuration. returns PCollection<KafkaRecord<Long, String>>

       // Rest of the settings are optional :

       // you can further customize KafkaConsumer used to read the records by adding more
       // settings for ConsumerConfig. e.g :
       .updateConsumerProperties(ImmutableMap.of("group.id", "my_beam_app_1"))

       // set event times and watermark based on LogAppendTime. To provide a custom
       // policy see withTimestampPolicyFactory(). withProcessingTime() is the default.
       .withLogAppendTime()

       // restrict reader to committed messages on Kafka (see method documentation).
       .withReadCommitted()

       // offset consumed by the pipeline can be committed back.
       .commitOffsetsInFinalize()

       // finally, if you don't need Kafka metadata, you can drop it.
       .withoutMetadata() // PCollection<KV<Long, String>>
    )
    .apply(Values.<String>create()) // PCollection<String>

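By default, Beam IOs are designed to keep retrying the write/read/processing of an element until it succeeds. (A batch pipeline will fail after repeated errors.)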
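If you want unacknowledged records diverted instead of retried forever, one option is to bound the retries yourself and collect the failures. The following is only a minimal plain-Java sketch of that idea, not Beam's or Kafka's API: `DeadLetterSketch`, `sendAll`, and the `send` predicate are hypothetical names, and `send` stands in for a blocking producer call that returns true once the record is acknowledged. In a real pipeline this logic would likely live in a custom DoFn, with the dead-letter records emitted to a separate output and written to BigQuery.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical helper (not part of Beam or Kafka): bounded retry with a dead-letter list. */
public class DeadLetterSketch {

    /** Holds the records that were acknowledged and those that never were. */
    public static final class Result {
        public final List<String> acknowledged = new ArrayList<>();
        public final List<String> deadLetters = new ArrayList<>();
    }

    /**
     * Tries to send each record up to maxAttempts times.
     * `send` is an assumed stand-in for a blocking producer call;
     * it returns true when the record was acknowledged.
     */
    public static Result sendAll(List<String> records, Predicate<String> send, int maxAttempts) {
        Result result = new Result();
        for (String record : records) {
            boolean acked = false;
            for (int attempt = 0; attempt < maxAttempts && !acked; attempt++) {
                try {
                    acked = send.test(record);
                } catch (RuntimeException e) {
                    // Treat an exception like a missing acknowledgement and retry.
                    acked = false;
                }
            }
            if (acked) {
                result.acknowledged.add(record);
            } else {
                // Retries exhausted: keep the record so it can be stored and retried later.
                result.deadLetters.add(record);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Simulated sender: records starting with "bad" are never acknowledged.
        Result r = sendAll(Arrays.asList("a", "bad-1", "b"), rec -> !rec.startsWith("bad"), 3);
        System.out.println("acknowledged=" + r.acknowledged + " deadLetters=" + r.deadLetters);
    }
}
```

With the simulated sender above, "a" and "b" end up in `acknowledged` and "bad-1" in `deadLetters`; the contents of `deadLetters` are what you would write to the retry table.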




