Serialization: How to use the AVRO serializer with the Schema Registry from a Kafka Connect SourceTask


I have set up the Confluent data platform and started developing a SourceConnector. In the corresponding SourceTask.poll() method I do the following (pseudo-Java code below):

public List<SourceRecord> poll() throws InterruptedException {
    ....
    Envelope envelope = new Envelope();
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    Encoder enc = EncoderFactory.get().binaryEncoder(out, null);
    DatumWriter<Envelope> dw = new ReflectDatumWriter<Envelope>(Envelope.class);
    dw.write(envelope, enc);
    enc.flush();
    out.close();

    Map<String, String> sourcePartition = new HashMap<>();
    sourcePartition.put("stream", streamName);
    Map<String, Integer> sourceOffset = new HashMap<>();
    sourceOffset.put("position", Integer.parseInt(envelope.getTimestamp()));

    records.add(new SourceRecord(sourcePartition, sourceOffset, topic,
            org.apache.kafka.connect.data.Schema.BYTES_SCHEMA, out.toByteArray()));
    ....
I would like to use the Schema Registry, so that the object to be serialized is tagged with the schema id from the registry, then serialized, and then published to the Kafka topic via poll(). If the schema of a given object is not yet in the registry, I would like it to be registered and the generated id returned to the serializing process, so that the id becomes part of the serialized object and the object can later be deserialized.


What do I need to do in the code above to achieve this?

To use the Schema Registry, you have to serialize/deserialize your data with the classes provided by Confluent:

  • io.confluent.kafka.serializers.KafkaAvroSerializer
  • io.confluent.kafka.serializers.KafkaAvroDeserializer

These classes contain all the logic for registering schemas with the registry and requesting them from it.
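
For example, here is a minimal sketch of using the serializer directly; the topic name, registry URL, and schema below are illustrative, not from the question. On serialize, the class looks the schema up in the registry, registers it if it is not there yet, and prepends the returned id to the Avro payload (one magic byte, then the 4-byte schema id, then the Avro binary data), which is exactly the tagging behavior the question asks for:

import io.confluent.kafka.serializers.KafkaAvroSerializer;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;

import java.util.Collections;

public class DirectSerializeSketch {
    public static void main(String[] args) {
        KafkaAvroSerializer serializer = new KafkaAvroSerializer();
        // false = configure for record values (true would mean record keys)
        serializer.configure(
                Collections.singletonMap("schema.registry.url", "http://localhost:8081"),
                false);

        // Illustrative schema; in the question this would describe Envelope
        Schema schema = new Schema.Parser().parse(
                "{\"type\":\"record\",\"name\":\"Envelope\"," +
                "\"fields\":[{\"name\":\"timestamp\",\"type\":\"string\"}]}");
        GenericRecord envelope = new GenericData.Record(schema);
        envelope.put("timestamp", "1234");

        // Registers the schema under subject "my-topic-value" if absent,
        // then returns magic byte + 4-byte schema id + Avro binary payload
        byte[] bytes = serializer.serialize("my-topic", envelope);
        System.out.println("Serialized " + bytes.length + " bytes");
    }
}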

If you use Maven, you can add this dependency:

<dependency>
  <groupId>io.confluent</groupId>
  <artifactId>kafka-avro-serializer</artifactId>
  <version>2.0.1</version>
</dependency>
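
Note that Confluent's artifacts are not on Maven Central; they are served from Confluent's own Maven repository, so you may also need to declare it in your POM:

<repositories>
  <repository>
    <id>confluent</id>
    <url>https://packages.confluent.io/maven/</url>
  </repository>
</repositories>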

Check out an example implementation.

You will need the following dependencies from Confluent to make this work:

    <dependency>
        <groupId>io.confluent</groupId>
        <artifactId>common-config</artifactId>
        <version>3.0.0</version>
    </dependency>
    <dependency>
        <groupId>io.confluent</groupId>
        <artifactId>common-utils</artifactId>
        <version>3.0.0</version>
    </dependency>
    <dependency>
        <groupId>io.confluent</groupId>
        <artifactId>kafka-schema-registry-client</artifactId>
        <version>3.0.0</version>
    </dependency>
    <dependency>
        <groupId>io.confluent</groupId>
        <artifactId>kafka-avro-serializer</artifactId>
        <version>3.0.0</version>
    </dependency>

As per:

In the POM:

<dependency>
    <groupId>io.confluent</groupId>
    <artifactId>kafka-avro-serializer</artifactId>
    <version>3.3.1</version>
</dependency>
<dependency>
    <groupId>org.apache.kafka</groupId>
    <artifactId>kafka_2.11</artifactId>
    <version>0.11.0.1-cp1</version>
    <scope>provided</scope>
</dependency>
Using the producer:

Properties props = new Properties();
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
          io.confluent.kafka.serializers.KafkaAvroSerializer.class);
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
          io.confluent.kafka.serializers.KafkaAvroSerializer.class);
props.put("schema.registry.url", "http://localhost:8081");
// Set any other properties

KafkaProducer<String, User> producer = new KafkaProducer<>(props);

User user1 = new User();
user1.setName("Alyssa");
user1.setFavoriteNumber(256);

// send() takes a ProducerRecord; the topic name "users" is illustrative
Future<RecordMetadata> resultFuture =
        producer.send(new ProducerRecord<>("users", user1));
In your registry, for this example, you will need the schema for "User".
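
If you want to manage that registration from code rather than letting the serializer do it on first send, the kafka-schema-registry-client dependency shown earlier can be used directly. A sketch, assuming the default subject naming ("<topic>-value", so "users-value" for the illustrative topic above) and an illustrative User schema:

import io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient;
import io.confluent.kafka.schemaregistry.client.SchemaRegistryClient;
import org.apache.avro.Schema;

public class RegisterUserSchemaSketch {
    public static void main(String[] args) throws Exception {
        // 100 = how many schema ids the client caches locally
        SchemaRegistryClient client =
                new CachedSchemaRegistryClient("http://localhost:8081", 100);

        // Illustrative Avro schema matching the generated User class above
        Schema userSchema = new Schema.Parser().parse(
                "{\"type\":\"record\",\"name\":\"User\",\"fields\":[" +
                "{\"name\":\"name\",\"type\":\"string\"}," +
                "{\"name\":\"favorite_number\",\"type\":\"int\"}]}");

        // register() returns the registry-assigned schema id; the serializers
        // embed this id in every message they produce
        int id = client.register("users-value", userSchema);
        System.out.println("Registered User schema with id " + id);
    }
}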

Confluent also has this example:

package io.confluent.examples.producer;

import JavaSessionize.avro.LogLine;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

import java.util.Properties;
import java.util.Random;

public class AvroClicksProducer {

    public static void main(String[] args) throws InterruptedException {
        if (args.length != 1) {
            System.out.println("Please provide command line arguments: schemaRegistryUrl");
            System.exit(-1);
        }

        String schemaUrl = args[0];

        Properties props = new Properties();
        // hardcoding the Kafka server URI for this example
        props.put("bootstrap.servers", "localhost:9092");
        props.put("acks", "all");
        props.put("retries", 0);
        props.put("key.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
        props.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
        props.put("schema.registry.url", schemaUrl);

        // Hard coding topic too.
        String topic = "clicks";

        // Hard coding wait between events so demo experience will be uniformly nice
        int wait = 500;

        Producer<String, LogLine> producer = new KafkaProducer<String, LogLine>(props);

        // We keep producing new events and waiting between them until someone ctrl-c
        while (true) {
            LogLine event = EventGenerator.getNext();
            System.out.println("Generated event " + event.toString());

            // Using IP as key, so events from same IP will go to same partition
            ProducerRecord<String, LogLine> record = new ProducerRecord<String, LogLine>(topic, event.getIp().toString(), event);
            producer.send(record);
            Thread.sleep(wait);
        }
    }
}
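
For the deserialization side mentioned in the question, the consumer configuration is symmetric; here is a sketch (group id is illustrative). Setting specific.avro.reader to true is what makes the deserializer return generated classes such as LogLine instead of GenericRecord:

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

import java.util.Collections;
import java.util.Properties;

public class AvroClicksConsumerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "clicks-reader");
        props.put("key.deserializer", "io.confluent.kafka.serializers.KafkaAvroDeserializer");
        props.put("value.deserializer", "io.confluent.kafka.serializers.KafkaAvroDeserializer");
        props.put("schema.registry.url", "http://localhost:8081");
        // Return generated classes (e.g. LogLine) instead of GenericRecord
        props.put("specific.avro.reader", "true");

        KafkaConsumer<String, Object> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Collections.singletonList("clicks"));

        while (true) {
            ConsumerRecords<String, Object> records = consumer.poll(500);
            for (ConsumerRecord<String, Object> record : records) {
                // The deserializer reads the embedded schema id and fetches
                // the writer schema from the registry automatically
                System.out.println(record.key() + " -> " + record.value());
            }
        }
    }
}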

Hi! So where in the pseudo-code snippet I provided should the serialization go, so that KafkaAvroSerializer is actually used? Also, I don't need to specify common-config and common-utils, since Maven pulls them in automatically.
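
One thing worth spelling out for the SourceTask case: in Kafka Connect, serialization is normally done by the worker's configured converter, not inside poll(). With Confluent's AvroConverter configured on the worker (key.converter/value.converter set to io.confluent.connect.avro.AvroConverter, plus the matching *.converter.schema.registry.url), poll() just returns structured data and the registry interaction happens for free. A sketch, with a hypothetical two-field Envelope layout:

import org.apache.kafka.connect.data.Schema;
import org.apache.kafka.connect.data.SchemaBuilder;
import org.apache.kafka.connect.data.Struct;
import org.apache.kafka.connect.source.SourceRecord;

import java.util.Collections;

public class EnvelopeRecords {

    // Connect schema for the envelope; the AvroConverter maps it to an Avro
    // schema and registers it with the Schema Registry on first use.
    // The field names here are hypothetical placeholders.
    static final Schema ENVELOPE_SCHEMA = SchemaBuilder.struct()
            .name("Envelope")
            .field("timestamp", Schema.STRING_SCHEMA)
            .field("payload", Schema.BYTES_SCHEMA)
            .build();

    static SourceRecord toSourceRecord(String streamName, String topic,
                                       String timestamp, byte[] payload) {
        // Structured value instead of pre-serialized bytes; no Avro code
        // runs in the task itself
        Struct value = new Struct(ENVELOPE_SCHEMA)
                .put("timestamp", timestamp)
                .put("payload", payload);
        return new SourceRecord(
                Collections.singletonMap("stream", streamName),
                Collections.singletonMap("position", Integer.parseInt(timestamp)),
                topic, ENVELOPE_SCHEMA, value);
    }
}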