Apache Kafka: KafkaStreamsStateStore not working when the store value is an Avro SpecificRecord

Tags: apache-kafka, avro, apache-kafka-streams, spring-cloud-stream, confluent-schema-registry

I have a Spring Cloud Kafka Streams application that uses a StateStore from the Processor API in a transformer that performs deduplication.

The key and value types of the state store are the following:
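(The declaration below is taken from the DeduplicationTransformer shown further down in this question.)

    KeyValueStore<String, TransferEmitted>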

When running the application, the following exception is thrown as soon as a value is put into the state store (dedupStore.put(key, value)):

    Caused by: java.lang.ClassCastException: com.codependent.outboxpattern.account.TransferEmitted cannot be cast to java.lang.String

This happens because the default value serde of a KafkaStreamsStateStore is a StringSerde. So I added the valueSerde parameter to the @KafkaStreamsStateStore annotation, specifying the SpecificAvroSerde:

    @KafkaStreamsStateStore(name = DEDUP_STORE, type = KafkaStreamsStateStoreProperties.StoreType.KEYVALUE,
            valueSerde = "io.confluent.kafka.streams.serdes.avro.SpecificAvroSerde")
Now I get a NullPointerException in AbstractKafkaAvroSerializer.serializeImpl, because at id = this.schemaRegistry.getId(subject, schema) the schemaRegistry is null:

    Caused by: org.apache.kafka.common.errors.SerializationException: Error serializing Avro message
    Caused by: java.lang.NullPointerException
        at io.confluent.kafka.serializers.AbstractKafkaAvroSerializer.serializeImpl(AbstractKafkaAvroSerializer.java:82)
        at io.confluent.kafka.serializers.KafkaAvroSerializer.serialize(KafkaAvroSerializer.java:53)
        at io.confluent.kafka.streams.serdes.avro.SpecificAvroSerializer.serialize(SpecificAvroSerializer.java:65)
        at io.confluent.kafka.streams.serdes.avro.SpecificAvroSerializer.serialize(SpecificAvroSerializer.java:38)

…despite having configured the schema registry as a Spring bean:

@Configuration
class SchemaRegistryConfiguration {

    @Bean
    fun schemaRegistryClient(@Value("\${spring.cloud.stream.schema-registry-client.endpoint}") endpoint: String): SchemaRegistryClient {
        val client = ConfluentSchemaRegistryClient()
        client.setEndpoint(endpoint)
        return client
    }

}
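(Note: this bean is Spring Cloud Stream's own SchemaRegistryClient abstraction; Confluent's SpecificAvroSerde expects Confluent's io.confluent.kafka.schemaregistry.client.SchemaRegistryClient, and in any case the serde never sees this bean.)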
…because when Kafka Streams sets up the SpecificAvroSerde it uses the no-arg constructor, so the schema registry client is never initialized (the bean above is never handed to the serde):

public class SpecificAvroSerde<T extends SpecificRecord> implements Serde<T> {
    private final Serde<T> inner;

    public SpecificAvroSerde() {
        this.inner = Serdes.serdeFrom(new SpecificAvroSerializer(), new SpecificAvroDeserializer());
    }

    public SpecificAvroSerde(SchemaRegistryClient client) {
        if (client == null) {
            throw new IllegalArgumentException("schema registry client must not be null");
        } else {
            this.inner = Serdes.serdeFrom(new SpecificAvroSerializer(client), new SpecificAvroDeserializer(client));
        }
    }

    // … the rest of the class (configure, serializer, deserializer, close) is omitted here
}
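The no-arg path only works if configure() is later invoked on the serde with the registry URL in its config map. As a minimal sketch of that mechanism (manual wiring for illustration only; TransferEmitted and the endpoint come from this question, and the Confluent serde artifact is assumed to be on the classpath):

    import io.confluent.kafka.serializers.AbstractKafkaAvroSerDeConfig
    import io.confluent.kafka.streams.serdes.avro.SpecificAvroSerde

    // Create the serde via the no-arg constructor, then let configure() build
    // the schema registry client from schema.registry.url.
    val serdeConfig = mapOf(
        AbstractKafkaAvroSerDeConfig.SCHEMA_REGISTRY_URL_CONFIG to "http://localhost:8081"
    )
    val valueSerde = SpecificAvroSerde<TransferEmitted>()
    valueSerde.configure(serdeConfig, false) // isKey = false: this is the value serde

Without such a configure() call (or the client-taking constructor), serializeImpl dereferences a null schemaRegistry, which is exactly the NullPointerException above.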

I ran into this same issue and had forgotten that you need to pass in schema.registry.url to be able to store Avro records in your state store.

For example:

    @Bean
    public StoreBuilder eventStore(Map<String, String> schemaConfig) {
        final Duration windowSize = Duration.ofMinutes(DUPLICATION_WINDOW_DURATION);

        // retention period must be at least window size -- for this use case, we don't need a longer retention period
        // and thus just use the window size as retention time
        final Duration retentionPeriod = windowSize;

        // We have to specify schema.registry.url here, otherwise schemaRegistry value will end up null
        KafkaAvroSerializer serializer = new KafkaAvroSerializer();
        KafkaAvroDeserializer deserializer = new KafkaAvroDeserializer();
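        // isKey = true below: the Avro record is the window store's key (the value is a Long)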
        serializer.configure(schemaConfig, true);
        deserializer.configure(schemaConfig, true);

        final StoreBuilder<WindowStore<Object, Long>> dedupStoreBuilder = Stores.windowStoreBuilder(
                Stores.persistentWindowStore(STORE_NAME,
                        retentionPeriod,
                        windowSize,
                        false
                ),
                Serdes.serdeFrom(serializer, deserializer),
                // timestamp value is long
                Serdes.Long());
        return dedupStoreBuilder;
    }

    @Bean
    public Map<String, String> schemaConfig(@Value("${spring.cloud.stream.schemaRegistryClient.endpoint}") String url) {
        return Collections.singletonMap("schema.registry.url", url);
    }

After doing this, I was able to get this store configured correctly and no longer saw the NullPointerException.

Could you debug the application and see whether this exception is caused by any gaps in the binder? If so, please raise an issue or provide a fix. Feel free to ping us on Gitter if you want to chat.
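For reference, here are the deduplication transformer and the application.yml from the question: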
@Suppress("UNCHECKED_CAST")
class DeduplicationTransformer : Transformer<String, TransferEmitted, KeyValue<String, TransferEmitted>> {

    private lateinit var dedupStore: KeyValueStore<String, TransferEmitted>
    private lateinit var context: ProcessorContext

    override fun init(context: ProcessorContext) {
        this.context = context
        dedupStore = context.getStateStore(DEDUP_STORE) as KeyValueStore<String, TransferEmitted>
    }

    override fun transform(key: String, value: TransferEmitted): KeyValue<String, TransferEmitted>? {
        return if (isDuplicate(key)) {
            null
        } else {
            dedupStore.put(key, value)
            KeyValue(key, value)
        }
    }

    private fun isDuplicate(key: String) = dedupStore[key] != null

    override fun close() {
    }
}
spring:
  application:
    name: fraud-service
  cloud:
    stream:
      schema-registry-client:
        endpoint: http://localhost:8081
      kafka:
        streams:
          binder:
            configuration:
              application:
                id: fraud-service
              default:
                key:
                  serde: org.apache.kafka.common.serialization.Serdes$StringSerde
              schema:
                registry:
                  url: http://localhost:8081
      bindings:
        input:
          destination: transfer
          contentType: application/*+avro
        output:
          destination: fraudulent-transfer
          contentType: application/*+avro

server:
  port: 8086

logging:
  level:
    org.springframework.cloud.stream: debug
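For completeness, a sketch of how the transformer above is attached to the topology; passing DEDUP_STORE to transform() is what makes the state store available in init() (the function name and call site are illustrative, not from the original app):

    import org.apache.kafka.streams.kstream.KStream
    import org.apache.kafka.streams.kstream.TransformerSupplier

    // The store declared via @KafkaStreamsStateStore is attached to the
    // transformer by passing its name as the second argument of transform().
    fun dedupe(input: KStream<String, TransferEmitted>): KStream<String, TransferEmitted> =
        input.transform(TransformerSupplier { DeduplicationTransformer() }, DEDUP_STORE)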

The @Value expression in the schemaConfig bean above (spring.cloud.stream.schemaRegistryClient.endpoint) resolves against this configuration:

spring:
  cloud:
    stream:
      schemaRegistryClient:
        endpoint: http://localhost:8081