Apache Kafka: KafkaStreamsStateStore not working when the store value is an Avro SpecificRecord
I have a Spring Cloud Kafka Streams application that performs deduplication with a transformer, which uses a StateStore from the Processor API. The state store holds String keys and TransferEmitted values (see the transformer code below).

When running the application, I get the following exception as soon as a value is put into the state store (dedupStore.put(key, value)):

Caused by: java.lang.ClassCastException: com.codependent.outboxpattern.account.TransferEmitted cannot be cast to java.lang.String
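The cast failure is easy to reproduce outside Kafka: a serializer typed for String blindly casts whatever object it is handed before encoding it. A minimal plain-Java sketch of that behavior (FakeTransfer is a made-up stand-in for the generated Avro class, not the real TransferEmitted):

```java
import java.nio.charset.StandardCharsets;

public class CastDemo {
    // Stand-in for the generated Avro class; not the real TransferEmitted.
    static class FakeTransfer {}

    // Mimics what a String-typed serializer does internally: cast, then encode.
    static byte[] serializeAsString(Object value) {
        return ((String) value).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        try {
            serializeAsString(new FakeTransfer());
            System.out.println("no exception");
        } catch (ClassCastException e) {
            // Same failure mode as handing an Avro record to a StringSerde.
            System.out.println("ClassCastException");
        }
    }
}
```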
This happens because the default value serde of KafkaStreamsStateStore is a StringSerde.

So I added the valueSerde parameter to the @KafkaStreamsStateStore annotation, pointing it at SpecificAvroSerde:
@KafkaStreamsStateStore(name = DEDUP_STORE, type = KafkaStreamsStateStoreProperties.StoreType.KEYVALUE,
        valueSerde = "io.confluent.kafka.streams.serdes.avro.SpecificAvroSerde")
Now I get a NullPointerException in AbstractKafkaAvroSerializer.serializeImpl, because schemaRegistry is null at id = this.schemaRegistry.getId(subject, schema):
Caused by: org.apache.kafka.common.errors.SerializationException: Error serializing Avro message
Caused by: java.lang.NullPointerException
    at io.confluent.kafka.serializers.AbstractKafkaAvroSerializer.serializeImpl(AbstractKafkaAvroSerializer.java:82)
    at io.confluent.kafka.serializers.KafkaAvroSerializer.serialize(KafkaAvroSerializer.java:53)
    at io.confluent.kafka.streams.serdes.avro.SpecificAvroSerializer.serialize(SpecificAvroSerializer.java:65)
    at io.confluent.kafka.streams.serdes.avro.SpecificAvroSerializer.serialize(SpecificAvroSerializer.java:38)
This happens despite having configured the schema registry as a Spring bean:
@Configuration
class SchemaRegistryConfiguration {

    @Bean
    fun schemaRegistryClient(@Value("\${spring.cloud.stream.schema-registry-client.endpoint}") endpoint: String): SchemaRegistryClient {
        val client = ConfluentSchemaRegistryClient()
        client.setEndpoint(endpoint)
        return client
    }
}
The problem is that when Kafka Streams sets up the SpecificAvroSerde, it uses the no-argument constructor, so the schema registry client is never initialized:
public class SpecificAvroSerde<T extends SpecificRecord> implements Serde<T> {

    private final Serde<T> inner;

    public SpecificAvroSerde() {
        this.inner = Serdes.serdeFrom(new SpecificAvroSerializer(), new SpecificAvroDeserializer());
    }

    public SpecificAvroSerde(SchemaRegistryClient client) {
        if (client == null) {
            throw new IllegalArgumentException("schema registry client must not be null");
        } else {
            this.inner = Serdes.serdeFrom(new SpecificAvroSerializer(client), new SpecificAvroDeserializer(client));
        }
    }
    // ...
}
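When a serde is named by class (as in the annotation above), Kafka Streams instantiates it reflectively through the no-arg constructor and then calls configure(configs, isKey) on it, so any state the constructor leaves null must be filled in during configure(). A toy plain-Java model of that lifecycle (FakeRegistrySerde is invented for illustration, not Confluent's class):

```java
import java.util.Map;

public class SerdeLifecycleDemo {
    // Toy stand-in for a Confluent-style serde: the no-arg constructor
    // leaves the registry URL unset, like SpecificAvroSerde above.
    static class FakeRegistrySerde {
        String registryUrl; // stays null until configure() runs

        void configure(Map<String, ?> configs, boolean isKey) {
            registryUrl = (String) configs.get("schema.registry.url");
        }

        byte[] serialize(String value) {
            if (registryUrl == null) {
                // Same situation as the NullPointerException in the stack trace.
                throw new NullPointerException("schema registry not configured");
            }
            return value.getBytes(java.nio.charset.StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws Exception {
        // Kafka Streams does roughly this: reflective no-arg instantiation...
        FakeRegistrySerde serde = FakeRegistrySerde.class.getDeclaredConstructor().newInstance();
        // ...followed by configure() with the relevant config map.
        serde.configure(Map.of("schema.registry.url", "http://localhost:8081"), false);
        System.out.println(serde.serialize("ok").length);
    }
}
```

If configure() is never called with a map containing schema.registry.url, the registry client stays null, which is exactly the failure mode above.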
I ran into the same problem: I had forgotten that you need to pass schema.registry.url to the serializer/deserializer so that you can store Avro records in the state store.
例如:
@Bean
public StoreBuilder eventStore(Map<String, String> schemaConfig) {
    final Duration windowSize = Duration.ofMinutes(DUPLICATION_WINDOW_DURATION);

    // Retention period must be at least the window size -- for this use case we don't need
    // a longer retention period, and thus just use the window size as retention time.
    final Duration retentionPeriod = windowSize;

    // We have to specify schema.registry.url here, otherwise the schemaRegistry value will end up null.
    KafkaAvroSerializer serializer = new KafkaAvroSerializer();
    KafkaAvroDeserializer deserializer = new KafkaAvroDeserializer();
    serializer.configure(schemaConfig, true);
    deserializer.configure(schemaConfig, true);

    final StoreBuilder<WindowStore<Object, Long>> dedupStoreBuilder = Stores.windowStoreBuilder(
            Stores.persistentWindowStore(STORE_NAME,
                    retentionPeriod,
                    windowSize,
                    false
            ),
            Serdes.serdeFrom(serializer, deserializer),
            // timestamp value is a long
            Serdes.Long());
    return dedupStoreBuilder;
}
@Bean
public Map<String, String> schemaConfig(@Value("${spring.cloud.stream.schemaRegistryClient.endpoint}") String url) {
    // Use the injected endpoint rather than a hardcoded URL.
    return Collections.singletonMap("schema.registry.url", url);
}
After doing this, I was able to configure the store correctly and no longer saw the NullPointerException.

Could you debug the application and see whether the exception is caused by any gap in the binder? If so, please open an issue or provide a fix. If you'd like to chat, feel free to ping us on Gitter.
@Suppress("UNCHECKED_CAST")
class DeduplicationTransformer : Transformer<String, TransferEmitted, KeyValue<String, TransferEmitted>> {

    private lateinit var dedupStore: KeyValueStore<String, TransferEmitted>
    private lateinit var context: ProcessorContext

    override fun init(context: ProcessorContext) {
        this.context = context
        dedupStore = context.getStateStore(DEDUP_STORE) as KeyValueStore<String, TransferEmitted>
    }

    override fun transform(key: String, value: TransferEmitted): KeyValue<String, TransferEmitted>? {
        return if (isDuplicate(key)) {
            null
        } else {
            dedupStore.put(key, value)
            KeyValue(key, value)
        }
    }

    private fun isDuplicate(key: String) = dedupStore[key] != null

    override fun close() {
    }
}
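The deduplication contract itself is independent of Kafka: forward a record only if its key hasn't been seen before, and remember the key otherwise. A plain-Java sketch of the same logic, with a HashMap standing in for the state store:

```java
import java.util.HashMap;
import java.util.Map;

public class DedupDemo {
    // HashMap stands in for the persistent KeyValueStore.
    private final Map<String, String> dedupStore = new HashMap<>();

    // Returns the value to forward downstream, or null to drop a duplicate,
    // mirroring DeduplicationTransformer.transform().
    public String transform(String key, String value) {
        if (dedupStore.get(key) != null) {
            return null; // duplicate key: drop
        }
        dedupStore.put(key, value);
        return value; // first occurrence: forward
    }

    public static void main(String[] args) {
        DedupDemo demo = new DedupDemo();
        System.out.println(demo.transform("tx-1", "a")); // first time: forwarded
        System.out.println(demo.transform("tx-1", "b")); // duplicate key: dropped
    }
}
```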
spring:
  application:
    name: fraud-service
  cloud:
    stream:
      schema-registry-client:
        endpoint: http://localhost:8081
      kafka:
        streams:
          binder:
            configuration:
              application:
                id: fraud-service
              default:
                key:
                  serde: org.apache.kafka.common.serialization.Serdes$StringSerde
              schema:
                registry:
                  url: http://localhost:8081
      bindings:
        input:
          destination: transfer
          contentType: application/*+avro
        output:
          destination: fraudulent-transfer
          contentType: application/*+avro
server:
  port: 8086
logging:
  level:
    org.springframework.cloud.stream: debug
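For reference, the same configuration can be written in flat .properties form, which makes the nesting under binder.configuration easier to see at a glance. This is a mechanical translation of the YAML above (a sketch; adapt the keys to your binder version):

```properties
# Flat .properties equivalent of the YAML configuration above.
spring.application.name=fraud-service
spring.cloud.stream.schema-registry-client.endpoint=http://localhost:8081
spring.cloud.stream.kafka.streams.binder.configuration.application.id=fraud-service
spring.cloud.stream.kafka.streams.binder.configuration.default.key.serde=org.apache.kafka.common.serialization.Serdes$StringSerde
spring.cloud.stream.kafka.streams.binder.configuration.schema.registry.url=http://localhost:8081
spring.cloud.stream.bindings.input.destination=transfer
spring.cloud.stream.bindings.input.contentType=application/*+avro
spring.cloud.stream.bindings.output.destination=fraudulent-transfer
spring.cloud.stream.bindings.output.contentType=application/*+avro
server.port=8086
logging.level.org.springframework.cloud.stream=debug
```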
spring:
  cloud:
    stream:
      schemaRegistryClient:
        endpoint: http://localhost:8081