Apache Kafka: How to process the latest record for a key within a given time range in Kafka Streams?


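Description: I want to process only the latest unique event for each key. I have a Kafka Streams application. Suppose I receive these events in Kafka Streams:

    {id= "DELHI", event1},
    {id= "MUMBAI", event2},
    {id= "DELHI", event3},
    {id= "JAIPUR", event4},
    {id= "MUMBAI", event5}

Now I want to group them (say, over 10 minutes) so that, within the given time range, I have only the latest event for each key.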

**Expected events:**

    {id= "DELHI", event3},
    {id= "MUMBAI", event5},
    {id= "JAIPUR", event4}

**Events output by my current implementation:**

    {id= "DELHI", event1},
    {id= "MUMBAI", event2},
    {id= "JAIPUR", event4}

The remaining events are marked as duplicates.

        Properties streamsConfiguration = this.buildKafkaProperties();
        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, LocationChangeEvent> kStream = builder.stream(
            this.kafkaConfigProperties.getTopicName(),
            Consumed.with(Serdes.String(), locationChangeEventSerde));

        final StoreBuilder<WindowStore<String, LocationChangeEvent>> dedupStoreBuilder = Stores.windowStoreBuilder(
            Stores.persistentWindowStore(storeName, retentionPeriod, numSegment, minutes, false),
            Serdes.String(),
            serde);
        builder.addStateStore(dedupStoreBuilder);

        kStream.filter((key, value) -> value.getChangeId() != null)
            .transformValues(() -> new DeduplicationTransformer<>(windowSize.toMillis(), (key, value) -> key), storeName)
            .filter((k, v) -> v != null)
            .foreach((requestId, object) -> {
                // The function below pushes the event to the consumer
                this.streamKafkaFunction(object);
            });

        KafkaStreams streams = new KafkaStreams(builder.build(), streamsConfiguration);
        streams.start();
    }

    private static class DeduplicationTransformer<K, V, E> implements ValueTransformerWithKey<K, V, V> {
        private ProcessorContext context;
        private WindowStore<E, Long> eventIdStore;
        private final long leftDurationMs;
        private final long rightDurationMs;
        private final KeyValueMapper<K, V, E> idExtractor;

        DeduplicationTransformer(final long maintainDurationPerEventInMs, final KeyValueMapper<K, V, E> idExtractor) {
            if (maintainDurationPerEventInMs < 1) {
                throw new IllegalArgumentException("maintain duration per event must be >= 1");
            }
            leftDurationMs = maintainDurationPerEventInMs;
            rightDurationMs = maintainDurationPerEventInMs;
            this.idExtractor = idExtractor;
        }

        @SuppressWarnings("unchecked")
        @Override
        public void init(ProcessorContext context) {
            this.context = context;
            eventIdStore = (WindowStore<E, Long>) context.getStateStore(storeName);
        }

        @Override
        public V transform(final K key, final V value) {
            final E eventId = idExtractor.apply(key, value);
            LOGGER.info("Event is : {}", eventId);
            if (eventId == null) {
                return value;
            } else {
                final V output;
                if (isDuplicate(eventId)) {
                    // An event with this id was already seen in the window: drop it
                    output = null;
                } else {
                    // First event with this id in the window: forward and remember it
                    output = value;
                    rememberNewEvent(eventId, context.timestamp());
                }
                return output;
            }
        }

        private boolean isDuplicate(final E eventId) {
            final long eventTime = context.timestamp();
            final WindowStoreIterator<Long> timeIterator = eventIdStore.fetch(
                eventId,
                eventTime - leftDurationMs,
                eventTime + rightDurationMs);
            final boolean isDuplicate = timeIterator.hasNext();
            timeIterator.close();
            return isDuplicate;
        }

        private void rememberNewEvent(final E eventId, final long timestamp) {
            eventIdStore.put(eventId, timestamp, timestamp);
        }

        @Override
        public void close() {
        }
    }

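With the code above, I am able to push the first unique event to the consumer and mark all further events with the same key within the given time as duplicates. However, I don't want to send the first event. I want to send the latest event for each key, i.e. the last event for each key within that time range.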
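
To make the difference between the two behaviours concrete, here is a minimal plain-Java sketch with no Kafka dependency. It only models which event "wins" for a key inside a single window. The `SelectionPolicies` class and its `Event` type are hypothetical stand-ins (they are not part of the code above or of Kafka Streams), and the sample data is the five events from the description.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model (no Kafka dependency) of the two selection policies.
// Event is a hypothetical stand-in for LocationChangeEvent.
final class SelectionPolicies {

    static final class Event {
        final String id;
        final String name;

        Event(String id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    // First-wins: the first event per key is kept and every later event
    // with the same key is dropped, like the deduplication transformer.
    static List<String> firstPerKey(List<Event> events) {
        Map<String, String> kept = new LinkedHashMap<>();
        for (Event e : events) {
            kept.putIfAbsent(e.id, e.name);
        }
        return new ArrayList<>(kept.values());
    }

    // Last-wins: a later event with the same key replaces the earlier one.
    static List<String> lastPerKey(List<Event> events) {
        Map<String, String> kept = new LinkedHashMap<>();
        for (Event e : events) {
            kept.put(e.id, e.name);
        }
        return new ArrayList<>(kept.values());
    }

    public static void main(String[] args) {
        List<Event> events = Arrays.asList(
            new Event("DELHI", "event1"),
            new Event("MUMBAI", "event2"),
            new Event("DELHI", "event3"),
            new Event("JAIPUR", "event4"),
            new Event("MUMBAI", "event5"));

        System.out.println(firstPerKey(events)); // [event1, event2, event4]
        System.out.println(lastPerKey(events));  // [event3, event5, event4]
    }
}
```

A first-wins filter can forward an event as soon as it arrives. A last-wins result is only known once the window is over, so the output necessarily has to be held back until the window closes.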


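Here is a simple example showing how to use a windowed aggregation together with the suppress operator so that, for each key, only the last value received is emitted, and only once the window closes.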

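    var builder = new StreamsBuilder();
    builder.stream("my-input-topic")
        .groupByKey()
        // 10-minute tumbling windows
        .windowedBy(TimeWindows.of(Duration.ofMinutes(10)))
        // keep only the most recent value per key and window
        .reduce((value1, value2) -> value2)
        // hold results back until the window closes, then emit one record per key
        .suppress(Suppressed.untilWindowCloses(Suppressed.BufferConfig.unbounded()));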
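
To illustrate what this windowed reduce with suppression does, here is a rough plain-Java model of the semantics. It is not Kafka Streams code. `LastPerWindowModel` and its methods are hypothetical names, and the model assumes non-negative millisecond timestamps, a single partition, and no grace period, so a record belonging to an already closed window is simply dropped.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of "tumbling window + keep-last reduce + emit on window close".
// It mirrors the semantics only; it does not use any Kafka Streams API.
final class LastPerWindowModel {
    private final long windowSizeMs;
    private long currentWindowStart = -1L;
    private final Map<String, String> latest = new LinkedHashMap<>();
    private final List<String> emitted = new ArrayList<>();

    LastPerWindowModel(long windowSizeMs) {
        this.windowSizeMs = windowSizeMs;
    }

    void onEvent(String key, String value, long timestampMs) {
        // Start of the tumbling window this record falls into.
        long windowStart = timestampMs - (timestampMs % windowSizeMs);
        if (windowStart < currentWindowStart) {
            return; // late record for a closed window: dropped in this model
        }
        if (windowStart > currentWindowStart) {
            // Time has moved into a newer window: the previous one closes.
            flush();
            currentWindowStart = windowStart;
        }
        // Same role as reduce((value1, value2) -> value2): the newest value wins.
        latest.put(key, value);
    }

    // Emits the final value of every key in the closed window, then resets.
    private void flush() {
        for (Map.Entry<String, String> e : latest.entrySet()) {
            emitted.add(e.getKey() + "=" + e.getValue());
        }
        latest.clear();
    }

    List<String> emitted() {
        return emitted;
    }

    public static void main(String[] args) {
        LastPerWindowModel model = new LastPerWindowModel(10 * 60 * 1000L);
        model.onEvent("DELHI", "event1", 0L);
        model.onEvent("MUMBAI", "event2", 60_000L);
        model.onEvent("DELHI", "event3", 120_000L);
        model.onEvent("JAIPUR", "event4", 180_000L);
        model.onEvent("MUMBAI", "event5", 240_000L);
        System.out.println(model.emitted()); // [] (the window is still open)

        // A record in the next window advances time and closes the first one.
        model.onEvent("DELHI", "event6", 600_000L);
        System.out.println(model.emitted()); // [DELHI=event3, MUMBAI=event5, JAIPUR=event4]
    }
}
```

The model also shows a practical consequence of suppressing until the window closes: nothing is emitted for a window until a record with a later timestamp arrives. Kafka Streams behaves similarly because window closing is driven by stream time rather than wall-clock time, so on a quiet topic the last window's results can appear to be stuck until new input arrives.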