Apache Flink: using EventTimeSessionWindows to process POJO messages from Kinesis
I am trying to use EventTimeSessionWindows while consuming JSON messages from AWS Kinesis.
What I have so far:
DataStream<SamplePojo> kinesis = env.addSource(new FlinkKinesisConsumer<>(
        "my-stream",
        new POJODeserializationSchema(),
        kinesisConsumerConfig));
DataStream<SamplePojo> aggregated = kinesis
        .keyBy("someProperty1")
        .window(EventTimeSessionWindows.withGap(Time.seconds(2L)))
        .sum("indicator");
//kinesis.print();
aggregated.print();
env.execute();
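Here POJODeserializationSchema is a custom schema that deserializes the Kinesis records into SamplePojo.
This is based on the basic example from the documentation, which receives a Tuple3: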
DataStream<Tuple3<String, Long, Integer>> aggregated = source
        .keyBy(0)
        .window(EventTimeSessionWindows.withGap(Time.milliseconds(3L)))
        .sum(2);
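But aggregated appears to be empty. Any ideas?
(kinesis.print() does show all the messages sent to Kinesis.)

You have to assign timestamps and watermarks to the stream. An event-time window only fires once the watermark passes its end, so a stream without watermarks produces no window output.
For example: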
DataStream<Tuple3<String, Long, Integer>> aggregated = source
        // the extractor's type parameter must match the element type of the stream
        .assignTimestampsAndWatermarks(new AscendingTimestampExtractor<Tuple3<String, Long, Integer>>() {...})
        .keyBy(0)
        .window(EventTimeSessionWindows.withGap(Time.milliseconds(3L)))
        .sum(2);
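To make the role of the watermark concrete, here is a minimal plain-Java sketch of the firing rule. It does not use Flink and does not reproduce Flink's internals. The class SessionFiringSketch and its methods are hypothetical names chosen for illustration, and it assumes ascending timestamps, as AscendingTimestampExtractor does. Sessions collect events, but nothing is emitted until a watermark reaches the last timestamp a session covers.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Illustration only: a toy model of "session windows fire on watermarks".
final class SessionFiringSketch {
    // One open session: exclusive end (last event timestamp + gap) and running sum.
    private static final class Session {
        long end;
        int sum;
        Session(long end, int sum) { this.end = end; this.sum = sum; }
    }

    private final long gapMillis;
    private final List<Session> open = new ArrayList<>();
    final List<Integer> emitted = new ArrayList<>();

    SessionFiringSketch(long gapMillis) { this.gapMillis = gapMillis; }

    // Assumes ascending timestamps: an event strictly inside the gap extends
    // the newest session, otherwise it starts a new one (a simplification).
    void onEvent(long timestamp, int value) {
        Session last = open.isEmpty() ? null : open.get(open.size() - 1);
        if (last != null && timestamp < last.end) {
            last.end = timestamp + gapMillis;
            last.sum += value;
        } else {
            open.add(new Session(timestamp + gapMillis, value));
        }
    }

    // A session is emitted only when the watermark reaches end - 1,
    // the last timestamp that still belongs to it.
    void onWatermark(long watermark) {
        Iterator<Session> it = open.iterator();
        while (it.hasNext()) {
            Session s = it.next();
            if (watermark >= s.end - 1) {
                emitted.add(s.sum);
                it.remove();
            }
        }
    }

    public static void main(String[] args) {
        SessionFiringSketch w = new SessionFiringSketch(2000);
        w.onEvent(0, 1);
        w.onEvent(500, 1);
        w.onEvent(5000, 1);
        System.out.println(w.emitted); // still empty: no watermark has arrived
        w.onWatermark(3000);
        System.out.println(w.emitted); // the first session (sum 2) has fired
    }
}
```

This mirrors the symptom in the question: the events reach the job, but without timestamps and watermarks no window is ever considered complete.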
Got it working with:
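// Assign event-time timestamps and watermarks from the POJO
// (getSomeProperty2() is assumed to return the event time in epoch milliseconds).
DataStream<SamplePojo> aggregated = kinesis.assignTimestampsAndWatermarks(new AscendingTimestampExtractor<SamplePojo>() {
    @Override
    public long extractAscendingTimestamp(SamplePojo samplePojo) {
        return samplePojo.getSomeProperty2();
    }
});
// Note: no aggregation is applied to this keyed window and its result is not kept,
// so print() below outputs the timestamped stream itself.
aggregated
        .keyBy((event) -> event.getSomeProperty1())
        .timeWindow(Time.seconds(1));
aggregated.print();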
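Thanks to Dawid Wysakowicz for the useful link.
Actually, what I really want is for my application to output an object that represents the whole session (SamplePojo = Event).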
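As a rough illustration of "one object per session", the plain-Java sketch below folds events into session summaries per key using a gap. It is not Flink code. Event, SessionSummary and sessionize are hypothetical names, and Event only stands in for SamplePojo with an assumed key, timestamp and indicator. In a Flink job the same fold would typically be expressed as a reduce, aggregate or process function applied to the session window.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustration only: batch sessionization by key with an inactivity gap.
final class SessionSummarySketch {
    // Hypothetical stand-in for SamplePojo.
    static final class Event {
        final String key;
        final long timestamp;
        final int indicator;
        Event(String key, long timestamp, int indicator) {
            this.key = key;
            this.timestamp = timestamp;
            this.indicator = indicator;
        }
    }

    // One output object describing a whole session.
    static final class SessionSummary {
        final String key;
        final long firstTimestamp;
        long lastTimestamp;
        int count;
        int indicatorSum;
        SessionSummary(Event first) {
            this.key = first.key;
            this.firstTimestamp = first.timestamp;
            this.lastTimestamp = first.timestamp;
            this.count = 1;
            this.indicatorSum = first.indicator;
        }
        void add(Event e) {
            lastTimestamp = e.timestamp;
            count++;
            indicatorSum += e.indicator;
        }
    }

    // Events of the same key that are less than gapMillis apart share a session.
    static List<SessionSummary> sessionize(List<Event> events, long gapMillis) {
        List<Event> sorted = new ArrayList<>(events);
        sorted.sort(Comparator.comparingLong(e -> e.timestamp));
        Map<String, SessionSummary> newestPerKey = new HashMap<>();
        List<SessionSummary> result = new ArrayList<>();
        for (Event e : sorted) {
            SessionSummary current = newestPerKey.get(e.key);
            if (current == null || e.timestamp - current.lastTimestamp >= gapMillis) {
                current = new SessionSummary(e); // gap exceeded: start a new session
                newestPerKey.put(e.key, current);
                result.add(current);
            } else {
                current.add(e);
            }
        }
        return result; // ordered by session start time
    }
}
```

Calling sessionize with a 2000 ms gap on events a@0, a@1000, b@1500 and a@10000 yields three summaries: one for key a covering 0 to 1000, one for key b, and a second one for key a starting at 10000.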