Flink window join doesn't work when using event time and a timestamp assigner

Tags: apache-flink, flink-streaming

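I just ran into a very strange problem: when using EventTime with a timestamp and watermark assigner, I cannot get any results from a stream window join.

I use Kafka as the source of my data streams. I tried both AscendingTimestampExtractor and a custom assigner that implements AssignerWithPeriodicWatermarks, as described in the documentation. In my tests, no watermarks were emitted and no join results were produced. If I switch to ProcessingTime and TumblingProcessingTimeWindows without any timestamp assigner, I get the correct results.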

My code with the custom timestamp and watermark assigner is as follows:

FlinkKafkaConsumer09<String> myConsumer1 =
                new FlinkKafkaConsumer09<>(myTopic1, new SimpleStringSchema(), props);
myConsumer1.assignTimestampsAndWatermarks(new MyTimestampsAndWatermarks());

FlinkKafkaConsumer09<String> myConsumer2 =
                new FlinkKafkaConsumer09<>(myTopic2, new SimpleStringSchema(), props);
myConsumer2.assignTimestampsAndWatermarks(new MyTimestampsAndWatermarks());
...
public static class MyTimestampsAndWatermarks implements AssignerWithPeriodicWatermarks<String> {
        private long currentMaxTimestamp;
        @Override
        public long extractTimestamp(String element, long previousElementTimestamp) {
            long timestamp = myFunctionToGetMillisFromString(element);
            currentMaxTimestamp = Math.max(timestamp, currentMaxTimestamp);
            return timestamp;
        }
        @Override
        public Watermark getCurrentWatermark() {
            return new Watermark(currentMaxTimestamp - 1L);
        }
}
...
DataStream<myPOJO1> stream1 = env.addSource(myConsumer1).map(new MyMapper1());
DataStream<myPOJO2> stream2 = env.addSource(myConsumer2).map(new MyMapper2());
stream1.join(stream2)
    .where(new KeySelector1())
    .equalTo(new KeySelector2())
    .window(TumblingEventTimeWindows.of(Time.seconds(windowSize)))
    .apply(new JoinFunction<AdClick, GameCreate, TransferResult>() {...});
My code with AscendingTimestampExtractor is as follows:

FlinkKafkaConsumer09<String> myConsumer1 =
                new FlinkKafkaConsumer09<>(myTopic1, new SimpleStringSchema(), props);
myConsumer1.assignTimestampsAndWatermarks(new AscendingTimestampExtractor<String>() {
    @Override
    public long extractAscendingTimestamp(String element) {
        return myFunctionToGetMillisFromString(element);
    }
});

FlinkKafkaConsumer09<String> myConsumer2 =
                new FlinkKafkaConsumer09<>(myTopic2, new SimpleStringSchema(), props);
myConsumer2.assignTimestampsAndWatermarks(new AscendingTimestampExtractor<String>() {
    @Override
    public long extractAscendingTimestamp(String element) {
        return myFunctionToGetMillisFromString(element);
    }
});
...
DataStream<myPOJO1> stream1 = env.addSource(myConsumer1).map(new MyMapper1());
DataStream<myPOJO2> stream2 = env.addSource(myConsumer2).map(new MyMapper2());
stream1.join(stream2)
    .where(new KeySelector1())
    .equalTo(new KeySelector2())
    .window(TumblingEventTimeWindows.of(Time.seconds(windowSize)))
    .apply(new JoinFunction<AdClick, GameCreate, TransferResult>() {...});

Thanks for your help.

myConsumer3 = myConsumer1.assign***
myConsumer4 = myConsumer2.assign***

Then use myConsumer3/myConsumer4 and it works.

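I had the same problem. It was a rather silly mistake, and I found the solution:

When you write:

myConsumer1.assignTimestampsAndWatermarks(new MyTimestampsAndWatermarks());
it creates a new data stream instead of modifying the existing one, and you are not storing the result in a variable. So the bottom line is:

Store the result in a new DataStream and apply the join to that stream (the one that has the timestamps and watermarks assigned).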
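
The pitfall this answer describes can be sketched in plain Java, without any Flink dependency. `FakeStream` and `withTimestamps` below are hypothetical stand-ins, not Flink classes. They only model an API whose method returns a new object and leaves the receiver untouched, which is the behaviour the answer attributes to `assignTimestampsAndWatermarks`. Whether the same holds when the method is called directly on the Kafka consumer object is not confirmed by this sketch.

```java
public class ReturnValueSketch {

    /** Hypothetical stand-in for a stream; NOT a Flink class. */
    static final class FakeStream {
        final boolean hasTimestamps;

        FakeStream(boolean hasTimestamps) {
            this.hasTimestamps = hasTimestamps;
        }

        /** Returns a NEW stream with timestamps; the receiver is left unchanged. */
        FakeStream withTimestamps() {
            return new FakeStream(true);
        }
    }

    /** Mistake: the returned stream is discarded, so the original is used downstream. */
    static FakeStream discardResult() {
        FakeStream source = new FakeStream(false);
        source.withTimestamps();          // result thrown away
        return source;                    // still has no timestamps
    }

    /** Fix: keep the returned stream and build the join on it. */
    static FakeStream keepResult() {
        FakeStream source = new FakeStream(false);
        FakeStream withTs = source.withTimestamps();
        return withTs;
    }

    public static void main(String[] args) {
        System.out.println("discarded: " + discardResult().hasTimestamps); // false
        System.out.println("kept:      " + keepResult().hasTimestamps);    // true
    }
}
```

With Flink's DataStream API, the equivalent of `keepResult` would be to assign the result of `env.addSource(myConsumer1).assignTimestampsAndWatermarks(new MyTimestampsAndWatermarks())` to a `DataStream<String>` variable and to call `map` and `join` on that variable.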

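I don't know what the problem is, but a couple of suggestions: (1) in general you are better off extending BoundedOutOfOrdernessTimestampExtractor than partially reimplementing it; (2) you can use the debugger in your IDE to check whether the stream carries watermarks, whether the windows are triggered, and if not, why.

For whatever reason, this looks like a case of the watermark not advancing. As @alpinegizmo mentioned, try extending BoundedOutOfOrdernessTimestampExtractor (or at least check manually that the events arrive in the right order). Also, try advancing the watermark according to how you want it to behave.

Thanks for the reply @alpinegizmo. I tested with a class extending BoundedOutOfOrdernessTimestampExtractor, both on the Kafka source and right before the window join, and there are no watermarks at all. I also logged the timestamps to the task manager log with slf4j, and the number of timestamps is correct. I have not found a way to print the watermarks yet. The data sent to Flink contains timestamps with different values.

The Flink web UI can display watermarks, which helps with debugging.

@alpinegizmo, yes. When I use IngestionTime with TumblingEventTimeWindows in the window join, I can see the watermarks change in the web UI, but this does not happen with EventTime. The web UI always shows "No Watermark".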
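
To make the "watermark not advancing" diagnosis from these comments concrete, here is a minimal plain-Java sketch of the rules involved. It does not use Flink; `BoundedLagWatermark`, `combinedWatermark`, and `windowFires` are hypothetical names chosen for illustration. It assumes three things: a watermark that trails the highest timestamp seen by a fixed bound (the idea behind BoundedOutOfOrdernessTimestampExtractor), a two-input operator such as a join that advances to the minimum of its input watermarks, and a tumbling event-time window that fires once the watermark reaches the window's last timestamp.

```java
public class WatermarkSketch {

    /** Tracks the highest timestamp seen and derives a watermark lagging it by a fixed bound. */
    static final class BoundedLagWatermark {
        private final long maxOutOfOrdernessMs;
        private long maxTimestamp = Long.MIN_VALUE; // no event seen yet

        BoundedLagWatermark(long maxOutOfOrdernessMs) {
            this.maxOutOfOrdernessMs = maxOutOfOrdernessMs;
        }

        void onEvent(long timestampMs) {
            maxTimestamp = Math.max(maxTimestamp, timestampMs);
        }

        long currentWatermark() {
            // Without any event there is no progress to report.
            return maxTimestamp == Long.MIN_VALUE
                    ? Long.MIN_VALUE
                    : maxTimestamp - maxOutOfOrdernessMs;
        }
    }

    /** A two-input operator can only be as far along as its slowest input. */
    static long combinedWatermark(long left, long right) {
        return Math.min(left, right);
    }

    /** The window [start, start + size) fires once the watermark reaches start + size - 1. */
    static boolean windowFires(long windowStartMs, long windowSizeMs, long watermark) {
        return watermark >= windowStartMs + windowSizeMs - 1;
    }

    public static void main(String[] args) {
        BoundedLagWatermark left = new BoundedLagWatermark(1_000L);
        BoundedLagWatermark right = new BoundedLagWatermark(1_000L);

        // Only the left input has data: the join's watermark is stuck.
        left.onEvent(12_500L);
        long stuck = combinedWatermark(left.currentWatermark(), right.currentWatermark());
        System.out.println(windowFires(0L, 10_000L, stuck)); // false

        // Once the right input also advances, the first 10 s window can fire.
        right.onEvent(11_200L);
        long moving = combinedWatermark(left.currentWatermark(), right.currentWatermark());
        System.out.println(windowFires(0L, 10_000L, moving)); // true
    }
}
```

Under these assumptions, a join produces nothing for as long as either input reports no watermark, which matches the "No Watermark" symptom in the web UI. The sketch does not identify why the watermark is missing in the question's job.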