Apache Flink: why can the same events of a PatternStream be sent at the same time to the PatternSelectFunction and to the PatternTimeoutFunction?

Within a given amount of time I have to collect 3 events coming from 3 Kafka streams that share the same correlationId, and if the events arrive late I still need to be able to collect all or only some of them.

I used a union of the 3 data streams together with a CEP pattern. However, I noticed that events that fully match the pattern, and are therefore collected in the select function, are also delivered to the timeout function once the timeout is reached.

I don't know what I am doing wrong in my example, or what I am misunderstanding, but I would expect events that matched successfully not to be timed out as well.

My impression is that disjoint time snapshots are being stored.

I am using Flink version 1.3.0.

Thanks for your help.

Console output, where we can see that 2 of the 3 correlated events are both selected and timed out:

Matching events:
Key---0b3c116e-0703-43cb-8b3e-54b0b5e93948
Key---f969dd4d-47ff-445c-9182-0f95a569febb
Key---2ecbb89d-1463-4669-a657-555f73b6fb1d

Timed out events:

First call of the timeout function:
Key---f969dd4d-47ff-445c-9182-0f95a569febb
Key---0b3c116e-0703-43cb-8b3e-54b0b5e93948

Second call:
Key---f969dd4d-47ff-445c-9182-0f95a569febb

11:01:44,677 INFO  com.bnpp.pe.cep.Main                                          - Matching events:
11:01:44,678 INFO  com.bnpp.pe.cep.Main                                          - SctRequestProcessStep2Event(super=SctRequestEvent(correlationId=cId---a14a4e23-56c5-4242-9c43-d465d2b84454, key=Key---0b3c116e-0703-43cb-8b3e-54b0b5e93948, debtorIban=BE42063929068055, creditorIban=BE42063929068056, amount=100.0, communication=test), succeeded=false)
11:01:44,678 INFO  com.bnpp.pe.cep.Main                                          - SctRequestProcessStep1Event(super=SctRequestEvent(correlationId=cId---a14a4e23-56c5-4242-9c43-d465d2b84454, key=Key---2ecbb89d-1463-4669-a657-555f73b6fb1d, debtorIban=BE42063929068055, creditorIban=BE42063929068056, amount=100.0, communication=test), succeeded=false)
11:01:44,678 INFO  com.bnpp.pe.cep.Main                                          - SctRequestProcessStep3Event(super=SctRequestEvent(correlationId=cId---a14a4e23-56c5-4242-9c43-d465d2b84454, key=Key---f969dd4d-47ff-445c-9182-0f95a569febb, debtorIban=BE42063929068055, creditorIban=BE42063929068056, amount=100.0, communication=test), succeeded=false)
Right(SctRequestFinalEvent(super=SctRequestEvent(correlationId=cId---a14a4e23-56c5-4242-9c43-d465d2b84454, key=Key---2196fdb0-01e8-4cc6-af4b-04bcf9dc67a2, debtorIban=null, creditorIban=null, amount=null, communication=null), state=SUCCESS))
11:01:49,635 INFO  com.bnpp.pe.cep.Main                                          - Timed out events:
11:01:49,636 INFO  com.bnpp.pe.cep.Main                                          - SctRequestProcessStep3Event(super=SctRequestEvent(correlationId=cId---a14a4e23-56c5-4242-9c43-d465d2b84454, key=Key---f969dd4d-47ff-445c-9182-0f95a569febb, debtorIban=BE42063929068055, creditorIban=BE42063929068056, amount=100.0, communication=test), succeeded=false)
11:01:49,636 INFO  com.bnpp.pe.cep.Main                                          - SctRequestProcessStep2Event(super=SctRequestEvent(correlationId=cId---a14a4e23-56c5-4242-9c43-d465d2b84454, key=Key---0b3c116e-0703-43cb-8b3e-54b0b5e93948, debtorIban=BE42063929068055, creditorIban=BE42063929068056, amount=100.0, communication=test), succeeded=false)
11:01:49,636 INFO  com.bnpp.pe.cep.Main                                          - Timed out events:
11:01:49,636 INFO  com.bnpp.pe.cep.Main                                          - SctRequestProcessStep3Event(super=SctRequestEvent(correlationId=cId---a14a4e23-56c5-4242-9c43-d465d2b84454, key=Key---f969dd4d-47ff-445c-9182-0f95a569febb, debtorIban=BE42063929068055, creditorIban=BE42063929068056, amount=100.0, communication=test), succeeded=false)
Left(SctRequestFinalEvent(super=SctRequestEvent(correlationId=cId---a14a4e23-56c5-4242-9c43-d465d2b84454, key=Key---aa437bcf-ecaa-4561-9f4e-08a902f0e248, debtorIban=null, creditorIban=null, amount=null, communication=null), state=FAILED))
Left(SctRequestFinalEvent(super=SctRequestEvent(correlationId=cId---a14a4e23-56c5-4242-9c43-d465d2b84454, key=Key---5420eb41-2723-42ac-83fd-d203d6bf2526, debtorIban=null, creditorIban=null, amount=null, communication=null), state=FAILED))
My test code:

package com.bnpp.pe.cep;

import com.bnpp.pe.event.Event;
import com.bnpp.pe.event.SctRequestFinalEvent;
import com.bnpp.pe.util.EventHelper;
import lombok.extern.slf4j.Slf4j;
import org.apache.flink.api.java.functions.KeySelector;
import org.apache.flink.cep.CEP;
import org.apache.flink.cep.PatternSelectFunction;
import org.apache.flink.cep.PatternStream;
import org.apache.flink.cep.PatternTimeoutFunction;
import org.apache.flink.cep.pattern.Pattern;
import org.apache.flink.streaming.api.TimeCharacteristic;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer010;
import org.apache.flink.streaming.util.serialization.DeserializationSchema;

import java.io.Serializable;
import java.util.List;
import java.util.Map;
import java.util.Properties;

/**
 * Created by Laurent Bauchau on 2/08/2017.
 */
@Slf4j
public class Main implements Serializable {

    public static void main(String... args) {
        new Main();
    }

    public static final String step1Topic = "sctinst-step1";
    public static final String step2Topic = "sctinst-step2";
    public static final String step3Topic = "sctinst-step3";

    private static final String PATTERN_NAME = "the_3_correlated_events_pattern";

    private final FlinkKafkaConsumer010<Event> kafkaSource1;
    private final DeserializationSchema<Event> deserializationSchema1;

    private final FlinkKafkaConsumer010<Event> kafkaSource2;
    private final DeserializationSchema<Event> deserializationSchema2;

    private final FlinkKafkaConsumer010<Event> kafkaSource3;
    private final DeserializationSchema<Event> deserializationSchema3;

    private Main() {

        // Kafka init
        Properties kafkaProperties = new Properties();
        kafkaProperties.setProperty("bootstrap.servers", "localhost:9092");
        kafkaProperties.setProperty("zookeeper.connect", "localhost:2180");
        kafkaProperties.setProperty("group.id", "sct-validation-cgroup1");

        deserializationSchema1 = new SctRequestProcessStep1EventDeserializer();
        kafkaSource1 = new FlinkKafkaConsumer010<>(step1Topic, deserializationSchema1, kafkaProperties);

        deserializationSchema2 = new SctRequestProcessStep2EventDeserializer();
        kafkaSource2 = new FlinkKafkaConsumer010<>(step2Topic, deserializationSchema2, kafkaProperties);

        deserializationSchema3 = new SctRequestProcessStep3EventDeserializer();
        kafkaSource3 = new FlinkKafkaConsumer010<>(step3Topic, deserializationSchema3, kafkaProperties);

        try {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            env.setStreamTimeCharacteristic(TimeCharacteristic.IngestionTime);

            DataStream<Event> s1 = env.addSource(kafkaSource1);
            DataStream<Event> s2 = env.addSource(kafkaSource2);
            DataStream<Event> s3 = env.addSource(kafkaSource3);

            DataStream<Event> unionStream = s1.union(s2, s3);

            Pattern<Event, ?> successPattern = Pattern.<Event>begin(PATTERN_NAME)
                    .times(3)
                    .within(Time.seconds(5));

            PatternStream<Event> matchingStream = CEP.pattern(
                    unionStream.keyBy(new CIDKeySelector()),
                    successPattern);

            matchingStream.select(new MyPatternTimeoutFunction(), new MyPatternSelectFunction())
                    .print()
                    .setParallelism(1);

            env.execute();

        } catch (Exception e) {
            log.error(e.getMessage(), e);
        }
    }

    private static class MyPatternTimeoutFunction implements PatternTimeoutFunction<Event, SctRequestFinalEvent> {

        @Override
        public SctRequestFinalEvent timeout(Map<String, List<Event>> pattern, long timeoutTimestamp) throws Exception {

            List<Event> events = pattern.get(PATTERN_NAME);
            log.info("Timed out events:");
            events.forEach(e -> log.info(e.toString()));

            // Resulting event creation
            SctRequestFinalEvent event = new SctRequestFinalEvent();
            EventHelper.correlate(events.get(0), event);
            EventHelper.injectKey(event);
            event.setState(SctRequestFinalEvent.State.FAILED);

            return event;
        }
    }

    private static class MyPatternSelectFunction
            implements PatternSelectFunction<Event, SctRequestFinalEvent> {

        @Override
        public SctRequestFinalEvent select(Map<String, List<Event>> pattern) throws Exception {

            List<Event> events = pattern.get(PATTERN_NAME);
            log.info("Matching events:");
            events.forEach(e -> log.info(e.toString()));

            // Resulting event creation
            SctRequestFinalEvent event = new SctRequestFinalEvent();
            EventHelper.correlate(events.get(0), event);
            EventHelper.injectKey(event);
            event.setState(SctRequestFinalEvent.State.SUCCESS);

            return event;
        }
    }

    private static class CIDKeySelector implements KeySelector<Event, String> {
        @Override
        public String getKey(Event event) throws Exception {
            return event.getCorrelationId();
        }
    }
}
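
For reference, the Left(...)/Right(...) wrappers in the console output come from the combined select call: in Flink 1.3, PatternStream.select(PatternTimeoutFunction, PatternSelectFunction) returns a DataStream of org.apache.flink.types.Either, where Left carries the timeout results and Right carries the match results. A minimal sketch, assuming the job classes above, of splitting that result stream instead of printing it directly:

            // Requires: import org.apache.flink.types.Either;
            // select(timeoutFn, selectFn) wraps timeouts in Either.Left and full
            // matches in Either.Right, hence the Left(...)/Right(...) prefixes above.
            DataStream<Either<SctRequestFinalEvent, SctRequestFinalEvent>> results =
                    matchingStream.select(new MyPatternTimeoutFunction(), new MyPatternSelectFunction());

            // Full matches (state = SUCCESS)
            DataStream<SctRequestFinalEvent> matched = results
                    .filter(r -> r.isRight())
                    .map(r -> r.right())
                    .returns(SctRequestFinalEvent.class);

            // Timed-out partial matches (state = FAILED)
            DataStream<SctRequestFinalEvent> timedOut = results
                    .filter(r -> r.isLeft())
                    .map(r -> r.left())
                    .returns(SctRequestFinalEvent.class);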
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.cep.CEP;
import org.apache.flink.cep.PatternSelectFunction;
import org.apache.flink.cep.PatternStream;
import org.apache.flink.cep.pattern.Pattern;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.time.Time;

import java.util.Map;

public class FlinkCEP {

    public static void main(String[] args) throws Exception {

        // set up the execution environment
        final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        DataStream<String> text = env.socketTextStream("localhost", 1111)
                .flatMap(new LineTokenizer());

        text.print();

        Pattern<String, String> pattern =
                Pattern.<String>begin("start").where(txt -> txt.equals("a"))
                       .next("middle").where(txt -> txt.equals("b"))
                       .followedBy("end").where(txt -> txt.equals("c")).within(Time.seconds(1));

        PatternStream<String> patternStream = CEP.pattern(text, pattern);

        DataStream<String> alerts = patternStream.select(new PatternSelectFunction<String, String>() {
            @Override
            public String select(Map<String, String> matches) throws Exception {
                return "Found: " +
                        matches.get("start") + "->" +
                        matches.get("middle") + "->" +
                        matches.get("end");
            }
        });

        // emit result
        alerts.print();

        // execute program
        env.execute("WordCount Example");
    }
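
The example above references a LineTokenizer that is not shown. A minimal sketch of what it presumably does (a hypothetical helper, assuming each socket line is split into whitespace-separated tokens):

import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.util.Collector;

// Hypothetical LineTokenizer: splits each line read from the socket into
// whitespace-separated tokens, so typing "a b c" emits three separate events.
public class LineTokenizer implements FlatMapFunction<String, String> {
    @Override
    public void flatMap(String line, Collector<String> out) {
        for (String token : line.split("\\s+")) {
            if (!token.isEmpty()) {
                out.collect(token);
            }
        }
    }
}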
}