Apache Flink: how do I broadcast CEP patterns and iterate over multiple patterns in CEP?
I am trying to apply a list of patterns to a PatternStream in CEP, using the code attached below. I am new to Flink, and I am not sure whether this is a foolish approach or the right one. In particular, I am not sure whether I can use a DataStream inside the process function of a KeyedBroadcastProcessFunction.
public static final MapStateDescriptor<String, String> patternDescriptor =
        new MapStateDescriptor<String, String>(
                "CEPPatternList", BasicTypeInfo.STRING_TYPE_INFO, BasicTypeInfo.STRING_TYPE_INFO);

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);

DataStream<JSONObject> source =
        env.fromCollection(Arrays.asList(event, event2, event3)); // KafkaSource

DataStream<Tuple2<String, JSONObject>> eventDataStream =
        source.map(new JSONObjectToTuple2());
DataStream<Tuple2<String, JSONObject>> eventStream = eventDataStream.keyBy(0);

// Wrap the pattern string in a (name, pattern) tuple so it can be broadcast.
DataStream<Tuple2<String, String>> patternStream =
        env.fromElements(pattern).flatMap(new FlatMapFunction<String, Tuple2<String, String>>() {
            @Override
            public void flatMap(String value, Collector<Tuple2<String, String>> out) throws Exception {
                out.collect(new Tuple2<>("PATTERN_1", value));
            }
        });

BroadcastStream<Tuple2<String, String>> broadcastStream =
        patternStream.broadcast(patternDescriptor);

DataStream<Tuple2<String, JSONObject>> output =
        eventStream.connect(broadcastStream).process(
                new KeyedBroadcastProcessFunction<String, Tuple2<String, JSONObject>,
                        Tuple2<String, String>, Tuple2<String, JSONObject>>() {
            @Override
            public void processElement(Tuple2<String, JSONObject> value, ReadOnlyContext ctx,
                    Collector<Tuple2<String, JSONObject>> out) throws Exception {
                for (Map.Entry<String, String> patterns :
                        ctx.getBroadcastState(patternDescriptor).immutableEntries()) {
                    String patternValue = patterns.getValue();
                    // env is captured by this function here; this is the reference
                    // that the NotSerializableException in the stack trace complains about.
                    DataStream<Tuple2<String, JSONObject>> eventStream =
                            env.fromElements(value);
                    PatternStream<Tuple2<String, JSONObject>> patternStream =
                            cepPatternMatching.compile(patternValue, eventStream);
                    OutputTag<Tuple2<String, JSONObject>> timedout =
                            new OutputTag<Tuple2<String, JSONObject>>("timedout") {
                            };
                    SingleOutputStreamOperator<Tuple2<String, JSONObject>> result =
                            patternStream.flatSelect(
                                    timedout,
                                    new EventTimeOut(),
                                    new PatternFlatSelect()
                            );
                    result.flatMap(new FlatMapFunction<Tuple2<String, JSONObject>, Object>() {
                        @Override
                        public void flatMap(Tuple2<String, JSONObject> value, Collector<Object> out) throws Exception {
                            out.collect(value);
                        }
                    });
                    ctx.output(unMatched, value);
                }
            }

            @Override
            public void processBroadcastElement(Tuple2<String, String> value, Context ctx,
                    Collector<Tuple2<String, JSONObject>> out) throws Exception {
                System.out.println("Pattern Name: " + value.f0);
                System.out.println("Pattern Condition: " + value.f1);
                // Store the pattern under its name so processElement can iterate over all patterns.
                ctx.getBroadcastState(patternDescriptor).put(value.f0, value.f1);
            }
        });
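I am not doing any serialization anywhere in my code, so I am not sure why I get the error shown in the stack trace below. I even tried implementing the Serializable interface, but the error persists.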
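Is there a way to broadcast CEP patterns and iterate over them one by one, applying each to the data stream?

Can you share the full stack trace? In general, Flink uses serialization to ship user code to the TaskManagers, so your user code should not contain references to outer classes.

@ArvidHeise I have attached the error stack trace to the question.

Oh jeez. I just saw that you are using DataStream inside your user code. That won't work at all. I am also not sure whether CEP can work with patterns supplied at runtime. I suspect you will need to fall back to DataStream and implement it yourself in a new environment. Note that if you know the patterns when the application starts, you can use CEP.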
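To make the "fall back to DataStream and implement it yourself" suggestion more concrete, here is a minimal, Flink-free sketch of the shape such logic could take inside a KeyedBroadcastProcessFunction. Everything in it is an assumption made for illustration: the class name `DynamicRuleMatcher`, the rule syntax (a comma-separated sequence of event types), and the two in-memory maps, which only stand in for Flink's broadcast state and keyed state. It handles simple "A followed by B" sequences and nothing else (no time windows, no conditions), so it is far from a replacement for CEP.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class DynamicRuleMatcher {
    // Stand-in for broadcast state: rule name -> ordered event types to match.
    private final Map<String, String[]> rules = new LinkedHashMap<>();
    // Stand-in for keyed state: "key|ruleName" -> number of steps matched so far.
    private final Map<String, Integer> progress = new HashMap<>();

    /** Roughly what processBroadcastElement would do: register or replace a rule. */
    void addRule(String name, String sequence) {
        rules.put(name, sequence.split(","));
        // Drop partial matches of an older version of this rule.
        progress.keySet().removeIf(k -> k.endsWith("|" + name));
    }

    /** Roughly what processElement would do: returns the rules completed by this event. */
    List<String> onEvent(String key, String eventType) {
        List<String> completed = new ArrayList<>();
        for (Map.Entry<String, String[]> rule : rules.entrySet()) {
            String[] steps = rule.getValue();
            String stateKey = key + "|" + rule.getKey();
            int matched = progress.getOrDefault(stateKey, 0);
            // Advance only when the event is the next expected step;
            // unrelated events in between are ignored.
            if (steps[matched].equals(eventType)) {
                matched++;
            }
            if (matched == steps.length) {
                completed.add(rule.getKey());
                matched = 0; // start over for the next occurrence
            }
            progress.put(stateKey, matched);
        }
        return completed;
    }
}
```

The point of the sketch is that each event is checked against every broadcast rule directly, with per-key progress kept in state, and no new DataStream or environment is created inside the function. In a real job the two maps would presumably be replaced by the broadcast state from the question's `patternDescriptor` and by a keyed `MapState`.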
Exception in thread "main" org.apache.flink.api.common.InvalidProgramException: The implementation of the StreamExecutionEnvironment is not serializable. The object probably contains or references non serializable fields.
at org.apache.flink.api.java.ClosureCleaner.clean(ClosureCleaner.java:151)
at org.apache.flink.api.java.ClosureCleaner.clean(ClosureCleaner.java:126)
at org.apache.flink.api.java.ClosureCleaner.clean(ClosureCleaner.java:71)
at org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.clean(StreamExecutionEnvironment.java:1821)
at org.apache.flink.streaming.api.datastream.BroadcastConnectedStream.clean(BroadcastConnectedStream.java:249)
at org.apache.flink.streaming.api.datastream.BroadcastConnectedStream.process(BroadcastConnectedStream.java:162)
at org.apache.flink.streaming.api.datastream.BroadcastConnectedStream.process(BroadcastConnectedStream.java:139)
at CEPPatternMatchingApp.main(CEPPatternMatchingApp.java:122)
Caused by: java.io.NotSerializableException: org.apache.flink.streaming.api.environment.LocalStreamEnvironment
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1184)
at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
at org.apache.flink.util.InstantiationUtil.serializeObject(InstantiationUtil.java:586)
at org.apache.flink.api.java.ClosureCleaner.clean(ClosureCleaner.java:133)
... 7 more
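The stack trace above can be reproduced without Flink. The sketch below uses plain Java serialization to show why a function object that captures a non-serializable value fails, even though the code never serializes anything explicitly. `FakeEnvironment` and `UserFunction` are hypothetical stand-ins for `StreamExecutionEnvironment` and a Flink user function; they are not Flink classes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class ClosureSerializationDemo {

    /** Hypothetical stand-in for a non-serializable object such as the environment. */
    static class FakeEnvironment { }

    /** Hypothetical stand-in for a user function, which has to be Serializable. */
    interface UserFunction extends Serializable {
        String apply(String value);
    }

    /** Returns true if Java serialization of the object succeeds. */
    static boolean isSerializable(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    /** Captures env, so the anonymous class keeps a hidden field pointing at it. */
    static UserFunction capturingFunction() {
        FakeEnvironment env = new FakeEnvironment();
        return new UserFunction() {
            @Override
            public String apply(String value) {
                return value + env.hashCode();
            }
        };
    }

    /** Captures nothing, so only the function object itself is serialized. */
    static UserFunction cleanFunction() {
        return new UserFunction() {
            @Override
            public String apply(String value) {
                return value.toUpperCase();
            }
        };
    }

    public static void main(String[] args) {
        System.out.println("capturing: " + isSerializable(capturingFunction()));
        System.out.println("clean:     " + isSerializable(cleanFunction()));
    }
}
```

Making the function class itself implement Serializable does not help in the capturing case, because the captured object is still part of what gets written. That matches the question: the KeyedBroadcastProcessFunction references `env` inside `processElement`, so the environment is pulled into the closure.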