
Scala: implementing a left outer join in Apache Flink using coGroup


I have been trying to join two streams using a CoGroupFunction in Flink.

The two streams are:

S1

val m = env
.addSource(new FlinkKafkaConsumer010[String]("topic-1", schema, props))
.map(gson.fromJson(_, classOf[Master]))
.assignAscendingTimestamps(_.time)
S2

val d = env
.addSource(new FlinkKafkaConsumer010[String]("topic-2", schema, props))
.map(gson.fromJson(_, classOf[Detail]))
.assignAscendingTimestamps(_.time)
My coGroup implementation is:

import scala.collection.JavaConverters._

class MasterDetailOuterJoin extends CoGroupFunction[Master, Detail,
    (Master, Option[Detail])] {

  override def coGroup(
      leftElements : java.lang.Iterable[Master],
      rightElements: java.lang.Iterable[Detail],
      out: Collector[(Master, Option[Detail])]): Unit = {

    // The parameters are Java Iterables; convert them with asScala so
    // they can be used in Scala for comprehensions.
    for (leftElem <- leftElements.asScala) {
      var isMatch = false
      println(leftElem.orderNo)
      for (rightElem <- rightElements.asScala) {
        println(rightElem.orderNo)
        out.collect((leftElem, Some(rightElem)))
        isMatch = true
      }
      // No Detail arrived for this Master in the window: emit (Master, None)
      if (!isMatch) {
        out.collect((leftElem, None))
      }
    }
  }
}
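The loop body itself does implement left-outer-join semantics, which can be checked in plain Scala, independent of Flink. The minimal Master/Detail case classes below are hypothetical stand-ins (only orderNo and one payload field are modeled), not the asker's real classes:

```scala
// Hypothetical stand-ins for the real Master/Detail classes.
case class Master(orderNo: String)
case class Detail(orderNo: String, item: String)

// Same logic as the coGroup body, applied to the elements of one window
// group: every Master is emitted, paired with each matching Detail, or
// with None when no Detail arrived in that window.
def leftOuterJoin(
    masters: Iterable[Master],
    details: Iterable[Detail]): Seq[(Master, Option[Detail])] = {
  val out = scala.collection.mutable.Buffer.empty[(Master, Option[Detail])]
  for (m <- masters) {
    var isMatch = false
    for (d <- details) {
      out += ((m, Some(d)))
      isMatch = true
    }
    if (!isMatch) out += ((m, None))
  }
  out.toSeq
}

val withMatch    = leftOuterJoin(Seq(Master("A")), Seq(Detail("A", "x")))
val withoutMatch = leftOuterJoin(Seq(Master("B")), Nil)
// withMatch    == Seq((Master("A"), Some(Detail("A", "x"))))
// withoutMatch == Seq((Master("B"), None))
```

So the function is not the problem; the question is why it is never invoked.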
However, nothing is printed, even when there is a matching Master and Detail! I am monitoring the Kafka topics with a console consumer and, for what it's worth, they are working fine.

If I use an inner join instead, I do get results:

 m.keyBy(_.orderNo)
    .connect(d.keyBy(_.orderNo))
    .flatMap(new MasterDetailInnerJoin) //RichCoFlatMapFunction
    .map(gson.toJson(_, classOf[(Master, Detail)]))
    .print
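MasterDetailInnerJoin itself is not shown in the question; a minimal sketch of what such a RichCoFlatMapFunction could look like is below, assuming it buffers each side in keyed state until the other side arrives. The state layout and names here are guesses for illustration, not the asker's code:

```scala
import org.apache.flink.api.common.state.{ValueState, ValueStateDescriptor}
import org.apache.flink.configuration.Configuration
import org.apache.flink.streaming.api.functions.co.RichCoFlatMapFunction
import org.apache.flink.util.Collector

class MasterDetailInnerJoin
    extends RichCoFlatMapFunction[Master, Detail, (Master, Detail)] {

  // Keyed state: the function runs after keyBy(_.orderNo), so each state
  // slot holds the Master/Detail seen so far for one orderNo.
  private var masterState: ValueState[Master] = _
  private var detailState: ValueState[Detail] = _

  override def open(parameters: Configuration): Unit = {
    masterState = getRuntimeContext.getState(
      new ValueStateDescriptor("master", classOf[Master]))
    detailState = getRuntimeContext.getState(
      new ValueStateDescriptor("detail", classOf[Detail]))
  }

  override def flatMap1(m: Master, out: Collector[(Master, Detail)]): Unit = {
    val d = detailState.value()
    if (d != null) out.collect((m, d)) else masterState.update(m)
  }

  override def flatMap2(d: Detail, out: Collector[(Master, Detail)]): Unit = {
    val m = masterState.value()
    if (m != null) out.collect((m, d)) else detailState.update(d)
  }
}
```

Note that this connect/flatMap variant does not use windows at all, which is why it works even without watermarks.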

It turns out that what I was missing was:

  • env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)
  • assigning a timestamp and watermark extractor to each stream
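Put together, the fix could look roughly like the sketch below. This is not the asker's exact code: the BoundedOutOfOrdernessTimestampExtractor, the 5-second out-of-orderness bound, and the 10-second window size are assumptions:

```scala
import org.apache.flink.streaming.api.TimeCharacteristic
import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows
import org.apache.flink.streaming.api.windowing.time.Time

// 1. Work on event time, not the default processing time.
env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)

// 2. Assign timestamps AND watermarks to each stream; without watermarks
//    the event-time windows never fire, so coGroup emits nothing.
val mWithWm = m.assignTimestampsAndWatermarks(
  new BoundedOutOfOrdernessTimestampExtractor[Master](Time.seconds(5)) {
    override def extractTimestamp(e: Master): Long = e.time
  })
val dWithWm = d.assignTimestampsAndWatermarks(
  new BoundedOutOfOrdernessTimestampExtractor[Detail](Time.seconds(5)) {
    override def extractTimestamp(e: Detail): Long = e.time
  })

// 3. Windowed coGroup: unlike join, coGroup is also called for keys that
//    exist on only one side, which is what makes the left outer join work.
mWithWm.coGroup(dWithWm)
  .where(_.orderNo)
  .equalTo(_.orderNo)
  .window(TumblingEventTimeWindows.of(Time.seconds(10)))
  .apply(new MasterDetailOuterJoin)
  .print()
```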
