Apache Flink: What is the practical use of KeyedStream#max?
I have a simple Flink application that illustrates the behavior of KeyedStream#max:
// Box is the asker's own case class; its definition is not shown
// (presumably Box(name: String, color: String, size: Int)).
import com.huawei.flink.time.Box
import org.apache.flink.streaming.api.TimeCharacteristic
import org.apache.flink.streaming.api.scala.{StreamExecutionEnvironment, _}

object KeyStreamMaxTest {
  val env = StreamExecutionEnvironment.getExecutionEnvironment

  def main(args: Array[String]): Unit = {
    env.setStreamTimeCharacteristic(TimeCharacteristic.ProcessingTime)
    // Parallelism 1 keeps the printed output in input order.
    env.setParallelism(1)
    env.setMaxParallelism(1)

    val ds = env.fromElements(("X,Red,10"), ("Y,Blue,10"), ("Z,Black, 22"), ("U,Green,22"), ("N,Blue,25"), ("M,Green,23"))

    // Parse each line into a Box, key by color, and keep a rolling max of size.
    val ds2 = ds.map { line =>
      val Array(name, color, size) = line.split(",")
      Box(name.trim, color.trim, size.trim.toInt)
    }.keyBy(_.color).max("size")

    ds2.print()
    env.execute()
  }
}
The output is:
Box(X,Red,10)
Box(Y,Blue,10)
Box(Z,Black,22)
Box(U,Green,22)
Box(Y,Blue,25) -- I thought this should be ("N,Blue,25")
Box(U,Green,23)
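It looks like Flink only replaces the size, while the name and color stay unchanged.
What is the practical use of this behavior? I would have expected the natural result to be the entire record that holds the maximum size.

Sometimes you only need to know the maximum value of a single field for each key. I believe max can provide that information while doing less work than the more commonly useful maxBy, which returns the entire record with the maximum size.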
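
To make the difference concrete, here is a minimal plain-Scala sketch that simulates the two rolling aggregations on the question's data. It is not Flink's implementation and has no Flink dependency. The names Box (re-declared locally with the field layout the question appears to use), rolling, maxLike and maxByLike are hypothetical helpers introduced only for illustration. In the actual job, the equivalent change would be to call .keyBy(_.color).maxBy("size") instead of .max("size").

```scala
// Plain-Scala simulation of per-key rolling aggregates; no Flink required.
// Box is an assumed stand-in for the question's case class.
case class Box(name: String, color: String, size: Int)

object MaxVsMaxBy {
  val input: List[Box] = List(
    Box("X", "Red", 10), Box("Y", "Blue", 10), Box("Z", "Black", 22),
    Box("U", "Green", 22), Box("N", "Blue", 25), Box("M", "Green", 23))

  // Keeps one aggregate per color and emits the updated aggregate
  // for every incoming record, one output per input.
  def rolling(records: List[Box])(combine: (Box, Box) => Box): List[Box] = {
    val (_, out) = records.foldLeft((Map.empty[String, Box], Vector.empty[Box])) {
      case ((state, emitted), box) =>
        val updated = state.get(box.color).map(combine(_, box)).getOrElse(box)
        (state.updated(box.color, updated), emitted :+ updated)
    }
    out.toList
  }

  // max-like: keep the first record seen for the key, update only the size field.
  val maxLike: (Box, Box) => Box =
    (acc, in) => acc.copy(size = math.max(acc.size, in.size))

  // maxBy-like: keep the whole record that holds the largest size so far
  // (the earlier record wins on ties in this sketch).
  val maxByLike: (Box, Box) => Box =
    (acc, in) => if (in.size > acc.size) in else acc

  def main(args: Array[String]): Unit = {
    rolling(input)(maxLike).foreach(println)   // reproduces the output in the question
    println("----")
    rolling(input)(maxByLike).foreach(println) // fifth line becomes Box(N,Blue,25)
  }
}
```

With the max-like combiner, the fifth and sixth results are Box(Y,Blue,25) and Box(U,Green,23), matching the output in the question. With the maxBy-like combiner they are Box(N,Blue,25) and Box(M,Green,23), which is what the asker expected.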