Scala Apache Flink: adding a timestamp to the file name


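I want to create a BucketingSink in Flink that writes all files into the same folder but names each file with the current timestamp instead of an incrementing counter. For example:

part-0-1514107452000.avro

part-0-1514021052000.avro

Here is my code: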

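import java.text.SimpleDateFormat
import java.util
import org.apache.avro.Schema
import org.apache.avro.Schema.Type
import org.apache.avro.file.DataFileConstants
import org.apache.flink.api.java.tuple.Tuple2
import org.apache.flink.streaming.connectors.fs.{AvroKeyValueSinkWriter, Clock}
import org.apache.flink.streaming.connectors.fs.bucketing.{BucketingSink, DateTimeBucketer}
import org.apache.hadoop.fs.Path

val sink = new BucketingSink[Tuple2[String, MyType]]("/tmp/flink/")
sink.setBucketer(new MyTypeBucketer(new SimpleDateFormat("yyyy-MM-dd--HH")))
sink.setInactiveBucketThreshold(120000) // 2 minutes, in milliseconds
sink.setBatchSize(1024 * 1024 * 64) // 64 MB, in bytes
sink.setPendingSuffix(".avro")

val writer: AvroKeyValueSinkWriter[String, MyType] = new AvroKeyValueSinkWriter[String, MyType](parseAvroSinkProperties())
sink.setWriter(writer.duplicate())

// Builds the writer configuration: key/value schemas plus Snappy compression.
def parseAvroSinkProperties(): util.Map[String, String] = {
  val properties = new util.HashMap[String, String]()
  val stringSchema = Schema.create(Type.STRING)
  // Assumes MyType is an Avro-generated class exposing a static getClassSchema().
  val myTypeSchema = MyType.getClassSchema
  val keySchema = stringSchema.toString
  val valueSchema = myTypeSchema.toString
  val compress = true
  properties.put(AvroKeyValueSinkWriter.CONF_OUTPUT_KEY_SCHEMA, keySchema)
  properties.put(AvroKeyValueSinkWriter.CONF_OUTPUT_VALUE_SCHEMA, valueSchema)
  properties.put(AvroKeyValueSinkWriter.CONF_COMPRESS, compress.toString)
  properties.put(AvroKeyValueSinkWriter.CONF_COMPRESS_CODEC, DataFileConstants.SNAPPY_CODEC)
  properties
}

// Buckets by a property of the element; dateFormatter is accepted but not used here.
class MyTypeBucketer(dateFormatter: SimpleDateFormat) extends DateTimeBucketer[Tuple2[String, MyType]] {
  override def getBucketPath(clock: Clock, basePath: Path, element: Tuple2[String, MyType]): Path = {
    new Path(s"$basePath/${element.f1.getMyStringProp}")
  }
}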
Does anyone know how to do this?
Thanks

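So I take it the question is how to add a suffix to the names of the final output files, right? The BucketingSink API only lets you set suffixes for the InProgress, Pending, and ValidLength states.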
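
To make that limitation concrete, here is a minimal sketch of the naming-related setters that BucketingSink exposes, assuming the flink-connector-filesystem dependency is on the classpath. The object name SinkNamingSketch, the method configure, and the String element type are hypothetical stand-ins for the question's Tuple2[String, MyType] setup, and the values passed are only illustrative.

```scala
import org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink

// Hypothetical wrapper, used only to keep the sketch self-contained.
object SinkNamingSketch {

  // String stands in for the question's Tuple2[String, MyType].
  def configure(basePath: String): BucketingSink[String] = {
    val sink = new BucketingSink[String](basePath)

    // Prefix shared by all part files. The sink itself appends the
    // subtask index and a part counter after this prefix.
    sink.setPartPrefix("part")

    // These markers are applied only while a file is in an intermediate state.
    sink.setInProgressPrefix("_")
    sink.setInProgressSuffix(".in-progress")
    sink.setPendingPrefix("_")
    sink.setPendingSuffix(".pending")

    // Marker for the companion file that records the valid length of a part file.
    sink.setValidLengthPrefix("_")
    sink.setValidLengthSuffix(".valid-length")

    sink
  }
}
```

None of these calls replaces the counter in the final file name with a timestamp, which is why the naming asked for in the question cannot be reached through these setters alone.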