Java: writing a stream of objects as tuples with Apache Flink's writeAsCsv() method

I followed the Apache Flink tutorial to cleanse a stream of taxi ride events. The resulting stream is printed to the console. Now I want to write it to a CSV file instead:
// configure event-time processing
env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
// get the taxi ride data stream
DataStream<TaxiRide> rides = env.addSource(
new TaxiRideSource(path, maxEventDelay, servingSpeedFactor));
DataStream<TaxiRide> filteredRides = rides
// filter out rides that do not start or stop in NYC
.filter(new RideCleansing.NYCFilter());
filteredRides.print();
When I create a data set with dataset1 = filteredRides.writeAsCsv("/resources").setParallelism(1), it causes a compiler error.
What should I do to write the stream of cleansed TaxiRide objects to a CSV file?

DataStream and DataSet are separate APIs that cannot be mixed; hence the compiler error.
The error message "writeAsCsv() method can only be used on data streams of tuples." means that you have to convert your DataStream<TaxiRide> into a DataStream of tuples before you can write it to a CSV file.
This can be done with a simple map function:
DataStream<Tuple9<Long, Boolean, DateTime, DateTime, Float, Float, Float, Float, Short>> rideTuples = filteredRides
    .map(new TupleConverter());
Once you have the DataStream of tuples, rideTuples, you can write it to a CSV file with writeAsCsv(), e.g. rideTuples.writeAsCsv("/resources").setParallelism(1);
The TupleConverter is defined as:
class TupleConverter implements MapFunction<TaxiRide, Tuple9<Long, Boolean, DateTime, DateTime, Float, Float, Float, Float, Short>> {
    @Override
    public Tuple9<Long, Boolean, DateTime, DateTime, Float, Float, Float, Float, Short> map(TaxiRide ride) {
        return Tuple9.of(ride.rideId, ride.isStart, ...);
    }
}
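To make the field-to-column mapping concrete without a Flink runtime, here is a minimal plain-Java sketch. It is not Flink API: the TaxiRide stand-in class and its fields are assumptions based on the tutorial's data type (the DateTime fields are omitted for brevity), and it only illustrates how one ride becomes one comma-separated row, which is what writeAsCsv() produces for each tuple:

```java
import java.util.StringJoiner;

class RideCsvSketch {
    // Simplified stand-in for the tutorial's TaxiRide type (assumed fields,
    // timestamps omitted to keep the sketch dependency-free).
    static class TaxiRide {
        long rideId;
        boolean isStart;
        float startLon, startLat, endLon, endLat;
        short passengerCnt;

        TaxiRide(long rideId, boolean isStart,
                 float startLon, float startLat,
                 float endLon, float endLat, short passengerCnt) {
            this.rideId = rideId;
            this.isStart = isStart;
            this.startLon = startLon;
            this.startLat = startLat;
            this.endLon = endLon;
            this.endLat = endLat;
            this.passengerCnt = passengerCnt;
        }
    }

    // Mirrors what the tuple-converting map function achieves: each field
    // becomes one CSV column, in declaration order.
    static String toCsvLine(TaxiRide ride) {
        StringJoiner row = new StringJoiner(",");
        row.add(Long.toString(ride.rideId));
        row.add(Boolean.toString(ride.isStart));
        row.add(Float.toString(ride.startLon));
        row.add(Float.toString(ride.startLat));
        row.add(Float.toString(ride.endLon));
        row.add(Float.toString(ride.endLat));
        row.add(Short.toString(ride.passengerCnt));
        return row.toString();
    }

    public static void main(String[] args) {
        TaxiRide ride = new TaxiRide(1L, true, -73.5f, 40.75f, -73.25f, 40.5f, (short) 2);
        System.out.println(toCsvLine(ride)); // one CSV row per ride
    }
}
```

In Flink itself you do not write this serialization by hand; converting the POJO to a Tuple type is enough, and writeAsCsv() handles the row formatting.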