Scala: writing an RDD[OmnitureData] to S3
I have an RDD whose elements are objects of a custom class, OmnitureData. Each OmnitureData object holds 1000 data fields. I want to write the data to S3:
data: RDD[OmnitureData]
data.saveAsTextFile(path)
In S3, the data shows up like this:
OmnitureFeedOutputEntry@5655c68b
OmnitureFeedOutputEntry@kgfwe77c
OmnitureFeedOutputEntry@4rjkks8f
OmnitureFeedOutputEntry@57bfgk6d
OmnitureFeedOutputEntry@646lk6sd
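How can I store it so that the actual values of the OmnitureData members are visible?

I found a solution: serialize each object to a JSON string before saving.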
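// Imports assume json4s with the Jackson backend; the original post does not show its imports.
import org.apache.spark.rdd.RDD
import org.json4s.NoTypeHints
import org.json4s.jackson.Serialization
import org.json4s.jackson.Serialization.write

def writeOnS3(data: RDD[OmnitureFeedOutputEntry], path: String): Unit = {
  try {
    val finalData: RDD[String] = data.map { x =>
      // Create Formats inside the closure so it is not captured and shipped from the driver.
      implicit val formats = Serialization.formats(NoTypeHints)
      write(x) // one JSON string per record
    }
    finalData.saveAsTextFile(path)
    logger.info("task=writeOnS3, status=success")
  } catch {
    case e: Exception =>
      logger.error(s"task=writeOnS3, status=failure, error=${e.getMessage}")
  }
}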
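
For anyone wondering why the original output looked like OmnitureFeedOutputEntry@5655c68b: saveAsTextFile writes each element's toString, and a plain class that does not override toString falls back to the default ClassName@hash form. If JSON is more than you need, another option is to map each record to a line yourself. Below is a minimal sketch without Spark, assuming a hypothetical Entry class with three made-up fields (visitorId, pageUrl, hits) standing in for the real 1000-field class; toLine is a hypothetical helper, not part of any library.

```scala
// Hypothetical stand-in for the real class; only three illustrative fields.
class Entry(val visitorId: String, val pageUrl: String, val hits: Int) {
  // Hypothetical helper: render the fields as one tab-separated line.
  def toLine: String = Seq(visitorId, pageUrl, hits).mkString("\t")
}

object ToStringDemo {
  def main(args: Array[String]): Unit = {
    val e = new Entry("v1", "/home", 3)
    // Default toString: class name, '@', then a hash code. This is what ended up in S3.
    println(e.toString)
    // Explicit rendering: the actual field values.
    println(e.toLine)
  }
}
```

With an RDD this would be used as data.map(_.toLine).saveAsTextFile(path). The JSON approach above has the advantage that field names are kept in the output, which probably matters with this many fields.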