Scala RDD data processing
This is my code:
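// Sample records; the fields are assumed to be tab-separated in the input file.
val data = Array(
  "eq_len Alice@TF [11.5, 11.8, 12.0, 12.3, 12.56, 12.79, 13.01, 16.85] 639684 16.4 11.565149",
  "eq_len Bob@TY [0.0, 2.4, 4.8, 7.2, 9.6, 12.0, 14.4, 16.8] 604804 48.0 0.0",
  "eq_len Cool@GF [11.4, 12.35, 13.3, 14.25, 15.2, 16.15, 17.1, 18.05] 639677 0.184546 0.003718",
  "eq_len Gop@FF [ 7.6, 8.55, 9.5, 10.45, 13.2, 13.9, 14.6, 15.3] 629981 0.585282 0.000504")
val sc = prepareConfig() // helper defined elsewhere; presumably returns a SparkContext
val baseRDD = sc.parallelize(data) // not used below
val rdds = sc.textFile(filename) // filename is defined elsewhere
// First run of word characters in a field, e.g. "Alice" in "Alice@TF"
val spltflag = "\\w+".r
val sets = rdds.map { line =>
  val splt = line.split("\t")
  // Fall back to the raw field instead of throwing a MatchError when nothing matches
  val id = spltflag.findFirstIn(splt(1)).getOrElse(splt(1))
  (id, splt(2)) // splt(2) is the bracketed list; the brackets are still included
}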
I want this result:
Alice,11.5, 11.8, 12.0, 12.3, 12.56, 12.79, 13.01, 16.85
Bob,0.0, 2.4, 4.8, 7.2, 9.6, 12.0, 14.4, 16.8
Cool,11.4, 12.35, 13.3, 14.25, 15.2, 16.15, 17.1, 18.05
Gop,7.6, 8.55, 9.5, 10.45, 13.2, 13.9, 14.6, 15.3
Thanks.
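So how is this question different from the one you asked before? Please put some effort into your question and read the article on how to ask. This is not a homework platform.
Edit your question instead of adding this (non-)answer. As @eliasah said, read the (quite good) help here on how to ask questions.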
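One possible way to get the output requested in the question is to parse each line with a single regular expression instead of splitting on tabs. The sketch below is only an illustration under assumptions: the object name `ParseEqLen` and the function `parseLine` are hypothetical, and the pattern assumes every record looks like the four sample lines (a leading tag, `name@suffix`, then a bracketed list), with fields separated by tabs or spaces. It uses plain Scala so it can be tried without a cluster.

```scala
object ParseEqLen {
  // Hypothetical pattern: leading tag, then name@suffix, then "[ ... ]".
  // Unanchored so the trailing numeric fields are ignored.
  private val LinePattern = """^\S+\s+(\w+)@\S+\s+\[([^\]]*)\]""".r.unanchored

  // Returns "name,v1, v2, ..." for a well-formed line, None otherwise.
  def parseLine(line: String): Option[String] = line match {
    case LinePattern(name, values) =>
      // Trim each number so "[ 7.6, 8.55" becomes "7.6, 8.55"
      val cleaned = values.split(",").map(_.trim).filter(_.nonEmpty).mkString(", ")
      Some(s"$name,$cleaned")
    case _ => None
  }

  def main(args: Array[String]): Unit = {
    val data = Array(
      "eq_len Alice@TF [11.5, 11.8, 12.0, 12.3, 12.56, 12.79, 13.01, 16.85] 639684 16.4 11.565149",
      "eq_len Gop@FF [ 7.6, 8.55, 9.5, 10.45, 13.2, 13.9, 14.6, 15.3] 629981 0.585282 0.000504")
    // Lines that do not match the pattern are silently dropped
    data.flatMap(line => parseLine(line)).foreach(println)
  }
}
```

With the `rdds` value from the question, the same function could presumably be applied as `rdds.flatMap(line => ParseEqLen.parseLine(line))`, which would also skip malformed lines rather than fail the job.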