Store each partition to a file and load it on the same partition in Scala Spark
I have a situation where I must store the data of each partition in a file and then load the stored data on that same partition. Here is my code.
Base class
case class foo(posVals: Array[Double], velVals: Array[Double], f: Array[Double] => Double,
               fitnessVal: Double, LR1: Double, PR1: Double) extends Serializable {
  var position: Array[Double] = posVals
  var velocity: Array[Double] = velVals
  var fitness: Double = fitnessVal
  var PulseRate: Double = PR1
  var LoudnessRate: Double = LR1
}
Objective function
def sphere (ar : Array[Double]) : Double = ar.reduce((x,y) => x+y*y)
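As a side note, `reduce((x, y) => x + y*y)` leaves the first element unsquared, because `reduce` on an `Array` starts from the first element as the accumulator. If the intent is the usual sphere function (the sum of squares), a minimal sketch could look like the following; the name `sphereSumSq` is hypothetical and not part of the original code.

```scala
// Hypothetical alternative: square every element, then sum.
def sphereSumSq(ar: Array[Double]): Double = ar.map(x => x * x).sum

// Original definition, repeated here only for comparison.
def sphere(ar: Array[Double]): Double = ar.reduce((x, y) => x + y * y)
```

For a two-element input such as `Array(2.0, 3.0)` the two definitions already differ, since only the second one adds the first element as-is.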
Storing and reading data inside each partition
def execute(RDD: RDD[foo], c_itr: Int): Array[(foo, Int)] = {
  val newRDD = RDD.mapPartitionsWithIndex {
    (index, Iterator) => {
      var arr: Array[foo] = Iterator.toArray
      if (c_itr != 0) {
        // Read data from the stored file whose name equals the partition number (index)
        val bufferedSource = Source.fromFile("/result/" + index + ".txt")
        val lines = bufferedSource.getLines()
        val data: Array[BAT1] = lines.flatMap { line =>
          val p = line.split(",")
          Seq(BAT1(p(0).toArray.map(_.toDouble), p(1).toArray.map(_.toDouble), sphere, line(2).toDouble, p(3).toDouble, p(4).toDouble))
        }.toArray
      }
      arr = data.clone() // Replace arr with loaded data from file
      // Save to file
      val writer = new FileWriter(Path + index + ".txt")
      for (i <- 0 until arr.length) {
        writer.write(arr(i).position.toList + "," + arr(i).velocity.toList + "," + arr(i).fitness + "," +
          arr(i).LoudnessRate + "," + arr(i).PulseRate + "\n")
      }
      writer.close()
      val bests: Array[(foo, Int)] = res1.map(x => (x, index))
      bests.toIterator
    }
  }
  newRDD.persist().collect()
}
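The writer above joins fields with commas, while `toList` also prints commas inside `List(...)`, so a reader that splits on `,` cannot tell the two apart. One possible way around this, sketched below without Spark so that it can run on its own, is to separate array elements with spaces and fields with `;`. The `Record` class and the `encode`/`decode` helpers are hypothetical names, not part of the original code; inside `mapPartitionsWithIndex` they would stand in for the string concatenation in the writer and the `split` in the reader.

```scala
// Hypothetical stand-in for the numeric fields of `foo`
// (the function field `f` is not written to the file).
final case class Record(position: Array[Double], velocity: Array[Double],
                        fitness: Double, loudness: Double, pulse: Double)

// Fields are separated by ';', array elements by a single space.
def encode(r: Record): String =
  Seq(r.position.mkString(" "), r.velocity.mkString(" "),
      r.fitness.toString, r.loudness.toString, r.pulse.toString).mkString(";")

// Inverse of encode: split on ';' first, then split the two array fields on whitespace.
def decode(line: String): Record = {
  val p = line.split(";")
  require(p.length == 5, s"expected 5 fields, got ${p.length}")
  def doubles(s: String): Array[Double] =
    s.trim.split("\\s+").filter(_.nonEmpty).map(_.toDouble)
  Record(doubles(p(0)), doubles(p(1)), p(2).toDouble, p(3).toDouble, p(4).toDouble)
}
```

Because neither separator can appear in the text form of a `Double`, `decode(encode(r))` should give back the same numbers that were written.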
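When reading the data back from the file, this code does not read the exact data. I have tried a lot but could not find the problem. How can I correctly read the stored data back into objects?
What are the values of the RDD you pass into execute? I understand that the RDD is of type foo, but I am asking about the values in it. mapPartitionsWithIndex will iterate over each partition present in the RDD. My question is what the data in the RDD is: are you reading it from somewhere, or generating it?
Generated. Please look at the lines after the "Save to file" comment. The writer is writing the data to the file, and that data is then loaded. The file contents look like this: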
List(86.6582767815429, -25.224569272200586, 90.52371028878218, -59.91851894060545, -37.12944037124118),List(-59.60155033146984, -8.927455672466586, -23.679516503590534, 87.58857469881022 ,-14.864361504195127),6.840659702736215E10,0.6012,0.04131580765457621
List(86.6582767815429, -25.224569272200586, 90.52371028878218, -59.91851894060545, -26.10553311409422),List(-66.83980088207335, 51.088426986986015, -109.74073303298485, 66.87095748811572, -22.941448024344268),9.195157603574039E10,0.9025,0.06132589765454988
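Lines in this format can still be parsed if the two `List(...)` groups are matched as units instead of splitting the whole line on commas. In the reader above, `p(0).toArray` turns a `String` into an `Array[Char]`, so `_.toDouble` yields character codes rather than the stored numbers, and `line(2)` is the third character of the line rather than the third field. A minimal sketch using a regular expression follows; `Parsed`, `LinePattern`, `toDoubles` and `parseLine` are hypothetical names, and the pattern assumes exactly the layout shown above.

```scala
import scala.util.Try
import scala.util.matching.Regex

// Hypothetical holder for the five stored values of one line.
final case class Parsed(position: Array[Double], velocity: Array[Double],
                        fitness: Double, loudness: Double, pulse: Double)

// Two "List(...)" groups followed by three comma-separated numbers.
val LinePattern: Regex =
  """List\(([^)]*)\),List\(([^)]*)\),([^,]+),([^,]+),([^,]+)""".r

// Split the inside of one List(...) on commas, tolerating stray spaces.
def toDoubles(s: String): Array[Double] =
  s.split(",").map(_.trim).filter(_.nonEmpty).map(_.toDouble)

// Returns None when the line does not follow the layout or a number fails to parse.
def parseLine(line: String): Option[Parsed] = line.trim match {
  case LinePattern(pos, vel, fit, lr, pr) =>
    Try(Parsed(toDoubles(pos), toDoubles(vel),
               fit.trim.toDouble, lr.trim.toDouble, pr.trim.toDouble)).toOption
  case _ => None
}
```

Since `parseLine` returns an `Option`, a caller can use `lines.flatMap(parseLine)` to skip malformed lines, or check for `None` to report them.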