Scala: Spark AccumulatorParam with a generic type parameter

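I'm having a problem using accumulators in Spark. As shown on the Spark website, if you want a custom accumulator you simply extend the AccumulatorParam trait (with an object). The problem is that I want to make that object generic but cannot, for example: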

object SeqAccumulatorParam[B] extends AccumulatorParam[Seq[B]] {

    override def zero(initialValue: Seq[B]): Seq[B] = Seq[B]()

    override def addInPlace(s1: Seq[B], s2: Seq[B]): Seq[B] = s1 ++ s2

}
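But this gives me a compile error, because an object cannot take type parameters. My situation doesn't allow me to define a SeqAccumulatorParam for each given type, as that would lead to a lot of ugly code duplication.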

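I have an alternative approach: put all of the results into an RDD and later iterate over them with an accumulator defined for that single type. A generic accumulator would be much nicer, though.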

My question is: is there another way to create the accumulator?

You can simply use a class instead of a singleton object, and create an instance of it for each element type:

import org.apache.spark.AccumulatorParam

// A class, unlike an object, can take a type parameter.
class SeqAccumulatorParam[B] extends AccumulatorParam[Seq[B]] {
    override def zero(initialValue: Seq[B]): Seq[B] = Seq[B]()
    override def addInPlace(s1: Seq[B], s2: Seq[B]): Seq[B] = s1 ++ s2
}

// sc is the SparkContext (predefined in spark-shell).
val seqAccum = sc.accumulator(Seq[Int]())(new SeqAccumulatorParam[Int]())

val lists = (1 to 5).map(x => (0 to x).toList)
sc.parallelize(lists).foreach(x => seqAccum += x)

seqAccum.value
// Seq[Int] = List(0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 0, 1, 2, 0, 1)
// The order of the elements may differ between runs.

// The same class works for Doubles.
val seqAccumD = sc.accumulator(Seq[Double]())(new SeqAccumulatorParam[Double]())
sc.parallelize(lists.map(x => x.map(_.toDouble))).foreach(x => seqAccumD += x)

seqAccumD.value
// Seq[Double] = List(0.0, 1.0, 0.0, 1.0, 2.0, 0.0, 1.0, 2.0, 3.0, 0.0, 1.0, 2.0, 3.0, 4.0, 0.0, 1.0, 2.0, 3.0, 4.0, 5.0)
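
Since sc.accumulator takes its AccumulatorParam as an implicit argument, one possible refinement is to supply the generic class through a single implicit def, so that no instance has to be passed by hand. The following is only a minimal sketch that runs without Spark: Param, SeqParam, and combine are hypothetical stand-ins invented for illustration (Param mimics the shape of AccumulatorParam, combine mimics a method with an implicit param argument). The implicit lives in the stand-in trait's companion here; with Spark's own trait it would presumably have to be imported into scope instead.

```scala
object GenericParamSketch {
  // Hypothetical, simplified stand-in for Spark's AccumulatorParam trait.
  trait Param[T] {
    def zero(initialValue: T): T
    def addInPlace(a: T, b: T): T
  }

  object Param {
    // One implicit def serves every element type B.
    implicit def seqParam[B]: Param[Seq[B]] = new SeqParam[B]
  }

  // A class can be generic where a singleton object cannot.
  class SeqParam[B] extends Param[Seq[B]] {
    override def zero(initialValue: Seq[B]): Seq[B] = Seq.empty[B]
    override def addInPlace(s1: Seq[B], s2: Seq[B]): Seq[B] = s1 ++ s2
  }

  // Hypothetical helper: folds the parts together using the implicit param.
  def combine[T](initial: T, parts: Seq[T])(implicit p: Param[T]): T =
    parts.foldLeft(p.zero(initial))(p.addInPlace)

  def main(args: Array[String]): Unit = {
    // The compiler picks seqParam[Int] and seqParam[String] automatically.
    println(combine(Seq.empty[Int], Seq(Seq(0, 1), Seq(0, 1, 2))))
    println(combine(Seq.empty[String], Seq(Seq("a"), Seq("b", "c"))))
  }
}
```

The design choice is the same as in the answer above (a generic class rather than an object); the implicit def only removes the repeated new SeqAccumulatorParam[...]() at each call site.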