reduceByKey after mapValues with a parameterized type doesn't compile (Scala, Apache Spark, type inference)

When I call RDD.mapValues(...).reduceByKey(...), my code does not compile. But when I reverse the order, RDD.reduceByKey(...).mapValues(...), it compiles. The types appear to match.

A complete minimal reproduction is:

def test[E]() =
    new SparkContext().textFile("")
        .keyBy(_ ⇒ 0L)
        .mapValues(_.asInstanceOf[E])
        .reduceByKey((x, _) ⇒ x)

The compile error is the same as the one reported in another question, but the remedy from that question does not help:

Test.scala:7: error: value reduceByKey is not a member of org.apache.spark.rdd.RDD[(Long, E)]
possible cause: maybe a semicolon is missing before `value reduceByKey'?
            .reduceByKey((x, _) ⇒ x)

This looks more like a Scala-level problem than a Spark-level one. Replacing the type parameter with Int works, so type inference may be at fault. I am using Spark 2.2.0 with Scala 2.11.

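Methods such as .reduceByKey and .mapValues are members of PairRDDFunctions, but you can call them on an RDD[(K, V)] because an implicit conversion from RDD[(K, V)] to PairRDDFunctions[K, V] exists. If you look closely at the definition of that conversion, you may spot the problem: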

implicit def rddToPairRDDFunctions[K, V](rdd: RDD[(K, V)])
    (implicit kt: ClassTag[K], vt: ClassTag[V], ord: Ordering[K] = null): PairRDDFunctions[K, V]

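It requires a ClassTag instance for each of the types K and V. In your example, none is available for E, so the implicit conversion cannot be applied and the compiler does not find the reduceByKey method. In the reversed order, both reduceByKey and mapValues are called on an RDD[(Long, String)], whose ClassTags are known, which is why that version compiles. Try this: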

def test[E]()(implicit et: ClassTag[E]) = ...

Or, in shorthand, using a context bound:

def test[E : ClassTag]() = ...
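
To see the mechanism without Spark on the classpath, here is a minimal sketch in plain Scala. Box, PairOps and boxToPairOps are hypothetical stand-ins invented for this illustration (they are not Spark classes); they only mimic how rddToPairRDDFunctions makes its implicit conversion depend on ClassTag instances being available.

```scala
import scala.language.implicitConversions
import scala.reflect.ClassTag

object ClassTagGateDemo {
  // Hypothetical stand-in for RDD[T]; not part of Spark.
  final case class Box[T](items: Seq[T])

  // Hypothetical stand-in for PairRDDFunctions[K, V].
  final class PairOps[K, V](self: Box[(K, V)]) {
    // Groups the pairs by key and folds each group's values with f.
    def reduceByKey(f: (V, V) => V): Box[(K, V)] =
      Box(self.items.groupBy(_._1).map {
        case (k, kvs) => (k, kvs.map(_._2).reduce(f))
      }.toSeq)
  }

  // Shaped like rddToPairRDDFunctions: the conversion is only applicable
  // when implicit ClassTags for both K and V can be found.
  implicit def boxToPairOps[K, V](box: Box[(K, V)])(
      implicit kt: ClassTag[K], vt: ClassTag[V]): PairOps[K, V] =
    new PairOps(box)

  // Does not compile if uncommented: no ClassTag[E] is in scope, so the
  // conversion is skipped and reduceByKey is "not a member" of Box[(Long, E)].
  // def broken[E](box: Box[(Long, E)]) = box.reduceByKey((x, _) => x)

  // Compiles: the context bound puts an implicit ClassTag[E] in scope.
  def fixed[E: ClassTag](box: Box[(Long, E)]): Box[(Long, E)] =
    box.reduceByKey((x, _) => x)
}
```

Calling ClassTagGateDemo.fixed(ClassTagGateDemo.Box(Seq((0L, "a"), (0L, "b")))) works because the compiler can supply ClassTag[String] at the call site, where E is a concrete type. The same thing happens when you replace E with Int in your original code.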