Scala: set difference on RDDs

I am trying to compute the set difference of two RDDs, i.e. a - b:

val a: RDD[A] = getARDD()
val b: RDD[A] = getBRDD()
where A is a Java class with a single String field, annotated with Lombok:

@Data
@AllArgsConstructor
@NoArgsConstructor
public class A implements Serializable, Comparable<A> {
  String field;
}
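Neither of my attempts (a plain subtract, and groupBy followed by subtractByKey) works. I expected grouping by key to do the trick, but it does not. Any idea what is wrong with my code?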


Instead of the difference, I get back the values of the first RDD, i.e. val a.

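What exactly do you mean by "not working"? An exception (post it)? Wrong results (show an example)?
It returns the RDD from val a. I have updated the question.
I changed subtract to subtractByKey, and it works: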
val diff = a.keyBy(_.getField()).subtractByKey(b.keyBy(_.getField())).values
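For anyone who wants to try the keyBy/subtractByKey approach in isolation, here is a minimal sketch that runs in Spark local mode. It rests on a few assumptions that are not in the question: Item is a hypothetical Scala case class standing in for the Lombok-annotated class A, and the object name KeyedDifference, the helper diffByField, the local[2] master and the sample values are made up for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

// Hypothetical stand-in for the Java class A from the question.
// A case class is Serializable and has value-based equals/hashCode.
final case class Item(field: String)

object KeyedDifference {
  // Keeps every element of `a` whose field does not occur as a field in `b`.
  // Both RDDs are turned into (field, element) pairs first, so the comparison
  // is done on the String key rather than on the element itself.
  def diffByField(a: RDD[Item], b: RDD[Item]): RDD[Item] =
    a.keyBy(_.field).subtractByKey(b.keyBy(_.field)).values

  def main(args: Array[String]): Unit = {
    // Assumed local setup, only for trying the sketch out.
    val conf = new SparkConf().setAppName("keyed-difference").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val a = sc.parallelize(Seq(Item("x"), Item("y"), Item("z")))
      val b = sc.parallelize(Seq(Item("y")))
      // Sort after collecting, because RDD ordering is not guaranteed.
      val result = diffByField(a, b).collect().map(_.field).sorted
      println(result.mkString(", ")) // prints "x, z"
    } finally {
      sc.stop()
    }
  }
}
```

Note that subtractByKey is not a strict set operation: if a contains the same element several times and its key is absent from b, all copies are kept. Appending .distinct() is one way to deduplicate, assuming the element type has a suitable equals/hashCode.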
The original attempts from the question, neither of which worked:
val diff = a.subtract(b)
or
val rdd1Grouped = a.groupBy(_.getField)
val rdd2Grouped = b.groupBy(_.getField)
val diff = rdd1Grouped.subtractByKey(rdd2Grouped).values