Spark sortByKey with a case class as the key


I am calling Spark's sortByKey on an RDD whose key type is a case class:

rdd.filter(line => !StringUtils.isEmpty(line))  // drop null or empty lines
  .map(line => {
    val array = line.split(",")
    (OrderedKey(array(0), array(1)), array(2))
  })
  .repartition(1)
  .sortByKey(true)
  .foreach(println(_))
case class OrderedKey(k1: String, k2: String)


But the result is not what I expected. Why?

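You need to provide an ordering under which instances of the case class can be compared. The sortByKey() transformation will then use this ordering to sort the OrderedKey keys.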

Here is an example that sorts by the case class parameters, in the order they are declared:

case class OrderedKey(k1: String, k2: String) extends Ordered[OrderedKey] {
  import scala.math.Ordered.orderingToOrdered
  def compare(that: OrderedKey): Int = (this.k1, this.k2) compare (that.k1, that.k2)
}
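
To see the resulting order without a cluster, the sketch below sorts a few rows with plain Scala collections. It is only a local stand-in for sortByKey(true): the sample rows and the name OrderedKeyDemo are hypothetical, and the sort picks up the Ordering that Scala derives from the Ordered mixin.

```scala
// Same definition as in the answer above.
case class OrderedKey(k1: String, k2: String) extends Ordered[OrderedKey] {
  import scala.math.Ordered.orderingToOrdered
  def compare(that: OrderedKey): Int = (this.k1, this.k2) compare (that.k1, that.k2)
}

object OrderedKeyDemo {
  // Hypothetical sample rows in the "k1,k2,value" layout used in the question.
  val lines = Seq("b,1,x", "a,2,y", "a,1,z")

  val pairs = lines.map { line =>
    val array = line.split(",")
    (OrderedKey(array(0), array(1)), array(2))
  }

  // Local stand-in for sortByKey(true): ascending sort on the key,
  // comparing k1 first and k2 only when k1 is equal.
  val sorted = pairs.sortBy(_._1)

  def main(args: Array[String]): Unit = sorted.foreach(println)
}
```

With these rows the keys come out as (a,1), (a,2), (b,1).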

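With repartition(1) set, the result is correct. However, when I define the following implicit instead, the output is not sorted:

implicit val sortOrderKey = new Ordering[OrderedKey] {
  override def compare(x: OrderedKey, y: OrderedKey) = {
    (x.k1, x.k2) compare (x.k1, y.k2)
  }
}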
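
One thing worth checking in the comparison quoted in the comment above: its right-hand tuple starts with x.k1 rather than y.k1, so the first field is compared with itself and never affects the result. A minimal sketch of an implicit Ordering that uses both fields of both keys follows. It assumes the plain case class from the question (without the Ordered mixin), the wrapper object ImplicitOrderingSketch and the sample keys are hypothetical, and the sort is again a local stand-in, since sortByKey resolves an implicit Ordering for the key type at the call site.

```scala
object ImplicitOrderingSketch {
  // Plain case class as in the question (no Ordered mixin).
  case class OrderedKey(k1: String, k2: String)

  // Order keys by the tuple (k1, k2): k1 decides first, k2 breaks ties.
  implicit val sortOrderKey: Ordering[OrderedKey] =
    Ordering.by((k: OrderedKey) => (k.k1, k.k2))

  // Hypothetical keys, sorted locally with the implicit Ordering above.
  val keys = Seq(OrderedKey("b", "1"), OrderedKey("a", "2"), OrderedKey("a", "1"))
  val sorted = keys.sorted

  def main(args: Array[String]): Unit = sorted.foreach(println)
}
```

Using Ordering.by avoids writing the two tuples by hand, which is where the x/y mix-up tends to creep in.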