Apache Spark: How can we sort by two different fields in Spark Core?
I am doing some basic programming with Spark.
Input file:
2008,20
2008,40
2000,10
2000,30
2001,9
My Spark code:
scala> val dataRDD = sc.textFile("/user/cloudera/inputfiles/year.txt")
scala> val mapRDD = dataRDD.map(elem => elem.split(","))
scala> val keyValueRDD = mapRDD.map( elem => (elem(0),elem(1)))
scala> val sortRDD = keyValueRDD.sortByKey(true,1)
res29: Array[(String, String)] = Array((2000,30), (2000,10), (2001,9), (2008,20), (2008,40))
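I want the output sorted by year in ascending order, with the values within each year sorted in descending order.
Expected output: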
2000,30
2000,10
2001,9
2008,40
2008,20
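Can anyone help me get this result?

You have to define a class that holds both the year and the value for that year. The class should extend Ordered and override the compare method. Then use objects of this class as the key and apply sortByKey.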
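// A case class can be built without `new` and is Serializable,
// which Spark needs in order to shuffle the keys.
case class TwoKeys(first: Int, second: Int) extends Ordered[TwoKeys] {
  // Year ascending; within the same year, value descending.
  def compare(that: TwoKeys): Int = {
    if (first == that.first) {
      that.second - second
    } else {
      first - that.first
    }
  }
}
...
// The split fields are Strings, so convert them to Int before building the key.
val keyValueRDD = mapRDD.map { elem =>
  val key = TwoKeys(elem(0).toInt, elem(1).toInt)
  (key, key)
}
val sortRDD = keyValueRDD.sortByKey(true, 1)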
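Yes, it works, but I would like an explanation of the compare method. What is that.second - second? Are we subtracting? Please explain the logic you wrote in the compare method.

compare returns a negative number, zero, or a positive number, which mean less than, equal to, and greater than, respectively. Only the sign matters, so the subtraction works: first - that.first is negative when this year is smaller (ascending), and that.second - second is negative when this value is larger (descending).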
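
To make the sign logic concrete, here is a minimal sketch that runs the same comparison on a local collection instead of an RDD, so it needs no Spark installation. The object name CompareDemo and the local Seq are assumptions made for illustration only; the TwoKeys class mirrors the one in the answer.

```scala
// Same ordering as the answer: year ascending, value descending within a year.
case class TwoKeys(first: Int, second: Int) extends Ordered[TwoKeys] {
  def compare(that: TwoKeys): Int =
    if (first == that.first) that.second - second // larger value sorts first
    else first - that.first                       // smaller year sorts first
}

// Hypothetical demo object; not part of the original answer.
object CompareDemo {
  // The sample input from the question, held in a local collection.
  val input: Seq[TwoKeys] = Seq(
    TwoKeys(2008, 20), TwoKeys(2008, 40), TwoKeys(2000, 10),
    TwoKeys(2000, 30), TwoKeys(2001, 9))

  // `sorted` uses the ordering defined by compare.
  val sorted: Seq[TwoKeys] = input.sorted

  // Equivalent without a custom class: sort by (year, negated value).
  val viaTuple: Seq[TwoKeys] = input.sortBy(k => (k.first, -k.second))

  def main(args: Array[String]): Unit = {
    // 10 - 30 = -20: negative, so (2000,30) sorts before (2000,10).
    println(TwoKeys(2000, 30).compare(TwoKeys(2000, 10)))
    // 2000 - 2001 = -1: negative, so 2000 sorts before 2001.
    println(TwoKeys(2000, 10).compare(TwoKeys(2001, 9)))
    sorted.foreach(k => println(s"${k.first},${k.second}"))
  }
}
```

Subtraction is fine for small numbers such as years, but it can overflow for values near the Int limits; Integer.compare(a, b) gives the same sign without that risk. The tuple key in viaTuple is another way to express the same order without writing a custom class.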