Sorting — Spark repartitionAndSortWithinPartitions with tuples


I am trying to partition HBase rows following this example:

However, my data is already stored as `(String, String, String)`, where the first element is the row key, the second is the column name, and the third is the column value.

I tried writing an implicit ordering to satisfy the OrderedRDD implicit:

 implicit val caseInsensitiveOrdering: Ordering[(String, String, String)] = new Ordering[(String, String, String)] {
    override def compare(x: (String, String, String), y: (String, String, String)): Int = ???
  }

But repartitionAndSortWithinPartitions is still not available on the RDD. Is there any way to use this method with this tuple?

The RDD must have keys and values, not just values — repartitionAndSortWithinPartitions is only defined on pair RDDs (`RDD[(K, V)]`). For example:

val data = List((("5", "6", "1"), 1))
val rdd: RDD[((String, String, String), Int)] = sparkContext.parallelize(data)
// The comparison must be consistent (a constant result violates the
// Ordering contract); delegate to the default tuple ordering.
implicit val caseInsensitiveOrdering: Ordering[(String, String, String)] =
  Ordering.Tuple3(Ordering.String, Ordering.String, Ordering.String)
rdd.repartitionAndSortWithinPartitions(..) // takes a Partitioner
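Putting it together, a minimal runnable sketch might look like the following (the local SparkContext, the sample triples, and the use of `HashPartitioner` with two partitions are illustrative assumptions, not from the original answer):

```scala
import org.apache.spark.{HashPartitioner, SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object RepartitionSortExample {
  def main(args: Array[String]): Unit = {
    // Illustrative local context; in a real job this comes from your deployment.
    val sc = new SparkContext(new SparkConf().setAppName("example").setMaster("local[2]"))

    // Key the (rowKey, columnName, columnValue) triples so the RDD becomes a
    // pair RDD; repartitionAndSortWithinPartitions only exists on RDD[(K, V)].
    val triples = Seq(("r2", "cf:a", "v1"), ("r1", "cf:b", "v2"), ("r1", "cf:a", "v3"))
    val keyed: RDD[((String, String, String), Unit)] =
      sc.parallelize(triples).map(t => (t, ()))

    // Scala already provides an implicit Ordering for tuples of ordered
    // elements, and it drives the per-partition sort here.
    val sorted = keyed.repartitionAndSortWithinPartitions(new HashPartitioner(2))

    sorted.keys.collect().foreach(println)
    sc.stop()
  }
}
```

With a `Unit` value the pair RDD carries no payload beyond the key; if the column value should be the payload instead, key by `(rowKey, columnName)` and keep the value as the RDD's value side.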
