Sorting an array of Int in Scala

scala sorting apache-spark

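I am trying to sort the following array in descending order, but I don't know how to do it. I have tried using .sort and .sortWith, but they don't seem to work on the array.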

val result = postIdCount.withFilter(_._2 > 5).map(_._1.toInt)

result.collect

Array[Int] = Array(41, 974, 662, 9554, 116, 4942, 410, 2269, 5443, 5357, 9435, 2293, 266, 711, 441, 61, 3738, 22, 6318, 8390, 497, 19, 9364, 412, 893, 334, 9000, 678, 313, 253, 979, 842, 4914, 2651, 6547, 6576, 1159, 5224, 1107, 52, 810, 361, 694, 739, 904, 5706, 422, 778, 9818, 758, 130, 265, 6107, 155, 2618, 8941, 8963, 834, 326, 731, 2368, 430, 1253)

Does anyone know how I can do this?
Thanks for your help.

Edit: here is what I have so far. When I try adding:
val result = postIdCount.withFilter(_._2 > 5).map(_._1.toInt).sorted(Ordering[Integer].reverse)

I get an error saying:
error: value sorted is not a member of org.apache.spark.rdd.RDD[Int]

postIdCount.withFilter(_._2 > 5).map(_._1.toInt) gives you an org.apache.spark.rdd.RDD, not an Array.

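Try calling collect first and sorting the result. The collect function returns all the elements of the dataset as an array. Be aware that this gathers all the data onto a single machine (the driver) in the Spark cluster.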
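Once the data is a local Array[Int], the ordinary Scala collection methods apply. The following is a minimal sketch, not the answerer's code: the object and method names are hypothetical, and the sample values are copied from the output in the question to stand in for result.collect.

```scala
object LocalDescendingSort {
  // A few values from the question's output, standing in for result.collect
  val collected: Array[Int] = Array(41, 974, 662, 9554, 116, 4942)

  // sorted takes an Ordering; reversing the default Int ordering gives descending order
  def viaSorted(xs: Array[Int]): Array[Int] = xs.sorted(Ordering[Int].reverse)

  // sortWith takes a "comes before" predicate; _ > _ puts larger values first
  def viaSortWith(xs: Array[Int]): Array[Int] = xs.sortWith(_ > _)

  def main(args: Array[String]): Unit = {
    println(viaSorted(collected).mkString(", "))   // 9554, 4942, 974, 662, 116, 41
    println(viaSortWith(collected).mkString(", ")) // same order as above
  }
}
```

Both methods return a new array and leave the input untouched. Neither exists on an RDD, which is why the call has to come after collect.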
Since postIdCount.withFilter(_._2 > 5).map(_._1.toInt) is an org.apache.spark.rdd.RDD rather than an Array, you can also sort the RDD itself:

val sorted = postIdCount
   .withFilter(_._2 > 5)
   .map(_._1.toInt)
   .sortBy(identity, ascending = false)

This returns a sorted RDD[Int].
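For anyone who wants to try this outside a cluster, here is a rough local-mode sketch. It assumes Spark is on the classpath, uses a small made-up dataset in place of postIdCount (whose real element type is not shown in the question, so the String keys are an assumption), and uses filter instead of withFilter. The object and method names are hypothetical.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object SortRddDescending {
  // Assumption: postIdCount holds (postId, count) pairs with String ids
  def idsDescending(postIdCount: RDD[(String, Int)]): RDD[Int] =
    postIdCount
      .filter(_._2 > 5)                     // keep ids with a count above 5
      .map(_._1.toInt)                      // keep only the id, as an Int
      .sortBy(identity, ascending = false)  // distributed sort, largest first

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("sort-rdd-descending").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Made-up sample data, for illustration only
      val postIdCount = sc.parallelize(Seq(("41", 7), ("974", 9), ("12", 2), ("662", 6)))
      val sorted = idsDescending(postIdCount)
      println(sorted.collect().mkString(", ")) // 974, 662, 41
      println(sorted.take(2).mkString(", "))   // 974, 662
    } finally {
      sc.stop()
    }
  }
}
```

With ascending = false the result comes out largest first. Sorting on the RDD keeps the work distributed, and take(n) brings back only the first n elements instead of the whole dataset.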

That is ascending, not descending.

Possible duplicate.

You need to sort out your code. You are asking about sorting an array, but the error message is clearly not about an array; it is about some obscure type that is not part of Scala. You first need to figure out where this strange type comes from, and why you are using it instead of an array.

@JörgWMittag When I do result.collect, it seems to be an array?

Maybe. But whatever object you are calling sorted on, it is not an array. Again, the error message clearly says it is not.

Hi Dimitry, I tried this but got an error saying: error: value sorted is not a member of org.apache.spark.rdd.RDD[Int]. I have added more details to the question.

It looks like the structure you are using is not an Array but an org.apache.spark.rdd.RDD. So this should be applicable:

I tried this and got the following: error: type mismatch; found: scala.math.Ordering[Integer] required: scala.math.Ordering[Any] Note: Integer

@Archer Use Ordering[Int]
postIdCount.withFilter(_._2 > 5).map(_._1.toInt).collect.sorted(Ordering[Int].reverse)
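The type mismatch reported above comes from mixing two different integer types: the collected array holds Scala Int values, while Integer refers to java.lang.Integer. The sketch below uses hypothetical names and made-up values to illustrate the distinction. The boxed case is included only for contrast and does not appear in the question.

```scala
object OrderingIntExample {
  // Ordering[Int] is the ordering for Scala's Int, the element type of Array[Int]
  val descending: Ordering[Int] = Ordering[Int].reverse

  def sortDescending(xs: Array[Int]): Array[Int] = xs.sorted(descending)

  // Hypothetical case: if the values really are boxed java.lang.Integer,
  // unbox them to Int first and reuse the same ordering
  def sortBoxedDescending(xs: Array[Integer]): Array[Int] =
    xs.map(_.intValue).sorted(descending)

  def main(args: Array[String]): Unit = {
    println(sortDescending(Array(41, 974, 662)).mkString(", ")) // 974, 662, 41
    val boxed: Array[Integer] = Array(Integer.valueOf(5), Integer.valueOf(9))
    println(sortBoxedDescending(boxed).mkString(", "))          // 9, 5
  }
}
```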
