Sorting delimited data with Spark in Scala

Tags: scala, apache-spark, scala-2.10

I'm a beginner. Can you tell me what is wrong with the code below?

val rawData="""USA | E001 | ABC DE | 19850607 | IT | $100
UK | E005 | CHAN CL | 19870512 | OP | $200
USA | E003 | XYZ AB | 19890101 | IT | $250
USA | E002 | XYZ AB | 19890705 | IT | $200"""
val sc = ...     
val data= rawData.split("\n")
val rdd= sc.parallelize(data)
val data1=rdd.flatMap(line=> line.split(" | "))
val data2 = data1.map(arr => (arr(2), arr.mkString(""))).sortByKey(false)
data2.saveAsTextFile("./sample_data1_output")
Here, .sortByKey(false) does not work, and the compiler gives me this error message:

[error] /home/admin/scala/spark-poc/src/main/scala/SparkApp.scala:26: value sortByKey is not a member of org.apache.spark.rdd.RDD[(String, String)]
[error] val data2 = data1.map(arr => (arr(2), arr.mkString(""))).sortByKey(false) 

The question is: how do I get a MappedRDD? Or on what object should I call sortByKey()?

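Spark provides additional operations, such as sortByKey(), on RDDs of pairs. These operations are available through a class called PairRDDFunctions (sortByKey() itself is defined in the closely related OrderedRDDFunctions), and Spark uses implicit conversions to wrap the RDD in these classes automatically.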
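To illustrate the mechanism, here is a minimal sketch in plain Scala with no Spark dependency. The names MiniRDD, MiniPairFunctions and MiniImplicits are hypothetical stand-ins invented for this example, not Spark's actual classes. The sketch only aims to show why a method like sortByKey() is invisible until the implicit conversion is in scope.

```scala
import scala.language.implicitConversions

// Hypothetical stand-in for an RDD: just a wrapper around a local Seq.
class MiniRDD[T](val items: Seq[T])

// Hypothetical stand-in for the wrapper class that carries the pair-only methods.
class MiniPairFunctions[K, V](self: MiniRDD[(K, V)])(implicit ord: Ordering[K]) {
  def sortByKey(ascending: Boolean = true): MiniRDD[(K, V)] = {
    val sorted = self.items.sortBy(_._1)
    new MiniRDD(if (ascending) sorted else sorted.reverse)
  }
}

object MiniImplicits {
  // The implicit conversion: it applies only to MiniRDDs whose elements are pairs.
  implicit def toPairFunctions[K: Ordering, V](rdd: MiniRDD[(K, V)]): MiniPairFunctions[K, V] =
    new MiniPairFunctions[K, V](rdd)
}

object ImplicitDemo {
  import MiniImplicits._ // remove this import and the sortByKey call below stops compiling

  val pairs = new MiniRDD(Seq(("b", 2), ("a", 1), ("c", 3)))
  val sortedDesc: Seq[(String, Int)] = pairs.sortByKey(ascending = false).items
}
```

MiniRDD has no sortByKey method of its own. The call compiles only because the compiler finds toPairFunctions in scope and inserts the conversion, which is the same kind of lookup that fails in the question with "value sortByKey is not a member of".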

To import the implicit conversions, add the following lines at the top of your program:

import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._

This is discussed in the relevant section of the Spark Programming Guide.
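Putting this together, the program from the question might look like the sketch below. Several things here are assumptions rather than part of the original: the SparkConf setup with a local[*] master stands in for the elided `val sc = ...`, the helper parseLine is a hypothetical name, and the goal is assumed to be sorting whole records by their third field. Two further changes go beyond the missing import. The sketch uses map instead of flatMap, because flatMap would flatten every record into individual strings. It also splits on an escaped "\\|", because split takes a regular expression in which an unescaped | means alternation, so " | " splits on single spaces.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._ // brings the pair-RDD implicit conversions into scope

object SparkApp {
  // Hypothetical helper: split on a literal '|' and trim the surrounding spaces.
  def parseLine(line: String): Array[String] = line.split("\\|").map(_.trim)

  def main(args: Array[String]): Unit = {
    val rawData = Seq(
      "USA | E001 | ABC DE | 19850607 | IT | $100",
      "UK | E005 | CHAN CL | 19870512 | OP | $200",
      "USA | E003 | XYZ AB | 19890101 | IT | $250",
      "USA | E002 | XYZ AB | 19890705 | IT | $200"
    )

    // Assumption: a local master, used here only for illustration.
    val conf = new SparkConf().setAppName("SparkApp").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // map (not flatMap) keeps one Array[String] per input record.
      val records = sc.parallelize(rawData).map(parseLine)
      // Key each record by its third field, then sort by that key in descending order.
      val sorted = records
        .map(arr => (arr(2), arr.mkString(" | ")))
        .sortByKey(ascending = false)
      sorted.saveAsTextFile("./sample_data1_output")
    } finally {
      sc.stop()
    }
  }
}
```

From Spark 1.3 onward these implicit conversions are found automatically through the RDD companion object, so the SparkContext._ import should only be needed on older releases such as the one in the question.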

Could you paste the full/actual compiler output?

[error] /home/admin/scala/spark-poc/src/main/scala/SparkApp.scala:26: value sortByKey is not a member of org.apache.spark.rdd.RDD[(String, String)] [error] val data2 = data1.map(arr => (arr(2), arr.mkString(""))).sortByKey(false)