
Class import error in Scala/Spark


I'm new to Spark and am using it with Scala. I wrote a simple object that loads fine in the spark-shell using :load test.scala.

import org.apache.spark.ml.feature.StringIndexer
object Collaborative {
  def trainModel() = {
    val data = sc.textFile("/user/PT/data/newfav.csv")
    val df = data.map(_.split(",") match {
      case Array(user, food, fav) => (user, food, fav.toDouble)
    }).toDF("userID", "foodID", "favorite")
    val userIndexer = new StringIndexer().setInputCol("userID").setOutputCol("userIndex")
  }
}
Now I want to put it into a class so that I can pass in parameters. I used the same code in a class:

import org.apache.spark.ml.feature.StringIndexer
class Collaborative {
  def trainModel() = {
    val data = sc.textFile("/user/PT/data/newfav.csv")
    val df = data.map(_.split(",") match {
      case Array(user, food, fav) => (user, food, fav.toDouble)
    }).toDF("userID", "foodID", "favorite")
    val userIndexer = new StringIndexer().setInputCol("userID").setOutputCol("userIndex")
  }
}
This returns the following import errors:

<console>:19: error: value toDF is not a member of org.apache.spark.rdd.RDD[(String, String, Double)]
           val df = data.map(_.split(",") match { case Array(user,food,fav) => (user,food,fav.toDouble) }).toDF("userID","foodID","favorite")

<console>:24: error: not found: type StringIndexer
           val userIndexer = new StringIndexer().setInputCol("userID").setOutputCol("userIndex")

What am I missing here?

Try this; it seems to work fine:

import org.apache.spark.sql.SparkSession
import org.apache.spark.ml.feature.StringIndexer

def trainModel() = {
    val spark = SparkSession.builder().appName("test").master("local").getOrCreate()
    import spark.implicits._  // brings toDF into scope
    val data = spark.read.textFile("/user/PT/data/newfav.csv")
    val df = data.map(_.split(",") match {
      case Array(user, food, fav) => (user, food, fav.toDouble)
    }).toDF("userID", "foodID", "favorite")
    val userIndexer = new StringIndexer().setInputCol("userID").setOutputCol("userIndex")
  }
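The reason the object worked under :load is that in Spark 2.x the spark-shell predefines sc and spark and auto-imports spark.implicits._, which is what puts toDF in scope; inside a standalone class you have to create (or receive) the SparkSession yourself and import its implicits. Since the original goal was to pass parameters into a class, here is a minimal sketch of that version. The class name mirrors the question, but the dataPath parameter and the final fit/transform step are illustrative additions, not from the original post:

import org.apache.spark.sql.SparkSession
import org.apache.spark.ml.feature.StringIndexer

// Hypothetical parameterized version: the CSV path comes in as a constructor parameter.
class Collaborative(dataPath: String) {
  def trainModel() = {
    val spark = SparkSession.builder().appName("test").master("local").getOrCreate()
    import spark.implicits._  // provides toDF for the mapped Dataset
    val data = spark.read.textFile(dataPath)
    val df = data.map(_.split(",") match {
      case Array(user, food, fav) => (user, food, fav.toDouble)
    }).toDF("userID", "foodID", "favorite")
    // Purely for illustration: fit the indexer and apply it to the data.
    new StringIndexer().setInputCol("userID").setOutputCol("userIndex").fit(df).transform(df)
  }
}

// Usage: new Collaborative("/user/PT/data/newfav.csv").trainModel()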
