
Class import error in Scala/Spark


I'm new to Spark and am using it with Scala. I wrote a simple object that loads fine in the spark-shell using :load test.scala.

import org.apache.spark.ml.feature.StringIndexer
object Collaborative {
  def trainModel() = {
    val data = sc.textFile("/user/PT/data/newfav.csv")
    val df = data.map(_.split(",") match {
      case Array(user, food, fav) => (user, food, fav.toDouble)
    }).toDF("userID", "foodID", "favorite")
    val userIndexer = new StringIndexer().setInputCol("userID").setOutputCol("userIndex")
  }
}
Now I want to put it into a class so that I can pass in parameters. I used the same code in a class:

import org.apache.spark.ml.feature.StringIndexer
class Collaborative {
  def trainModel() = {
    val data = sc.textFile("/user/PT/data/newfav.csv")
    val df = data.map(_.split(",") match {
      case Array(user, food, fav) => (user, food, fav.toDouble)
    }).toDF("userID", "foodID", "favorite")
    val userIndexer = new StringIndexer().setInputCol("userID").setOutputCol("userIndex")
  }
}
This returns the following import errors:

<console>:19: error: value toDF is not a member of org.apache.spark.rdd.RDD[(String, String, Double)]
           val df = data.map(_.split(",") match { case Array(user,food,fav) => (user,food,fav.toDouble) }).toDF("userID","foodID","favorite")

<console>:24: error: not found: type StringIndexer
           val userIndexer = new StringIndexer().setInputCol("userID").setOutputCol("userIndex")

What am I missing here?

Try this; it seems to work fine:

import org.apache.spark.sql.SparkSession
import org.apache.spark.ml.feature.StringIndexer

def trainModel() = {
    val spark = SparkSession.builder().appName("test").master("local").getOrCreate()
    import spark.implicits._  // brings toDF into scope
    val data = spark.read.textFile("/user/PT/data/newfav.csv")
    val df = data.map(_.split(",") match {
      case Array(user, food, fav) => (user, food, fav.toDouble)
    }).toDF("userID", "foodID", "favorite")
    val userIndexer = new StringIndexer().setInputCol("userID").setOutputCol("userIndex")
  }
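The reason the object worked under :load is that in Spark 2.x the spark-shell predefines sc and spark and auto-imports spark.implicits._, which is what puts toDF in scope; inside a standalone class you have to create (or receive) the SparkSession yourself and import its implicits. Since the original goal was to pass parameters into a class, here is a minimal sketch of that version. The class name mirrors the question, but the dataPath parameter and the final fit/transform step are illustrative additions, not from the original post:

import org.apache.spark.sql.SparkSession
import org.apache.spark.ml.feature.StringIndexer

// Hypothetical parameterized version: the CSV path comes in as a constructor parameter.
class Collaborative(dataPath: String) {
  def trainModel() = {
    val spark = SparkSession.builder().appName("test").master("local").getOrCreate()
    import spark.implicits._  // provides toDF for the mapped Dataset
    val data = spark.read.textFile(dataPath)
    val df = data.map(_.split(",") match {
      case Array(user, food, fav) => (user, food, fav.toDouble)
    }).toDF("userID", "foodID", "favorite")
    // Purely for illustration: fit the indexer and apply it to the data.
    new StringIndexer().setInputCol("userID").setOutputCol("userIndex").fit(df).transform(df)
  }
}

// Usage: new Collaborative("/user/PT/data/newfav.csv").trainModel()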
