Class import error in Scala/Spark
I'm new to Spark and I'm using it with Scala. I wrote a simple object, which loads fine in the spark shell with :load test.scala:
import org.apache.spark.ml.feature.StringIndexer

object Coop {
  def trainModel() = {
    val data = sc.textFile("/user/PT/data/newfav.csv")
    val df = data.map(_.split(",") match {
      case Array(user, food, fav) => (user, food, fav.toDouble)
    }).toDF("userID", "foodID", "favorite")
    val userIndexer = new StringIndexer().setInputCol("userID").setOutputCol("userIndex")
  }
}
Now I want to put it into a class so I can pass parameters. I used the same code in a class:
import org.apache.spark.ml.feature.StringIndexer

class Coop {
  def trainModel() = {
    val data = sc.textFile("/user/PT/data/newfav.csv")
    val df = data.map(_.split(",") match {
      case Array(user, food, fav) => (user, food, fav.toDouble)
    }).toDF("userID", "foodID", "favorite")
    val userIndexer = new StringIndexer().setInputCol("userID").setOutputCol("userIndex")
  }
}
This returns import errors:
<console>:19: error: value toDF is not a member of org.apache.spark.rdd.RDD[(String, String, Double)]
val df = data.map(_.split(",") match { case Array(user,food,fav) => (user,food,fav.toDouble) }).toDF("userID","foodID","favorite")
<console>:24: error: not found: type StringIndexer
val userIndexer = new StringIndexer().setInputCol("userID").setOutputCol("userIndex")
What am I missing here?

Try this; it seems to work fine. In the spark shell, sc is predefined and spark.implicits._ (which provides toDF) is imported for you; inside your own class you have to create a SparkSession and import its implicits yourself:
import org.apache.spark.sql.SparkSession
import org.apache.spark.ml.feature.StringIndexer

def trainModel() = {
  val spark = SparkSession.builder().appName("test").master("local").getOrCreate()
  import spark.implicits._
  val data = spark.read.textFile("/user/PT/data/newfav.csv")
  val df = data.map(_.split(",") match {
    case Array(user, food, fav) => (user, food, fav.toDouble)
  }).toDF("userID", "foodID", "favorite")
  val userIndexer = new StringIndexer().setInputCol("userID").setOutputCol("userIndex")
}
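Since the original goal was to pass parameters into a class, one way to structure this is to hand the class a SparkSession and take the input path as a method parameter. This is only a sketch: the class name Coop, the path parameter, and the trailing fit/transform call are my assumptions, not from the original post.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.ml.feature.StringIndexer

// Hypothetical structure: pass the SparkSession and the input path in
// explicitly instead of relying on the shell's predefined `sc`.
class Coop(spark: SparkSession) {
  // Importing from the constructor parameter brings toDF and the
  // needed encoders into scope for the whole class body.
  import spark.implicits._

  def trainModel(path: String): DataFrame = {
    val data = spark.read.textFile(path)
    val df = data.map(_.split(",") match {
      case Array(user, food, fav) => (user, food, fav.toDouble)
    }).toDF("userID", "foodID", "favorite")
    val userIndexer = new StringIndexer().setInputCol("userID").setOutputCol("userIndex")
    userIndexer.fit(df).transform(df)
  }
}
```

It would then be called as `new Coop(spark).trainModel("/user/PT/data/newfav.csv")` from wherever the SparkSession is created.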