Scala: Why do I get a "Failed to execute user defined function" error when I try to build a regression model in Spark? (Scala, Apache Spark, Apache Spark SQL)


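I have some code that builds a linear regression model in Spark, but fitting the model throws: SparkException: Failed to execute user defined function($anonfun$4: (struct<gen [kW]:double,House overall [kW]:double>) => struct<type:tinyint,size:int,indices:array<int>,values:array<double>>)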

This is how I load my dataset:

val csv = ss.read
  .option("header", "true")
  .option("inferSchema", "true")
  .format("csv")
  .load("/data.csv")
csv.show()

These lines also run without any problem:

  val selectedCols = csv.select(csv("use [kW]").as("label"), $"gen [kW]", $"House overall [kW]")
  val assembler = new VectorAssembler().setInputCols(Array("gen [kW]","House overall [kW]")).setOutputCol("features")
  val output = assembler.transform(selectedCols).select($"label",$"features")
  output.show()
But when execution reaches this part of my code:

  val model = new LinearRegression()
  val lrmodel = model.fit(output)
  val trainingSum = lrmodel.summary
  print(trainingSum.r2)
the following error occurs:

org.apache.spark.SparkException: Failed to execute user defined function($anonfun$4: (struct<gen [kW]:double,House overall [kW]:double>) => struct<type:tinyint,size:int,indices:array<int>,values:array<double>>)
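I think the problem lies in my dataset, but I don't understand why it has anything to do with the regression model or a "user defined function".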
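One possible explanation, offered as an assumption since the question does not confirm it: the function signature in the error maps a struct of the two input columns to a vector struct, which matches the UDF that `VectorAssembler` applies internally, and that UDF can fail on null input values. `output.show()` evaluates only the first rows, while `fit` scans every row, so a null further down the file would only surface at fit time. The sketch below uses a made-up in-memory stand-in for `/data.csv` and a hypothetical local session to count nulls and drop incomplete rows before assembling.

```scala
import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.ml.regression.LinearRegression
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

// Hypothetical local session, named `ss` to match the question.
val ss = SparkSession.builder().master("local[*]").appName("assembler-null-sketch").getOrCreate()
import ss.implicits._

// Made-up stand-in for /data.csv (an assumption, not the real data).
// None becomes a null cell, as an empty CSV field would.
val csv = Seq[(Option[Double], Option[Double], Option[Double])](
  (Some(1.0), Some(0.5), Some(1.4)),
  (Some(2.1), Some(1.0), Some(3.2)),
  (Some(2.9), None,      Some(4.1)),
  (Some(4.2), Some(2.0), Some(6.3)),
  (Some(5.0), Some(2.5), Some(7.4)),
  (Some(6.1), Some(3.1), Some(9.0))
).toDF("use [kW]", "gen [kW]", "House overall [kW]")

val featureCols = Array("gen [kW]", "House overall [kW]")

// Count nulls per feature column; a non-zero count is a candidate cause of the error.
val nullCounts = featureCols.map(c => c -> csv.filter(col(c).isNull).count()).toMap
nullCounts.foreach { case (name, n) => println(s"$name: $n null(s)") }

// Same select as in the question, followed by dropping rows that contain any null.
val selectedCols = csv
  .select(csv("use [kW]").as("label"), $"gen [kW]", $"House overall [kW]")
  .na.drop()

val assembler = new VectorAssembler().setInputCols(featureCols).setOutputCol("features")
val output = assembler.transform(selectedCols).select($"label", $"features")

// With no nulls left, fit can evaluate every row.
val lrmodel = new LinearRegression().fit(output)
println(lrmodel.summary.r2)
```

If the null counts on the real data are all zero, this is not the cause and the full stack trace (the "Caused by" section) is the next place to look. On Spark 2.4 and later, `VectorAssembler` also offers `setHandleInvalid("skip")` as an alternative to `na.drop()`.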

+------------------+------------------+------------------------+---------------------------+------------------+------------------+
|   Avg Area Income|Avg Area House Age|Avg Area Number of Rooms|Avg Area Number of Bedrooms|   Area Population|             Price|
+------------------+------------------+------------------------+---------------------------+------------------+------------------+
| 79545.45857431678| 5.682861321615587|       7.009188142792237|                       4.09|23086.800502686456|1059033.5578701235|
| 79248.64245482568|6.0028998082752425|       6.730821019094919|                       3.09| 40173.07217364482|  1505890.91484695|
|61287.067178656784| 5.865889840310001|       8.512727430375099|                       5.13| 36882.15939970458|1058987.9878760849|
| 63345.24004622798|7.1882360945186425|       5.586728664827653|                       3.26| 34310.24283090706|1260616.8066294468|
| 59982.19722570803| 5.040554523106283|       7.839387785120487|                       4.23|26354.109472103148| 630943.4893385402|