
Scala KMeans: Failed to execute user defined function ($anonfun$4: …)


I am trying to apply the k-means algorithm.

Code

import org.apache.spark.ml.clustering.KMeans
import org.apache.spark.ml.feature.VectorAssembler

// Join products with items and expose the result as a global temp view.
val dfJoin_products_items = df_products.join(df_items, "product_id")
dfJoin_products_items.createGlobalTempView("products_items")

// Cast the two feature columns to double.
val weightFreight = spark.sql("SELECT cast(product_weight_g as double) weight, cast(freight_value as double) freight FROM global_temp.products_items")
case class Rows(weight: Double, freight: Double)
val rows = weightFreight.as[Rows]

// Assemble the two columns into a "features" vector and fit k-means with k = 4.
val assembler = new VectorAssembler().setInputCols(Array("weight", "freight")).setOutputCol("features")
val data = assembler.transform(rows)
val kmeans = new KMeans().setK(4)
val model = kmeans.fit(data)
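
A note on the code above: VectorAssembler.transform is implemented with an internal UDF, which is the $anonfun$4 in the error below, so the exception is thrown while building the features column, not by k-means itself. A frequent trigger, although nothing in the post confirms it, is null values produced by the two casts (missing or non-numeric source data). A minimal, hypothetical check against weightFreight:

// Hypothetical diagnostic: count rows where either cast produced null.
// VectorAssembler fails with "Failed to execute user defined function"
// when an input column contains null.
import spark.implicits._
val nullCount = weightFreight.filter($"weight".isNull || $"freight".isNull).count()
println(s"rows with a null feature: $nullCount")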


Error

2019-02-03 20:51:41 WARN  BlockManager:66 - Putting block rdd_126_1 failed due to exception org.apache.spark.SparkException: Failed to execute user defined function($anonfun$4: (struct<weight:double,freight:double>) => struct<type:tinyint,size:int,indices:array<int>,values:array<double>>).
2019-02-03 20:51:41 WARN  BlockManager:66 - Block rdd_126_1 could not be removed as it was not found on disk or in memory
2019-02-03 20:51:41 WARN  BlockManager:66 - Putting block rdd_126_2 failed due to exception org.apache.spark.SparkException: Failed to execute user defined function($anonfun$4: (struct<weight:double,freight:double>) => struct<type:tinyint,size:int,indices:array<int>,values:array<double>>).
2019-02-03 20:51:41 ERROR Executor:91 - Exception in task 1.0 in stage 16.0 (TID 23)
org.apache.spark.SparkException: Failed to execute user defined function($anonfun$4: (struct<weight:double,freight:double>) => struct<type:tinyint,size:int,indices:array<int>,values:array<double>>)
I don't understand this error. Can someone explain it to me?
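
For what it is worth, the right-hand side of the message can be decoded: $anonfun$4 is the anonymous function VectorAssembler uses internally, and struct&lt;type:tinyint,size:int,indices:array&lt;int&gt;,values:array&lt;double&gt;&gt; is Spark ML's internal storage layout for a Vector (VectorUDT). So the failure happens while turning (weight, freight) pairs into the features vector. A quick way to see this representation, assuming the data value from the code above:

// The "features" column produced by VectorAssembler is typed as VectorUDT,
// which Spark physically stores as struct<type,size,indices,values>;
// the printed schema shows it simply as "vector".
data.printSchema()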

Thank you very much.

Update 1: full stack trace


The stack trace is very large, so you can find it here:

Possible duplicate. I already tried that, but I still get the same error.
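
If nulls in the cast columns do turn out to be the cause, two common workarounds, sketched here as assumptions since the post does not show the underlying data, are dropping the affected rows before assembling or, on Spark 2.4 and later, telling the assembler to skip them:

// Option 1: drop rows where either feature column is null.
val cleanRows = weightFreight.na.drop(Seq("weight", "freight"))

// Option 2 (Spark 2.4+): skip rows containing null/NaN instead of
// failing inside the assembler's UDF.
val assembler = new VectorAssembler()
  .setInputCols(Array("weight", "freight"))
  .setOutputCol("features")
  .setHandleInvalid("skip")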