
Scala XGBoost Spark example

I am trying to run the XGBoost Spark example.

At the following step I get a type mismatch:

val model = XGBoost.train(trainData3, paramMap, round=2, nWorkers=2, useExternalMemory=true, missing=0.0f)
error: type mismatch;
 found   : org.apache.spark.rdd.RDD[org.apache.spark.mllib.regression.LabeledPoint]
 required: org.apache.spark.rdd.RDD[org.apache.spark.ml.feature.LabeledPoint]

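It seems that the earlier call val trainData3 = MLUtils.loadLibSVMFile(sc, inputTrainPath) produces mllib.regression.LabeledPoint values rather than the ml.feature.LabeledPoint values that XGBoost expects.

To test my overall setup, I built an RDD of ml.feature.LabeledPoint by hand: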

import org.apache.spark.ml.feature.LabeledPoint
import org.apache.spark.ml.linalg.DenseVector

val trainRDD = sc.parallelize(Seq(
  LabeledPoint(1.0, new DenseVector(Array(2.0, 3.0, 4.0))),
  LabeledPoint(0.0, new DenseVector(Array(5.0, 5.0, 5.0))),
  ...
), 4)
With this as the input to XGBoost.train, everything runs without any problem. trainRDD is an RDD of org.apache.spark.ml.feature.LabeledPoint.

Any idea why I am seeing this type mismatch? Thanks.

Scala 2.10.6, Spark 2.11-2.0.0, XGBoost 0.7,
Mac OS El Capitan

Follow the SparkWithRDD example and convert the data to the ml.feature.LabeledPoint type that XGBoost requires. An example can be found at
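
The conversion described above could look roughly like the following. This is a minimal sketch, not the code from the referenced example: the object name LabeledPointConversion and the helpers toML and loadForXGBoost are hypothetical names introduced here for illustration. It assumes Spark 2.0 or later, where mllib.linalg.Vector exposes a public asML method that returns the corresponding ml.linalg.Vector.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD
import org.apache.spark.mllib.util.MLUtils
// Both packages define a class named LabeledPoint, so alias them to keep them apart.
import org.apache.spark.mllib.regression.{LabeledPoint => MLlibLabeledPoint}
import org.apache.spark.ml.feature.{LabeledPoint => MLLabeledPoint}

// Hypothetical helper object, named here only for illustration.
object LabeledPointConversion {

  // Re-wrap each mllib point as an ml.feature.LabeledPoint.
  // features.asML converts the mllib vector to its ml.linalg counterpart
  // and keeps the dense or sparse representation.
  def toML(points: RDD[MLlibLabeledPoint]): RDD[MLLabeledPoint] =
    points.map(lp => MLLabeledPoint(lp.label, lp.features.asML))

  // Load a LibSVM file (which yields mllib points) and convert it in one step.
  def loadForXGBoost(sc: SparkContext, path: String): RDD[MLLabeledPoint] =
    toML(MLUtils.loadLibSVMFile(sc, path))
}
```

With a helper like this, trainData3 in the question would be built as LabeledPointConversion.loadForXGBoost(sc, inputTrainPath), and the XGBoost.train call would then receive the RDD[org.apache.spark.ml.feature.LabeledPoint] it asks for. The two LabeledPoint classes are unrelated types that happen to share a name, which is why the compiler rejects one where the other is required.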