
Scala Spark - overloaded method error when calling createDataFrame


I am trying to create a DataFrame from an array of Double arrays (`Array[Array[Double]]`), like this:

val points : ArrayBuffer[Array[Double]] = ArrayBuffer(
    Array(0.19238990024216676, 1.0, 0.0, 0.0),
    Array(0.2864319929878242, 0.0, 1.0, 0.0),
    Array(0.11160349352921925, 0.0, 2.0, 1.0),
    Array(0.3659220026496052, 2.0, 2.0, 0.0),
    Array(0.31809629470827383, 1.0, 1.0, 1.0))

val x = Array("__1", "__2", "__3", "__4")
val myschema = StructType(x.map(fieldName ⇒ StructField(fieldName, DoubleType, true)))

points.map(e => Row(e(0), e(1), e(2), e(3)))
val newDF = sqlContext.createDataFrame(points, myschema)
But I get this error:

<console>:113: error: overloaded method value createDataFrame with alternatives:
(data: java.util.List[_],beanClass: Class[_])org.apache.spark.sql.DataFrame <and>
(rdd: org.apache.spark.api.java.JavaRDD[_],beanClass: Class[_])org.apache.spark.sql.DataFrame <and>
(rdd: org.apache.spark.rdd.RDD[_],beanClass: Class[_])org.apache.spark.sql.DataFrame <and>
(rows: java.util.List[org.apache.spark.sql.Row],schema: org.apache.spark.sql.types.StructType)org.apache.spark.sql.DataFrame <and>
(rowRDD: org.apache.spark.api.java.JavaRDD[org.apache.spark.sql.Row],schema: org.apache.spark.sql.types.StructType)org.apache.spark.sql.DataFrame <and>
(rowRDD: org.apache.spark.rdd.RDD[org.apache.spark.sql.Row],schema: org.apache.spark.sql.types.StructType)org.apache.spark.sql.DataFrame
cannot be applied to (scala.collection.mutable.ArrayBuffer[Array[Double]], org.apache.spark.sql.types.StructType)
val newDF = sqlContext.createDataFrame(points, myschema)

I've searched the internet but couldn't find how to fix this, so if anyone has any ideas, please help me.

There is no `createDataFrame` overload that accepts an `ArrayBuffer[Array[Double]]`. Also, the result of your `points.map` call isn't assigned to anything; `map` returns a new collection rather than modifying the original in place. Try:

import scala.collection.JavaConverters._

val points : List[Array[Double]] = List(
    Array(0.19238990024216676, 1.0, 0.0, 0.0),
    Array(0.2864319929878242, 0.0, 1.0, 0.0),
    Array(0.11160349352921925, 0.0, 2.0, 1.0),
    Array(0.3659220026496052, 2.0, 2.0, 0.0),
    Array(0.31809629470827383, 1.0, 1.0, 1.0))

val x = Array("__1", "__2", "__3", "__4")
val myschema = StructType(x.map(fieldName ⇒ StructField(fieldName, DoubleType, true)))

// Convert each Array[Double] to a Row, and pass a java.util.List[Row]
// to match the (rows: java.util.List[Row], schema: StructType) overload
val newDF = sqlContext.createDataFrame(
    points.map(Row.fromSeq(_)).asJava, myschema)
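As a side note on why the original `points.map(...)` line had no effect: `map` never mutates the collection it is called on, it builds and returns a new one, so the result must be captured in a `val`. A minimal sketch outside of Spark:

```scala
import scala.collection.mutable.ArrayBuffer

val xs = ArrayBuffer(1, 2, 3)
val doubled = xs.map(_ * 2) // returns a NEW collection

// xs is unchanged; the mapped values live only in `doubled`
println(xs)      // ArrayBuffer(1, 2, 3)
println(doubled) // ArrayBuffer(2, 4, 6)
```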
This works for me:

import org.apache.spark.sql._
import org.apache.spark.sql.types._
import scala.collection.mutable.ArrayBuffer

val sqlContext = new org.apache.spark.sql.SQLContext(sc)

val points : ArrayBuffer[Array[Double]] = ArrayBuffer(
  Array(0.19238990024216676, 1.0, 0.0, 0.0),
  Array(0.2864319929878242, 0.0, 1.0, 0.0),
  Array(0.11160349352921925, 0.0, 2.0, 1.0),
  Array(0.3659220026496052, 2.0, 2.0, 0.0),
  Array(0.31809629470827383, 1.0, 1.0, 1.0))

val x = Array("__1", "__2", "__3", "__4")
val myschema = StructType(x.map(fieldName ⇒ StructField(fieldName, DoubleType, true)))

val rdd = sc.parallelize(points.map(e => Row(e(0), e(1), e(2), e(3))))
val newDF = sqlContext.createDataFrame(rdd, myschema)

newDF.show
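As an alternative sketch (assuming a spark-shell session where `sqlContext.implicits._` can be imported), each row array can be mapped to a tuple and converted with `toDF`; the column names below simply mirror the ones used above:

```scala
import sqlContext.implicits._

// Tuples have an implicit Encoder, so toDF works directly on the local Seq
val newDF2 = points
  .map(e => (e(0), e(1), e(2), e(3)))
  .toDF("__1", "__2", "__3", "__4")

newDF2.show
```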

I'd appreciate an explanation of the downvote. Thanks.