
Scala: Converting an RDD containing BigInt to a Spark DataFrame


Hi, I'm working with Spark 1.6.3. I have an RDD that contains some Scala BigInt values. How can I convert it to a Spark DataFrame? Is it possible to cast the types before creating the DataFrame?

My RDD:

Array[(BigInt, String, String, BigInt, BigInt, BigInt, BigInt, List[String])] = Array((14183197,Browse,3393626f-98e3-4973-8d38-6b2fb17454b5_27331247X28X6839X1506087469573,80161702,8702170626376335,59,527780275219,List(NavigationLevel, Session)), (14183197,Browse,3393626f-98e3-4973-8d38-6b2fb17454b5_27331247X28X6839X1506087469573,80161356,8702171157207449,72,527780278061,List(StartPlay, Action, Session)))
Printed out, this is:

(14183197,Browse,3393626f-98e3-4973-8d38-6b2fb17454b5_27331247X28X6839X1506087469573,80161356,8702171157207449,72,527780278061,List(StartPlay, Action, Session))
(14183197,Browse,3393626f-98e3-4973-8d38-6b2fb17454b5_27331247X28X6839X1506087469573,80161702,8702170626376335,59,527780275219,List(NavigationLevel, Session))
I have tried creating a schema object:

  val schema = StructType(Array(
    StructField("trackId", LongType, true),
    StructField("location", StringType, true),
    StructField("listId", StringType, true),
    StructField("videoId", LongType, true),
    StructField("id", LongType, true),
    StructField("sequence", LongType, true),
    StructField("time", LongType, true),
    StructField("type", ArrayType(StringType), true)
  ))
If I try

val df = sqlContext.createDataFrame(rdd, schema)

I get this error:

error: overloaded method value createDataFrame with alternatives:
  (data: java.util.List[_],beanClass: Class[_])org.apache.spark.sql.DataFrame <and>
  (rdd: org.apache.spark.api.java.JavaRDD[_],beanClass: Class[_])org.apache.spark.sql.DataFrame <and>
  (rdd: org.apache.spark.rdd.RDD[_],beanClass: Class[_])org.apache.spark.sql.DataFrame <and>
  (rows: java.util.List[org.apache.spark.sql.Row],schema: org.apache.spark.sql.types.StructType)org.apache.spark.sql.DataFrame <and>
  (rowRDD: org.apache.spark.api.java.JavaRDD[org.apache.spark.sql.Row],schema: org.apache.spark.sql.types.StructType)org.apache.spark.sql.DataFrame <and>
  (rowRDD: org.apache.spark.rdd.RDD[org.apache.spark.sql.Row],schema: org.apache.spark.sql.types.StructType)org.apache.spark.sql.DataFrame
 cannot be applied to (org.apache.spark.rdd.RDD[(BigInt, String, String, BigInt, BigInt, BigInt, BigInt, scala.collection.immutable.List[String])], org.apache.spark.sql.types.StructType)

Any help is greatly appreciated.

A schema can only be used together with an RDD[Row]. Here, use reflection instead:

sqlContext.createDataFrame(rdd)

You will also need to change BigInt to (BigDecimal?) or …
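If you want to keep the explicit schema, the createDataFrame(rowRDD, schema) overload requires an RDD[Row], not an RDD of tuples. A minimal sketch (field names taken from the schema in the question; the Spark calls themselves are left as comments since they need a running SparkContext) that converts each tuple's BigInt fields to Long so they match the LongType columns:

```scala
// The record shape from the question's RDD.
type Rec = (BigInt, String, String, BigInt, BigInt, BigInt, BigInt, List[String])

// Convert the BigInt fields to Long so they line up with the
// LongType columns declared in the StructType schema.
def toLongFields(r: Rec): (Long, String, String, Long, Long, Long, Long, List[String]) =
  r match {
    case (trackId, location, listId, videoId, id, sequence, time, types) =>
      (trackId.toLong, location, listId, videoId.toLong, id.toLong,
       sequence.toLong, time.toLong, types)
  }

// With Spark on the classpath this would then look like (untested sketch):
//   import org.apache.spark.sql.Row
//   val rowRDD = rdd.map(r => Row.fromTuple(toLongFields(r)))
//   val df = sqlContext.createDataFrame(rowRDD, schema)
```

Note that .toLong silently truncates values that do not fit in 64 bits; the sample values in the question all fit, but check your data first.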

Thanks for the answer. When I call

sqlContext.createDataFrame(rdd)

I get the error: java.lang.UnsupportedOperationException: Schema for type scala.BigInt is not supported
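That error occurs because Spark's reflection-based schema inference has no mapping for scala.BigInt. A sketch of one workaround (assuming the values should stay arbitrary-precision): wrap each BigInt in a scala.math.BigDecimal, which Spark maps to DecimalType, before calling createDataFrame. The column names passed to toDF below are taken from the schema in the question.

```scala
// scala.BigInt is not supported by Spark's schema inference,
// but scala.math.BigDecimal is (it maps to DecimalType).
def widen(x: BigInt): BigDecimal = BigDecimal(x)

// Pure-Scala check of the conversion on one of the sample values.
val sample: BigInt = BigInt("8702170626376335")
val asDecimal: BigDecimal = widen(sample)

// With Spark available this would look like (untested sketch):
//   val df = sqlContext.createDataFrame(
//     rdd.map { case (a, b, c, d, e, f, g, h) =>
//       (widen(a), b, c, widen(d), widen(e), widen(f), widen(g), h)
//     }
//   ).toDF("trackId", "location", "listId", "videoId", "id", "sequence", "time", "type")
```

If the values are known to fit in 64 bits, converting to Long instead avoids DecimalType overhead.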