Problem reading CSV with spark-csv on Apache Spark 1.6.1
I get an error when trying to read a CSV file. I am using Spark 1.6.1, and my code is below:
val reftable_df = sqlContext.read
.format("com.databricks.spark.csv")
.option("header", "true")
.option("inferSchema", "true")
.load("/home/hadoop1/Reference_Currencyctoff.csv")
reftable_df.show()
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/commons/csv/CSVFormat
at com.databricks.spark.csv.package$.<init>(package.scala:27)
at com.databricks.spark.csv.package$.<clinit>(package.scala)
at com.databricks.spark.csv.CsvRelation.inferSchema(CsvRelation.scala:218)
at com.databricks.spark.csv.CsvRelation.<init>(CsvRelation.scala:72)
at com.databricks.spark.csv.DefaultSource.createRelation(DefaultSource.scala:157)
at com.databricks.spark.csv.DefaultSource.createRelation(DefaultSource.scala:44)
at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:158)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:109)
at scb.HBaseBroadcast$.main(HBaseBroadcast.scala:138)
at scb.HBaseBroadcast.main(HBaseBroadcast.scala)
Note: I have tried the following spark-csv dependency versions:
spark-csv 1.3.0
spark-csv 1.3.1
spark-csv 1.4.0
spark-csv 1.5.0
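Since the error is a missing org.apache.commons.csv.CSVFormat class rather than a spark-csv version problem, declaring commons-csv explicitly in the build should help. A minimal build.sbt sketch, assuming sbt is used; the commons-csv version number is an assumption, so check which version your spark-csv release expects:

```scala
// build.sbt fragment (sketch, not a verified build):
// spark-csv needs commons-csv on the classpath at runtime.
libraryDependencies ++= Seq(
  "org.apache.spark"   %% "spark-core"  % "1.6.1" % "provided",
  "org.apache.spark"   %% "spark-sql"   % "1.6.1" % "provided",
  "com.databricks"     %% "spark-csv"   % "1.5.0",
  // Assumed version: pick the one matching your spark-csv release.
  "org.apache.commons"  % "commons-csv" % "1.1"
)
```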
Thanks.

I faced the same issue. Passing both jars:
--jars /path/to/spark-csv.jar,/path/to/commons-csv.jar
solved the problem. commons-csv.jar contains the missing class.
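For a packaged application, the same fix can be applied at submit time. A sketch, assuming the jars have been downloaded locally (the paths and the application jar name are placeholders, and the main class is taken from the stack trace above):

```shell
# Sketch: ship both jars with the job so org/apache/commons/csv/CSVFormat
# is on the driver and executor classpaths. Paths and versions are assumptions;
# point them at the jars you actually downloaded.
spark-submit \
  --class scb.HBaseBroadcast \
  --jars /path/to/spark-csv.jar,/path/to/commons-csv.jar \
  /path/to/your-application.jar
```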
You can verify that the class is there with:

jar -tvf commons-csv.jar | grep CSVFormat
Try this when launching spark-shell:
bin/spark-shell --packages com.databricks:spark-csv_2.10:1.5.0
Including the package this way also resolves its transitive dependencies for you.