Apache Spark: problem adding the spark-csv package in the Cloudera VM

Tags: apache-spark, pyspark, spark-dataframe, cloudera-quickstart-vm

I am testing some pyspark jobs on the Cloudera Quickstart VM. For one task I need to add the spark-csv package. Here is what I did:

PYSPARK_DRIVER_PYTHON=ipython pyspark -- packages com.databricks:spark-csv_2.10:1.3.0
pyspark starts fine, but I do get warnings like the following:

16/02/09 17:41:22 WARN util.Utils: Your hostname, quickstart.cloudera resolves to a loopback address: 127.0.0.1; using 10.0.2.15 instead (on interface eth0)
16/02/09 17:41:22 WARN util.Utils: Set SPARK_LOCAL_IP if you need to bind to another address
16/02/09 17:41:26 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
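(These warnings are harmless for local testing in the Quickstart VM. If you do want to bind to a specific address, the second warning itself points at SPARK_LOCAL_IP; a minimal sketch, assuming the 10.0.2.15 address reported above:

export SPARK_LOCAL_IP=10.0.2.15   # bind Spark to the VM's eth0 address before launching pyspark
)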
Then I ran my code in pyspark:

yelp_df = sqlCtx.load(
    source="com.databricks.spark.csv",
    header='true',
    inferSchema='true',
    path='file:///directory/file.csv')
But I get an error message:

Py4JJavaError: An error occurred while calling o19.load.
: java.lang.RuntimeException: Failed to load class for data source: com.databricks.spark.csv
    at scala.sys.package$.error(package.scala:27)
What might be going wrong? Thanks in advance for your help.

Try this:

PYSPARK_DRIVER_PYTHON=ipython pyspark --packages com.databricks:spark-csv_2.10:1.3.0

The space is the typo: you wrote "-- packages" with a space between "--" and "packages", so spark-submit never sees a --packages option. The spark-csv jar is therefore never downloaded or put on the classpath, which is exactly why the data source class cannot be found. Remove the space and the load should work.
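For completeness, once pyspark is launched with --packages written correctly, the same load call from the question should succeed. A minimal sketch; the file path is the same placeholder as in the question, and the second variant assumes Spark 1.4+, where SQLContext.load was deprecated in favor of the DataFrameReader API:

# Spark 1.3 style, as in the question; works once the spark-csv jar is on the classpath:
yelp_df = sqlCtx.load(
    source="com.databricks.spark.csv",
    header='true',
    inferSchema='true',
    path='file:///directory/file.csv')

# Spark 1.4+ equivalent using the DataFrameReader API:
yelp_df = sqlCtx.read.format("com.databricks.spark.csv") \
    .option("header", "true") \
    .option("inferSchema", "true") \
    .load("file:///directory/file.csv")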