IllegalArgumentException: u'Wrong FS: file://spark-warehouse, expected: file:///'

Tags: jdbc, apache-spark, pyspark, py4j

I am trying to load my Postgres database into Spark using PySpark:

from pyspark import SparkConf, SparkContext
from pyspark.sql import SQLContext

# Spark configuration: connect to the standalone master
conf = SparkConf()
conf.setMaster("spark://spark-master:7077")
conf.setAppName('pyspark')

sc = SparkContext(conf=conf)
sqlContext = SQLContext(sc)

# JDBC connection properties for Postgres
properties = {
    "user": "postgres",
    "password": "password123",
    "driver": "org.postgresql.Driver"
}
# JDBC URLs use a colon after "jdbc", not a dot
url = "jdbc:postgresql://<POSTGRES_IP>/DB_NAME"
df = sqlContext.read.jdbc(url=url, table='myTable', properties=properties)

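Pointing the SQL warehouse directory setting (spark.sql.warehouse.dir) at an existing directory should resolve your problem. For example, when launching the job: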

./bin/spark-submit --conf spark.sql.warehouse.dir=/tmp/ \
... # other options
your_file.py \
[application-arguments]
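
If you would rather not pass the flag on every launch, the same setting can probably be applied in code before the context is created. The sketch below is an assumption, not something tested against your cluster: `warehouse_uri` and `build_sql_context` are hypothetical helper names, and the warehouse path is a placeholder. It creates the directory if it is missing and hands Spark an absolute `file:///` URI, which is the form the error message says it expects.

```python
import tempfile
from pathlib import Path


def warehouse_uri(directory):
    """Create the directory if needed and return it as an absolute file: URI."""
    path = Path(directory).resolve()
    path.mkdir(parents=True, exist_ok=True)
    return path.as_uri()  # absolute paths yield a file:/// URI


def build_sql_context(master, warehouse_dir):
    """Hypothetical helper: build a SQLContext with an explicit warehouse dir."""
    # Imported lazily so warehouse_uri can be used without PySpark installed.
    from pyspark import SparkConf, SparkContext
    from pyspark.sql import SQLContext

    conf = SparkConf()
    conf.setMaster(master)
    conf.setAppName("pyspark")
    # Same setting as the --conf flag, applied before the context starts.
    conf.set("spark.sql.warehouse.dir", warehouse_uri(warehouse_dir))
    return SQLContext(SparkContext(conf=conf))


# Demo of the path helper alone; no Spark is needed for this part.
demo_uri = warehouse_uri(Path(tempfile.gettempdir()) / "spark-warehouse")
print(demo_uri)
```

With PySpark available, `sqlContext = build_sql_context("spark://spark-master:7077", "/tmp/spark-warehouse")` would take the place of the context setup in the question. Note that the directory is created on the machine running the driver.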