JDBC IllegalArgumentException: u'Wrong FS: file://spark-warehouse, expected: file:///'
I am trying to load my Postgres database into Spark using PySpark:
from pyspark import SparkContext
from pyspark import SparkConf
from random import random
#spark conf
conf = SparkConf()
conf.setMaster("spark://spark-master:7077")
conf.setAppName('pyspark')
sc = SparkContext(conf=conf)
from pyspark.sql import SQLContext
sqlContext = SQLContext(sc)
properties = {
"user": "postgres",
"password": "password123",
"driver": "org.postgresql.Driver"
}
url = "jdbc:postgresql://<POSTGRES_IP>/DB_NAME"
df = sqlContext.read.jdbc(url=url, table='myTable', properties=properties)
Specifying an existing directory for the SQL warehouse setting (spark.sql.warehouse.dir) should resolve your problem. For example, when launching the job:
./bin/spark-submit --conf spark.sql.warehouse.dir=/tmp/ \
... # other options
your_file.py \
[application-arguments]