PostgreSQL: Populating an empty Postgres database from PySpark
I want to write the PySpark DataFrame json_df into a completely empty Postgres database (no schemas and no tables). I am using the code below, but there is a problem with the table I pass in the options of the write statement. The error points to a driver problem, but I have already updated the driver, so I think my code is simply wrong. Any help would be appreciated.

My code:
database = "postgres"
jdbcUrl = f"jdbc:postgres://localhost:5432;databaseName={database}"
schema = StructType([
StructField('first_column', StringType(), True),
StructField('second_columns', StringType(), True),
])
df = sqlContext.createDataFrame(sc.emptyRDD(),schema)
json_df.select("first_column","second_column").write.format("jdbc") \
.mode("overwrite") \
.option("url", jdbcUrl) \
.option("user", user) \
.option("dbtable", df) \
.save()
Error:
Traceback (most recent call last):
File "etl.py", line 98, in <module>
.option("dbtable", df) \
File "/home/ubuntu/.local/lib/python3.6/site-packages/pyspark/sql/readwriter.py", line 825, in save
self._jwrite.save()
File "/home/ubuntu/.local/lib/python3.6/site-packages/py4j/java_gateway.py", line 1305, in __call__
answer, self.gateway_client, self.target_id, self.name)
File "/home/ubuntu/.local/lib/python3.6/site-packages/pyspark/sql/utils.py", line 128, in deco
return f(*a, **kw)
File "/home/ubuntu/.local/lib/python3.6/site-packages/py4j/protocol.py", line 328, in get_return_value
format(target_id, ".", name), value)
py4j.protocol.Py4JJavaError: An error occurred while calling o74.save.
: java.sql.SQLException: No suitable driver
The error is:
File "etl.py", line 104, in <module>
.option("dbtable", "new_schema.json_df") \
File "/home/ubuntu/.local/lib/python3.6/site-packages/pyspark/sql/readwriter.py", line 825, in save
self._jwrite.save()
File "/home/ubuntu/.local/lib/python3.6/site-packages/py4j/java_gateway.py", line 1305, in __call__
answer, self.gateway_client, self.target_id, self.name)
File "/home/ubuntu/.local/lib/python3.6/site-packages/pyspark/sql/utils.py", line 128, in deco
return f(*a, **kw)
File "/home/ubuntu/.local/lib/python3.6/site-packages/py4j/protocol.py", line 328, in get_return_value
format(target_id, ".", name), value)
py4j.protocol.Py4JJavaError: An error occurred while calling o53.save.
: java.sql.SQLException: No suitable driver
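The JDBC URL should use postgresql rather than postgres as the subprotocol, and databaseName= does not need to be specified; the database name goes in the URL path: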
jdbcUrl = f"jdbc:postgresql://localhost:5432/{database}"
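Putting the corrected URL together with the rest of the write call, a minimal sketch could look like the following. It assumes the PostgreSQL JDBC driver jar is already available to Spark and that the target table name is passed as a plain string (the question passed a DataFrame to dbtable, which is not a table name). The helper names build_postgres_jdbc_url, build_jdbc_write_options and write_dataframe, the password parameter, and the table name "json_df" are hypothetical and only used for illustration.

```python
def build_postgres_jdbc_url(host, port, database):
    """Build a PostgreSQL JDBC URL: subprotocol 'postgresql', database in the path."""
    return f"jdbc:postgresql://{host}:{port}/{database}"


def build_jdbc_write_options(url, table, user, password):
    """Collect the options for a Spark JDBC write.

    'dbtable' has to be the table name as a string, not a DataFrame.
    """
    if not isinstance(table, str):
        raise TypeError("dbtable must be a table name (str), not " + type(table).__name__)
    return {
        "url": url,
        "dbtable": table,
        "user": user,
        "password": password,
        # Naming the driver class explicitly is an assumption that it helps here;
        # the jar itself must still be on Spark's classpath.
        "driver": "org.postgresql.Driver",
    }


def write_dataframe(df, options, mode="overwrite"):
    """Write a Spark DataFrame through JDBC using the given options."""
    df.write.format("jdbc").mode(mode).options(**options).save()


# Hypothetical usage, with json_df, user and password defined as in the question:
# url = build_postgres_jdbc_url("localhost", 5432, "postgres")
# opts = build_jdbc_write_options(url, "json_df", user, password)
# write_dataframe(json_df.select("first_column", "second_column"), opts)
```

With mode("overwrite"), Spark creates the table if it does not exist, so the empty DataFrame built from the schema in the question should not be needed just to define the target table.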