PostgreSQL: Populating an empty Postgres database from PySpark

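I want to write a PySpark DataFrame, json_df, into a completely empty Postgres database (no schemas and no tables). I am using the code below, but something is wrong with the table I pass in the options of the write statement. The error points to a driver problem, but I have already updated the driver, so I think my code is simply wrong. Any help would be appreciated.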

My code:

database = "postgres"

jdbcUrl = f"jdbc:postgres://localhost:5432;databaseName={database}"

schema = StructType([
  StructField('first_column', StringType(), True),
  StructField('second_columns', StringType(), True),
  ])
df = sqlContext.createDataFrame(sc.emptyRDD(),schema)

json_df.select("first_column","second_column").write.format("jdbc") \
  .mode("overwrite") \
  .option("url", jdbcUrl) \
  .option("user", user) \
  .option("dbtable", df) \
  .save()

Error:

Traceback (most recent call last):
  File "etl.py", line 98, in <module>
    .option("dbtable", df) \
  File "/home/ubuntu/.local/lib/python3.6/site-packages/pyspark/sql/readwriter.py", line 825, in save
    self._jwrite.save()
  File "/home/ubuntu/.local/lib/python3.6/site-packages/py4j/java_gateway.py", line 1305, in __call__
    answer, self.gateway_client, self.target_id, self.name)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/pyspark/sql/utils.py", line 128, in deco
    return f(*a, **kw)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/py4j/protocol.py", line 328, in get_return_value
    format(target_id, ".", name), value)
py4j.protocol.Py4JJavaError: An error occurred while calling o74.save.
: java.sql.SQLException: No suitable driver

With dbtable set to "new_schema.json_df", the error is:

 File "etl.py", line 104, in <module>
    .option("dbtable", "new_schema.json_df") \
  File "/home/ubuntu/.local/lib/python3.6/site-packages/pyspark/sql/readwriter.py", line 825, in save
    self._jwrite.save()
  File "/home/ubuntu/.local/lib/python3.6/site-packages/py4j/java_gateway.py", line 1305, in __call__
    answer, self.gateway_client, self.target_id, self.name)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/pyspark/sql/utils.py", line 128, in deco
    return f(*a, **kw)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/py4j/protocol.py", line 328, in get_return_value
    format(target_id, ".", name), value)
py4j.protocol.Py4JJavaError: An error occurred while calling o53.save.
: java.sql.SQLException: No suitable driver
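
The "No suitable driver" error appears because no registered JDBC driver accepts a URL that starts with jdbc:postgres:. You should use postgresql instead of postgres as the URL scheme, and you do not need to specify databaseName=, because the database name goes in the URL path: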

jdbcUrl = f"jdbc:postgresql://localhost:5432/{database}"
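
Putting the corrected URL together with the rest of the write call, a minimal sketch might look like the following. It assumes the PostgreSQL JDBC jar is already on Spark's classpath, that the target is the default public schema (Spark creates a missing table, but as far as I know it does not create a missing schema such as new_schema), and that dbtable is given as a table name string rather than a DataFrame. The helper names, the credentials my_user / my_password, and the table name public.json_df are hypothetical placeholders, not part of the original post.

```python
def build_jdbc_url(host: str, port: int, database: str) -> str:
    """Build a PostgreSQL JDBC URL; the database name is the URL path."""
    # The scheme must be "postgresql"; "jdbc:postgres://" matches no driver.
    return f"jdbc:postgresql://{host}:{port}/{database}"


def build_write_options(url: str, user: str, password: str, table: str) -> dict:
    """Collect the options passed to Spark's JDBC writer."""
    return {
        "url": url,
        "user": user,
        "password": password,
        # dbtable is a table name string, not a DataFrame object.
        "dbtable": table,
        # Name the driver class explicitly; the jar must still be on the classpath.
        "driver": "org.postgresql.Driver",
    }


def write_dataframe(df, options: dict, mode: str = "overwrite") -> None:
    """Write a Spark DataFrame over JDBC using the given options."""
    df.write.format("jdbc").mode(mode).options(**options).save()


# Placeholder connection details (assumptions for illustration only).
jdbc_url = build_jdbc_url("localhost", 5432, "postgres")
options = build_write_options(jdbc_url, "my_user", "my_password", "public.json_df")

# With a live SparkSession and the question's json_df:
# write_dataframe(json_df.select("first_column", "second_column"), options)
```

Keeping the URL and options in plain functions makes them easy to check without a running Spark session or database; only write_dataframe touches Spark.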