MySQL "show tables" and "describe" queries do not work in PySpark


I am new to PySpark. I am trying to run SQL queries against a MySQL database from PySpark. Fetching table data with a SELECT query works fine. However, I run into problems when I try to list the tables in the database by executing "show tables". I tried the following variations of the query, but none of them work:

 - "(show tables)" 
 - "show tables" 
 - "(show tables) as list_tables" 
 - "show tables as list_tables" 
 - "(show tables system-dev)" 
 - "(show tables from system-dev)"
A "describe table_name" query also fails in PySpark, with the same error log.

So, is there a mistake in my code, or are these commands simply not supported in PySpark?

Code:

from pyspark.sql import SparkSession

if __name__ == "__main__":
    print("Read Mysql table Demo - application started ...")

spark = SparkSession \
        .builder \
        .appName("Read mysql table demo") \
        .master("local[*]") \
        .config("spark.jars","file:///home/dhruv/Desktop/mysql-connector-java-8.0.22.jar") \
        .enableHiveSupport() \
        .getOrCreate()

spark.sparkContext.setLogLevel("ERROR")

mysql_db_driver_class = "com.mysql.jdbc.Driver"
table_name ="call_slot"
host_name = "192.168.4.61"
port_no="3306"
user_name="dhruv"
password="asdqwepoi"
database_name="candidate_screen_dev"

mysql_select_query = None
# mysql_select_query = "(select * from " + table_name + ") as call_slot"  # works
# mysql_select_query = "(DESCRIBE table call_slot) as call_slot"          # does not work
mysql_select_query = "(show tables system-dev)"                           # does not work

print("Printing mysql_select_query:")
print(mysql_select_query)

mysql_jdbc_url = "jdbc:mysql://" + host_name + ":" + port_no + "/" + database_name

print("Printing JDBC Url: " + mysql_jdbc_url)

trans_detail_tbl_data_df = spark.read.format("jdbc") \
    .option("url", mysql_jdbc_url) \
    .option("driver", mysql_db_driver_class) \
    .option("dbtable", mysql_select_query) \
    .option("user", user_name) \
    .option("password", password) \
    .load()

trans_detail_tbl_data_df.show()

print("Read MySQL Table Demo - Application Completed.")
Error:

Printing mysql_select_query:
(show TABLES)
Printing JDBC Url: jdbc:mysql://192.168.4.61:3306/candidate_screen_dev
Loading class `com.mysql.jdbc.Driver'. This is deprecated. The new driver class is `com.mysql.cj.jdbc.Driver'. The driver is automatically registered via the SPI and manual loading of the driver class is generally unnecessary.
Traceback (most recent call last):
  File "spark_trial.py", line 51, in <module>
    .option("password", password) \
  File "/home/dhruv/.local/lib/python3.6/site-packages/pyspark/sql/readwriter.py", line 184, in load
    return self._df(self._jreader.load())
  File "/home/dhruv/.local/lib/python3.6/site-packages/py4j/java_gateway.py", line 1305, in __call__
    answer, self.gateway_client, self.target_id, self.name)
  File "/home/dhruv/.local/lib/python3.6/site-packages/pyspark/sql/utils.py", line 128, in deco
    return f(*a, **kw)
  File "/home/dhruv/.local/lib/python3.6/site-packages/py4j/protocol.py", line 328, in get_return_value
    format(target_id, ".", name), value)
py4j.protocol.Py4JJavaError: An error occurred while calling o52.load.
: java.sql.SQLSyntaxErrorException: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'show TABLES) WHERE 1=0' at line 1
        at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:120)
        at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:97)
        at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:122)
        at com.mysql.cj.jdbc.ClientPreparedStatement.executeInternal(ClientPreparedStatement.java:953)
        at com.mysql.cj.jdbc.ClientPreparedStatement.executeQuery(ClientPreparedStatement.java:1003)
        at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:61)
        at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation$.getSchema(JDBCRelation.scala:226)
        at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:35)
        at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:344)
        at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:297)
        at org.apache.spark.sql.DataFrameReader.$anonfun$load$2(DataFrameReader.scala:286)
        at scala.Option.getOrElse(Option.scala:189)
        at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:286)
        at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:221)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
        at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
        at py4j.Gateway.invoke(Gateway.java:282)
        at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
        at py4j.commands.CallCommand.execute(CallCommand.java:79)
        at py4j.GatewayConnection.run(GatewayConnection.java:238)
        at java.lang.Thread.run(Thread.java:748)
Try this:
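The error log hints at the cause: Spark's JDBC source wraps whatever you pass as `dbtable` in a schema-probing statement like `SELECT * FROM (<dbtable>) WHERE 1=0`, so only expressions that are valid inside a FROM clause can be used. `SHOW TABLES` and `DESCRIBE` are MySQL utility statements, not selectable queries, so the wrapping produces invalid SQL regardless of how you parenthesize or alias them. The equivalent metadata is available through ordinary SELECTs against `information_schema`, which the JDBC source can wrap normally. A minimal sketch, assuming your existing `spark`, `mysql_jdbc_url`, `user_name`, and `password` variables (the helper names below are illustrative, not a Spark API):

```python
def list_tables_query(database_name):
    """Selectable subquery equivalent to SHOW TABLES for one database."""
    return ("(SELECT table_name FROM information_schema.tables "
            "WHERE table_schema = '{}') AS tbls".format(database_name))

def describe_table_query(database_name, table_name):
    """Selectable subquery roughly equivalent to DESCRIBE table_name."""
    return ("(SELECT column_name, column_type, is_nullable, column_key "
            "FROM information_schema.columns "
            "WHERE table_schema = '{}' AND table_name = '{}') AS cols"
            .format(database_name, table_name))

def read_query(spark, jdbc_url, user, password, subquery):
    """Load a wrapped subquery through Spark's JDBC source."""
    return (spark.read.format("jdbc")
            .option("url", jdbc_url)
            .option("driver", "com.mysql.cj.jdbc.Driver")
            .option("dbtable", subquery)
            .option("user", user)
            .option("password", password)
            .load())
```

Usage in your script would then be, for example, `read_query(spark, mysql_jdbc_url, user_name, password, list_tables_query(database_name)).show()` to list tables, and the same call with `describe_table_query(database_name, "call_slot")` to inspect columns. Note these subqueries assume the JDBC user has read access to `information_schema` for that database, which MySQL grants for any objects the user can already see.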