
Python: Unable to run a simple hql file with pyspark


I'm using pyspark==2.4.3 and I just want to run an hql file:

use myDatabaseName;
show tables;
This is what I tried:

from os.path import expanduser, join, abspath

from pyspark.sql import SparkSession
from pyspark.sql import Row

# warehouse_location points to the default location for managed databases and tables
warehouse_location = abspath('spark-warehouse')

spark = SparkSession \
    .builder \
    .appName("Python Spark SQL Hive integration example") \
    .config("spark.sql.warehouse.dir", warehouse_location) \
    .enableHiveSupport() \
    .getOrCreate()

with open('full/path/to/my/hqlfile') as t:
    q=t.read()

print q
'use myDatabaseName;show tables;\n'
spark.sql(q)
But I get:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/some/path/python2.7/site-packages/pyspark/sql/session.py", line 767, in sql
    return DataFrame(self._jsparkSession.sql(sqlQuery), self._wrapped)
  File "/some/path/python2.7/site-packages/py4j/java_gateway.py", line 1257, in __call__
    answer, self.gateway_client, self.target_id, self.name)
  File "/some/path/python2.7/site-packages/pyspark/sql/utils.py", line 73, in deco
    raise ParseException(s.split(': ', 1)[1], stackTrace)
pyspark.sql.utils.ParseException: u"\nmismatched input ';' expecting <EOF>(line 1, pos 11)\n\n== SQL ==\nuse myDatabaseName;show tables;\n-----------^^^\n"

What am I doing wrong?

Like the error suggests, ; is not valid syntax inside spark.sql.

Secondly, you can't run two commands in a single spark.sql call.

I would turn q into a list of query strings without the ; in them, and then loop over them:

query_lt = q.split(";")[:-1]
for qs in query_lt:
    spark.sql(qs)
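
For reference, here's a slightly more defensive variant (a minimal sketch, reusing the spark session and file path from the question): stripping whitespace and dropping empty fragments avoids passing the trailing '\n' after the last semicolon, or any blank statement, to spark.sql.

with open('full/path/to/my/hqlfile') as t:
    q = t.read()

# Split on ';', trim whitespace, and skip empty fragments
# (e.g. the trailing newline after the last semicolon).
statements = [s.strip() for s in q.split(";") if s.strip()]

for stmt in statements:
    spark.sql(stmt)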
