
Python Spark SQL: passing a variable into a query


I have searched everywhere for an answer to this and tried everything; nothing seems to work. I am trying to reference a variable assignment inside a spark.sql query in Python. Running Python 3 and Spark version 2.3.1.

bkt = 1

prime = spark.sql(s"SELECT ((year(fdr_date))*100)+month(fdr_date) as fdr_year, count(*) as counts\
                FROM pwrcrv_tmp\
                where EXTR_CURR_NUM_CYC_DLQ=$bkt\
                and EXTR_ACCOUNT_TYPE in('PS','PT','PD','PC','HV','PA')\
                group by ((year(fdr_date))*100)+month(fdr_date)\
                order by ((year(fdr_date))*100)+month(fdr_date)")

prime.show(50)
The error:

prime = spark.sql(s"SELECT ((year(fdr_date))*100)+month(fdr_date) as fdr_year, count(*) as counts                FROM pwrcrv_tmp         where EXTR_CURR_NUM_CYC_DLQ=$bkt                and EXTR_ACCOUNT_TYPE in('PS','PT','PD','PC','HV','PA')                group by ((year(fdr_date))*100)+month(fdr_date)                order by ((year(fdr_date))*100)+month(fdr_date)")
                                                                                                                                                                                                                                                                                                                                                                                         ^
SyntaxError: invalid syntax

I found the correct syntax in this Databricks article.

Add a lowercase f in front of the query string and wrap the variable name inside the query in curly braces:

bkt = 1

prime = spark.sql(f"SELECT ((year(fdr_date))*100)+month(fdr_date) as fdr_year, count(*) as counts\
            FROM pwrcrv_tmp\
            where EXTR_CURR_NUM_CYC_DLQ={bkt}\
            and EXTR_ACCOUNT_TYPE in('PS','PT','PD','PC','HV','PA')\
            group by ((year(fdr_date))*100)+month(fdr_date)\
            order by ((year(fdr_date))*100)+month(fdr_date)")


prime.show(50)
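Because the f-string substitution happens in Python before Spark ever sees the query, it can be checked on the plain string without a Spark session. A minimal sketch, reusing the table and column names from the question:

```python
# Build the same query text with an f-string; {bkt} is replaced
# by Python before the string would be handed to spark.sql().
bkt = 1
query = (
    f"SELECT ((year(fdr_date))*100)+month(fdr_date) as fdr_year, count(*) as counts "
    f"FROM pwrcrv_tmp "
    f"where EXTR_CURR_NUM_CYC_DLQ={bkt} "
    f"and EXTR_ACCOUNT_TYPE in('PS','PT','PD','PC','HV','PA') "
    f"group by ((year(fdr_date))*100)+month(fdr_date) "
    f"order by ((year(fdr_date))*100)+month(fdr_date)"
)

# The literal value 1 is now baked into the SQL text.
print("EXTR_CURR_NUM_CYC_DLQ=1" in query)
```

Using parenthesized adjacent string literals instead of backslash line continuations also sidesteps the stray-whitespace problems that `\` inside a multi-line string can cause.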

Is this a question? Also not sure why more code was posted in the comments; please read that as well. First, s"…" is a syntax error in Python - what is that supposed to mean? Second, trying to use $bkt to format a string is not valid Python syntax.

My post title is my question. I got the s"…" from an answer that was marked correct on stackoverflow.

@email83 I don't know what language that is, but the answer you're looking for is: