Getting pyspark.sql.utils.ParseException: mismatched input '(' expecting {<EOF>, ...}
I am trying to rank the "Product" column based on revenue in the following dataframe, salesDF:
salesDF=
+-------------+-------+---------+----------+-------+
|transactionID|Product| category|produtType|Revenue|
+-------------+-------+---------+----------+-------+
| 105| Lenova| laptop| high| 40000|
| 111| Lenova| tablet| medium| 20000|
| 103| dell| laptop| medum| 25000|
| 107| iphone|cellPhone| small| 70000|
| 113| lenovo|cellPhone| medium| 8000|
|          108|     mi|cellPhone|     medum|  10000|
+-------------+-------+---------+----------+-------+
Below I am using Spark SQL to rank each product based on its revenue:
rankTheRevenue= salesDF.createTempView("Ranking_DF")
rankProduct= session.sql("select Product, Revenue, rank() over(partion by Product order by Revenue) as Rank_revenue from Ranking_DF")
rankProduct.show()
But I am getting the following error:
pyspark.sql.utils.ParseException:
mismatched input '(' expecting {<EOF>, ',', 'CLUSTER', 'DISTRIBUTE', 'EXCEPT', 'FROM', 'GROUP', 'HAVING', 'INTERSECT', 'LATERAL', 'LIMIT', 'ORDER', 'MINUS', 'SORT', 'UNION', 'WHERE', 'WINDOW'}(line 1, pos 36)
I would appreciate it if someone could help me with this problem.
Thanks in advance.
You have a typo in the OVER clause: `partion by` should be `partition by`.
Try:
rankTheRevenue= salesDF.createTempView("Ranking_DF")
rankProduct= session.sql("select Product, Revenue, rank() over(partition by Product order by Revenue) as Rank_revenue from Ranking_DF")
rankProduct.show()