Apache Spark: writing spark.sql DataFrame results to a Parquet file
Tags: apache-spark, hive, pyspark, hdfs

I have started the following spark.sql session:
# creating the Spark session with Hive support enabled
from pyspark.sql import SparkSession
spark = (SparkSession.builder.appName("appName").enableHiveSupport().getOrCreate())
and I can see the results of the following query:
spark.sql("select year(plt_date) as Year, month(plt_date) as Month, count(build) as B_Count, count(product) as P_Count from first_table full outer join second_table on key1=CONCAT('SS',key_2) group by year(plt_date), month(plt_date)").show()
However, when I try to write the DataFrame produced by this query to HDFS, I get the following error:
I am able to save the result DataFrame of a simpler version of this query to the same path; the problem appears once I add functions such as count(), year(), and so on.
What is the problem, and how can I save the results to HDFS?

The error occurs because of the "(" appearing in the generated column name "year(CAST(plt_date AS DATE))" — Parquet does not accept such characters in column names.
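As an illustrative sketch (the helper name is my own, not from the question), names like the one in the error can be sanitized before writing, since Parquet rejects column names containing characters such as " ,;{}()\n\t=":

```python
import re

def sanitize_columns(columns):
    """Replace characters Parquet rejects in column names
    (space , ; { } ( ) newline tab =) with underscores,
    trimming any leftover leading/trailing underscores."""
    return [re.sub(r"[ ,;{}()\n\t=]+", "_", c).strip("_") for c in columns]

# example: the kind of names Spark generates for unaliased expressions
print(sanitize_columns(["year(CAST(plt_date AS DATE))", "count(build)"]))
# -> ['year_CAST_plt_date_AS_DATE', 'count_build']
```

The cleaned names could then be applied with df.toDF(*sanitize_columns(df.columns)) before calling df.write.parquet(...).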
To rename the column:
data = data.selectExpr("year(CAST(plt_date AS DATE)) as nameofcolumn")
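More generally, every computed expression can be given an alias directly in the SQL, so that no output column name contains parentheses. A small sketch that builds such a query string (table and column names taken from the question; the alias mapping is my own):

```python
# map each computed expression to a Parquet-safe alias
exprs = {
    "year(CAST(plt_date AS DATE))": "Year",
    "month(CAST(plt_date AS DATE))": "Month",
    "count(build)": "B_Count",
    "count(product)": "P_Count",
}
select_list = ", ".join(f"{expr} as {alias}" for expr, alias in exprs.items())
query = (
    f"select {select_list} from first_table "
    "full outer join second_table on key1 = CONCAT('SS', key_2) "
    "group by year(plt_date), month(plt_date)"
)
print(query)
```

spark.sql(query).write.parquet(...) should then succeed, because every output column carries a plain alias rather than an expression-derived name.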
Please upvote if this works.