Apache Spark: when saving a DataFrame to Hive, how do I change date columns in the DataFrame schema to string without hard-coding column names?
Tags: apache-spark, pyspark

I want to convert a pandas DataFrame to a Spark DataFrame and save it to Hive:
# create a Spark DataFrame from the pandas DataFrame
df = self.ss.createDataFrame(dataframe)
df.createOrReplaceTempView("table_Template")
self.ss.sql("create table IF NOT EXISTS database." + table_name + " STORED AS PARQUET as select * from table_Template")
Error:
pyspark.sql.utils.AnalysisException: 'org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.UnsupportedOperationException: Parquet does not support date. See HIVE-6384'

Try the code below, which casts all date-typed columns to string without naming them:
from pyspark.sql import functions as F

# typeName is a method, so it must be called; iterate over the schema fields
df.select([F.col(f.name).cast("string") if f.dataType.typeName() == "date" else F.col(f.name) for f in df.schema.fields]).show()