
Apache Spark: loading an RDD into Hive

Tags: apache-spark, dataframe, hive, pyspark, pyspark-sql

I want to load an RDD (k=table_name, v=content) into a partitioned Hive table (year, month, day) using pyspark on Spark 1.6.x.

I am trying to reproduce the logic of this SQL query:

ALTER TABLE db_schema.%FILENAME_WITHOUT_EXTENSION% DROP IF EXISTS PARTITION (year=%YEAR%, month=%MONTH%, day=%DAY%);
LOAD DATA INTO TABLE db_schema.%FILENAME_WITHOUT_EXTENSION% PARTITION (year=%YEAR%, month=%MONTH%, day=%DAY%);

Can anyone offer some advice?

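from pyspark.sql import SparkSession
# enableHiveSupport() connects the session to the Hive metastore
spark = SparkSession.builder.enableHiveSupport().getOrCreate()
rdd = spark.sparkContext.parallelize([(1, 'cat', '2016-12-20'), (2, 'dog', '2016-12-21')])
df = spark.createDataFrame(rdd, schema=['id', 'val', 'dt'])
# Write an ORC-backed table partitioned by dt, replacing any existing table
df.write.saveAsTable(name='default.test', format='orc', mode='overwrite', partitionBy='dt')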
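Use enableHiveSupport() together with df.write.saveAsTable(). Note that SparkSession was introduced in Spark 2.0; on Spark 1.6.x the Hive-enabled entry point is HiveContext.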
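
For Spark 1.6.x and the (year, month, day) layout from the question, a rough sketch along these lines may be closer to what was asked. It swaps the DROP PARTITION plus LOAD DATA pair for a single INSERT OVERWRITE into a static partition, run through a HiveContext. Everything named here is an assumption for illustration: the helpers insert_overwrite_sql and load_partition, the table db_schema.animals, and the columns id and val are hypothetical, and the target table is assumed to exist already with matching non-partition columns.

```python
def insert_overwrite_sql(table, staging, columns, year, month, day):
    """Build a HiveQL statement that replaces one (year, month, day) partition."""
    # int() keeps the partition values numeric, whatever type the caller passes.
    spec = "year={:d}, month={:d}, day={:d}".format(int(year), int(month), int(day))
    return "INSERT OVERWRITE TABLE {} PARTITION ({}) SELECT {} FROM {}".format(
        table, spec, ", ".join(columns), staging)


def load_partition(sql_context, rows, columns, table, year, month, day):
    """Write rows (an RDD or list of tuples) into one partition of table."""
    df = sql_context.createDataFrame(rows, schema=columns)
    # Hypothetical naming scheme for the temporary staging table.
    staging = "staging_" + table.split(".")[-1]
    df.registerTempTable(staging)  # Spark 1.x API for temporary tables
    stmt = insert_overwrite_sql(table, staging, columns, year, month, day)
    sql_context.sql(stmt)
    return stmt


def example_driver():
    # Imported here so the helpers above can be read and tested without Spark.
    from pyspark import SparkContext
    from pyspark.sql import HiveContext

    sc = SparkContext(appName="rdd-to-hive")
    sql_context = HiveContext(sc)
    rdd = sc.parallelize([(1, "cat"), (2, "dog")])
    # Assumed table and columns; replace with the real ones.
    load_partition(sql_context, rdd, ["id", "val"], "db_schema.animals", 2016, 12, 20)
```

Calling example_driver() on a cluster with Hive configured would overwrite only the year=2016, month=12, day=20 partition and leave the others untouched. For the RDD keyed by table name, one option is to call load_partition once per key, passing that key's rows and target table.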