Apache Spark: How to use a different Hive metastore for saveAsTable?


apache-spark hive pyspark


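I am using Spark SQL (Spark 1.6.1) from PySpark, and I need to load a table from one Hive metastore and write the resulting DataFrame into a different Hive metastore.

How can I use two different metastores in a single Spark SQL script?

This is what my script looks like: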

from pyspark import SparkContext
from pyspark.sql import HiveContext

# Hive metastore 1
sc1 = SparkContext()
hiveContext1 = HiveContext(sc1)
hiveContext1.setConf("hive.metastore.warehouse.dir", "tmp/Metastore1")

# Hive metastore 2
sc2 = SparkContext()
hiveContext2 = HiveContext(sc2)
hiveContext2.setConf("hive.metastore.warehouse.dir", "tmp/Metastore2")

# Read from a table present in metastore 1
df_extract = hiveContext1.sql("select * from emp where emp_id = 1")

# Need to write the result into a table in the other metastore (metastore 2)
df_extract.saveAsTable('targetdbname.target_table', mode='append', path='maprfs:///abc/datapath...')

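TL;DR: It is not possible to use one Hive metastore for some tables and another one for other tables.

Spark SQL supports only a single Hive metastore per application, regardless of the number of SparkSessions, so reading from one Hive metastore and writing to another is technically impossible.

HotelsDotCom developed an application, WaggleDance, specifically for this purpose. By using it as a proxy in front of your metastores, you should be able to achieve your goal.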
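As a rough illustration of the proxy approach, the sketch below points a single HiveContext at one Thrift metastore endpoint and does both the read and the write through it. It is not taken from the answers above. The host name `waggle-dance.example.com` is hypothetical, and port 48869 is assumed to be the port the proxy listens on. The helpers `metastore_conf` and `copy_rows` are made up for this example. It also assumes that `hive.metastore.uris` is applied before the first metastore access; setting it in `hive-site.xml` is the more common route.

```python
def metastore_conf(host, port):
    """Build the Hive setting that points a client at one Thrift metastore endpoint."""
    return {"hive.metastore.uris": "thrift://{}:{}".format(host, port)}


def copy_rows(conf, query, target_table, path):
    """Run `query` and append the result to `target_table`, all through one metastore.

    pyspark is imported here so that the helper above can be used without Spark installed.
    """
    from pyspark import SparkContext
    from pyspark.sql import HiveContext

    sc = SparkContext()
    hive_context = HiveContext(sc)
    # Assumption: these settings are applied before the metastore is first contacted.
    for key, value in conf.items():
        hive_context.setConf(key, value)

    df = hive_context.sql(query)
    df.write.mode("append").option("path", path).saveAsTable(target_table)


# Hypothetical proxy endpoint; replace with the address of your own deployment.
proxy_conf = metastore_conf("waggle-dance.example.com", 48869)
```

With such a proxy in place, the script needs only one SparkContext and one HiveContext. How the source and target databases are named when seen through the proxy depends on how the proxy is configured.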