Apache Spark: how do I use a specific directory for the metastore with HiveContext?


This is what I tried in the Spark shell:

scala> import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql.hive.HiveContext

scala> import java.nio.file.Files
import java.nio.file.Files

scala> val hiveDir = Files.createTempDirectory("hive")
hiveDir: java.nio.file.Path = /var/folders/gg/g3hk6fcj4rxc6lb1qsvxc_vdxxwf28/T/hive5050481206678469338

scala> val hiveContext = new HiveContext(sc)
15/12/31 12:05:27 INFO HiveContext: Initializing execution hive, version 0.13.1
hiveContext: org.apache.spark.sql.hive.HiveContext = org.apache.spark.sql.hive.HiveContext@6f959640

scala> hiveContext.sql(s"SET hive.metastore.warehouse.dir=${hiveDir.toUri}")
15/12/31 12:05:34 INFO HiveContext: Initializing HiveMetastoreConnection version 0.13.1 using Spark classes.
...
res0: org.apache.spark.sql.DataFrame = [: string]

scala> Seq("create database foo").foreach(hiveContext.sql)
15/12/31 12:05:42 INFO ParseDriver: Parsing command: create database foo
15/12/31 12:05:42 INFO ParseDriver: Parse Completed
...
15/12/31 12:05:43 INFO HiveMetaStore: 0: create_database: Database(name:foo, description:null, locationUri:null, parameters:null, ownerName:aa8y, ownerType:USER)
15/12/31 12:05:43 INFO audit: ugi=aa8y ip=unknown-ip-addr  cmd=create_database: Database(name:foo, description:null, locationUri:null, parameters:null, ownerName:aa8y, ownerType:USER)

15/12/31 12:05:43 INFO HiveMetaStore: 0: get_database: foo
15/12/31 12:05:43 INFO audit: ugi=aa8y ip=unknown-ip-addr  cmd=get_database: foo
15/12/31 12:05:43 ERROR RetryingHMSHandler: MetaException(message:Unable to create database path file:/user/hive/warehouse/foo.db, failed to create database foo)
  at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_database_core(HiveMetaStore.java:734)
  ...
  at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

15/12/31 12:05:43 ERROR DDLTask: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Unable to create database path file:/user/hive/warehouse/foo.db, failed to create database foo)
  at org.apache.hadoop.hive.ql.metadata.Hive.createDatabase(Hive.java:248)
  ...
  at org.apache.hadoop.hive.ql.metadata.Hive.createDatabase(Hive.java:242)
  ... 78 more

15/12/31 12:05:43 ERROR Driver: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Unable to create database path file:/user/hive/warehouse/foo.db, failed to create database foo)
15/12/31 12:05:43 ERROR ClientWrapper:
======================
HIVE FAILURE OUTPUT
======================
SET hive.metastore.warehouse.dir=file:///var/folders/gg/g3hk6fcj4rxc6lb1qsvxc_vdxxwf28/T/hive5050481206678469338/
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Unable to create database path file:/user/hive/warehouse/foo.db, failed to create database foo)
======================
END HIVE FAILURE OUTPUT
======================

org.apache.spark.sql.execution.QueryExecutionException: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Unable to create database path file:/user/hive/warehouse/foo.db, failed to create database foo)
  at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$runHive$1.apply(ClientWrapper.scala:349)
  ...
  at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)


scala>
But it doesn't seem to pick up the directory I'm setting. I've trimmed the stack traces above because they are very verbose; the full stack trace is …

I'm not sure what I'm doing wrong. Any help would be greatly appreciated.

Does the path/directory /user/hive/warehouse/ exist on your local filesystem? This looks more like a permissions problem on the local filesystem. Try creating the directory/path manually on the local filesystem, then run the program again.
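
A minimal sketch of that suggestion, assuming the default warehouse path from the error message (file:/user/hive/warehouse) and that the JVM user is allowed to create it; creating a directory directly under / usually requires elevated privileges:

import java.nio.file.{Files, Paths}

// Pre-create the warehouse directory the metastore is falling back to,
// so that "create database foo" can create foo.db beneath it.
// The path is taken from the error message above; adjust it if yours differs.
val warehouse = Paths.get("/user/hive/warehouse")
if (!Files.exists(warehouse)) {
  // May need to run as root, or do a `sudo mkdir -p` plus chown beforehand.
  Files.createDirectories(warehouse)
}

If the directory already exists, also verify that your user has write permission to it, since the error reads as a failure to create foo.db under that path.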