Apache Spark: connecting a Jupyter notebook to a remote Hive

Tags: apache-spark, hadoop, hive, pyspark, anaconda

I am trying to fetch data from Hive on our company's remote server. I am using Anaconda3 (Windows 64-bit), and my Hadoop cluster runs under Ambari.

I tried something like this:

import findspark
findspark.init()
from pyspark import SparkContext, SparkConf
from pyspark.sql import HiveContext, SparkSession
sparkSession = (SparkSession.builder
                .appName('example-pyspark-read-from-hive')
                .config("hive.metastore.uri", "http://serv_ip:serv_port")
                .enableHiveSupport()
                .getOrCreate())
sparkSession.sql('show databases').show()
Is something wrong in my configuration? Or do I need to change some settings on the Hive side? The error is:


Py4JJavaError                             Traceback (most recent call last)
D:\Alanuccio\Progs\spark-2.3.0-bin-hadoop2.7\python\pyspark\sql\utils.py in deco(*a, **kw)
     62     try:
---> 63         return f(*a, **kw)
     64     except py4j.protocol.Py4JJavaError as e:

D:\Alanuccio\Progs\spark-2.3.0-bin-hadoop2.7\python\lib\py4j-0.10.6-src.zip\py4j\protocol.py in get_return_value(answer, gateway_client, target_id, name)
    319                 "An error occurred while calling {0}{1}{2}.\n".
--> 320                 format(target_id, ".", name), value)
    321             else:

Py4JJavaError: An error occurred while calling o27.sql.
: org.apache.spark.sql.AnalysisException: java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient;
    at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:106)
    at org.apache.spark.sql.hive.HiveExternalCatalog.databaseExists(HiveExternalCatalog.scala:194)
    at org.apache.spark.sql.internal.SharedState.externalCatalog$lzycompute(SharedState.scala:114)
    at org.apache.spark.sql.internal.SharedState.externalCatalog(SharedState.scala:102)
    at org.apache.spark.sql.hive.HiveSessionStateBuilder.externalCatalog(HiveSessionStateBuilder.scala:39)
    at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog$lzycompute(HiveSessionStateBuilder.scala:54)
    at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog(HiveSessionStateBuilder.scala:52)
    at org.apache.spark.sql.hive.HiveSessionStateBuilder$$anon$1.<init>(HiveSessionStateBuilder.scala:69)
    at org.apache.spark.sql.hive.HiveSessionStateBuilder.analyzer(HiveSessionStateBuilder.scala:69)
    at org.apache.spark.sql.internal.BaseSessionStateBuilder$$anonfun$build$2.apply(BaseSessionStateBuilder.scala:293)
    at org.apache.spark.sql.internal.BaseSessionStateBuilder$$anonfun$build$2.apply(BaseSessionStateBuilder.scala:293)
    at org.apache.spark.sql.internal.SessionState.analyzer$lzycompute(SessionState.scala:79)
    at org.apache.spark.sql.internal.SessionState.analyzer(SessionState.scala:79)
    at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:57)
    at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:55)
    at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:47)
    at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:74)
    at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:638)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
    at py4j.Gateway.invoke(Gateway.java:282)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:214)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
    at org.apache.spark.sql.hive.client.HiveClientImpl.newState(HiveClientImpl.scala:180)
    at org.apache.spark.sql.hive.client.HiveClientImpl.<init>
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:264)
    at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:385)
    at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:287)
    at org.apache.spark.sql.hive.HiveExternalCatalog.client$lzycompute(HiveExternalCatalog.scala:66)
    at org.apache.spark.sql.hive.HiveExternalCatalog.client(HiveExternalCatalog.scala:65)
    at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply$mcZ$sp(HiveExternalCatalog.scala:195)
    at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:195)
    at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:195)
    at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97)
    ... 28 more
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1523)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:86)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:132)
config("hive.metastore.uris","thrift://serv_ip:serv_port")