After an Apache Spark update, YARN cannot start the AM on some nodes


After updating the system to HDP 2.6.5 I ran into a problem. I have a three-node cluster and I am trying to launch a simple application in Python:

from pyspark import SparkConf, SparkContext
from pyspark.sql import SQLContext, SparkSession, HiveContext

sc = SparkContext()
print(sc.master)
The command

/usr/bin/spark-submit \
    --master yarn \
    --deploy-mode cluster \
    --name 'test script' \
    /opt/test/youdmp/test/script.py
says

Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.yarn.proto.YarnProtos$ResourceProtoOrBuilder.getMemory()I
    at org.apache.hadoop.yarn.api.records.impl.pb.ResourcePBImpl.getMemory(ResourcePBImpl.java:61)
    at org.apache.spark.deploy.yarn.Client.verifyClusterResources(Client.scala:313)
    at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:166)
    at org.apache.spark.deploy.yarn.Client.run(Client.scala:1217)
    at org.apache.spark.deploy.yarn.YarnClusterApplication.start(Client.scala:1585)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:906)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:197)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:227)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:136)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
I see this when the application starts on one node or the other, but the third node works.

In client deploy mode it says:

Traceback (most recent call last):
  File "/opt/test/youdmp/test/script.py", line 3, in <module>
    sc = SparkContext()
  File "/usr/hdp/current/spark2-client/python/lib/pyspark.zip/pyspark/context.py", line 119, in __init__
  File "/usr/hdp/current/spark2-client/python/lib/pyspark.zip/pyspark/context.py", line 181, in _do_init
  File "/usr/hdp/current/spark2-client/python/lib/pyspark.zip/pyspark/context.py", line 279, in _initialize_context
  File "/usr/hdp/current/spark2-client/python/lib/py4j-0.10.6-src.zip/py4j/java_gateway.py", line 1428, in __call__
  File "/usr/hdp/current/spark2-client/python/lib/py4j-0.10.6-src.zip/py4j/protocol.py", line 320, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: java.lang.NoSuchMethodError: org.apache.hadoop.yarn.proto.YarnProtos$ResourceProtoOrBuilder.getMemory()I
    at org.apache.hadoop.yarn.api.records.impl.pb.ResourcePBImpl.getMemory(ResourcePBImpl.java:61)
    at org.apache.spark.deploy.yarn.Client.verifyClusterResources(Client.scala:313)
    at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:166)
    at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:57)
    at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:164)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:500)
    at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
    at py4j.Gateway.invoke(Gateway.java:238)
    at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
    at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
    at py4j.GatewayConnection.run(GatewayConnection.java:214)
    at java.lang.Thread.run(Thread.java:745)
What kind of error could this be, and what should I fix?


It looks like the Application Master could not start on some nodes because certain libraries differed between them. I couldn't pin down which ones, since the nodes carried seemingly similar libraries, but in my case it was a library conflict (a sketch for comparing jars across nodes is at the end of this answer). The fix was to add the option
spark.driver.extraClassPath

In our case:
spark.driver.extraClassPath /usr/hdp/current/hadoop-yarn-client/hadoop-yarn-common.jar:/usr/hdp/current/hadoop-yarn-client/hadoop-yarn-api.jar:/usr/hdp/current/hive-server2-hive2/lib/hive-hcatalog-core.jar
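
Equivalently, the same setting can be passed per job on the spark-submit command line with --conf instead of editing spark-defaults.conf; a minimal sketch reusing the command from the question:

/usr/bin/spark-submit \
    --master yarn \
    --deploy-mode cluster \
    --name 'test script' \
    --conf spark.driver.extraClassPath=/usr/hdp/current/hadoop-yarn-client/hadoop-yarn-common.jar:/usr/hdp/current/hadoop-yarn-client/hadoop-yarn-api.jar:/usr/hdp/current/hive-server2-hive2/lib/hive-hcatalog-core.jar \
    /opt/test/youdmp/test/script.py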
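
To find out which libraries actually differ between nodes, one minimal diagnostic sketch (assuming passwordless ssh; node1, node2 and node3 are placeholder host names) is to compare checksums of the YARN client jars on each node:

# Compare jar checksums across the nodes; any line that differs between
# hosts points at a mismatched library.
for host in node1 node2 node3; do
    echo "== $host =="
    ssh "$host" 'md5sum /usr/hdp/current/hadoop-yarn-client/*.jar | sort -k2'
done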