
Apache Spark unable to connect to Hadoop/Spark

Tags: apache-spark, hadoop, yarn

I am a beginner with Spark jobs and Spark configuration.

I try to submit a Spark job, and after a few minutes (the job is accepted and runs for several minutes) it fails with a connection refused:

User class threw exception: org.apache.spark.SparkException: Job aborted due to stage failure: ShuffleMapStage 2 
I also get this error on jobs that succeed:

ERROR shuffle.RetryingBlockFetcher: Exception while beginning fetch of 1 outstanding blocks 
On my own computer the job runs fine from IntelliJ IDEA, so it is not a code error.

I have tried several changes to the configuration in yarn-site.xml and mapred-site.xml.

It is a Hadoop HDFS cluster of 3 nodes, with 2 cores and 8 GB of RAM on each node. I submit with the following command line:

spark-submit --packages org.apache.spark:spark-avro_2.11:2.4.3 --class MyClass --master yarn --deploy-mode cluster myJar.jar
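For debugging, it can help to resubmit the same job with --verbose, which makes spark-submit print the parsed arguments and the resolved Spark properties, and temporarily in client deploy mode so the driver log appears directly in the terminal instead of only in the YARN container logs. This is just a diagnostic variant of the command above, not a fix:

spark-submit --verbose \
  --packages org.apache.spark:spark-avro_2.11:2.4.3 \
  --class MyClass \
  --master yarn \
  --deploy-mode client \
  myJar.jar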
mapred-site.xml:

<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>
<property>
    <name>yarn.app.mapreduce.am.env</name>
    <value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
</property>
<property>
    <name>mapreduce.map.env</name>
    <value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
</property>
<property>
    <name>mapreduce.reduce.env</name>
    <value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
</property>
<property>
    <name>mapreduce.map.memory.mb</name>
    <value>1000</value>
</property>
<property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>1000</value>
</property>
<property>
    <name>yarn.app.mapreduce.am.resource.mb</name>
    <value>2000</value>
</property>

What exactly is it failing to connect to? An external server? The YARN NameNode? And does Spark actually use mapred-site.xml? Every node reports an error message,

Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: hostname/ip:39318

and this error:

Failed to connect to hostname/ip:39318

but the port changes every time the job runs. I launch the command on the master HDFS server, in cluster mode. I do not know whether Spark uses mapred-site.xml at all; how can I tell? Thanks a lot.
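One way to check the last point (whether mapred-site.xml is picked up at all) is to ask the Hadoop Configuration that Spark builds where a given property came from. The sketch below assumes the job is written in Scala (as the spark-avro_2.11 package suggests); ConfCheck is just a hypothetical helper, and getPropertySources is a standard method of org.apache.hadoop.conf.Configuration:

import org.apache.spark.sql.SparkSession

// Hypothetical helper, not part of the original job: prints where a Hadoop
// property was loaded from, as seen by the Spark driver.
object ConfCheck {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("conf-check").getOrCreate()
    val hadoopConf = spark.sparkContext.hadoopConfiguration

    // The value Spark sees, e.g. "yarn" if mapred-site.xml (or an override) was read
    println("mapreduce.framework.name = " + hadoopConf.get("mapreduce.framework.name"))

    // The source files the property came from, e.g. "mapred-site.xml"
    val sources = Option(hadoopConf.getPropertySources("mapreduce.framework.name"))
    println("loaded from: " + sources.map(_.mkString(", ")).getOrElse("<no source recorded>"))

    spark.stop()
  }
}

As far as I know, a Spark-on-YARN job does not need mapred-site.xml itself; the YARN-related settings it uses come from yarn-site.xml in HADOOP_CONF_DIR.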
yarn-site.xml:
<property>
    <name>yarn.acl.enable</name>
    <value>0</value>
</property>
<property>
    <name>yarn.resourcemanager.hostname</name>
    <value>ipadress</value>
</property>
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>
<property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>4000</value>
</property>
<property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>500</value>
</property>

<property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>2000</value>
</property>
spark-defaults.conf:

spark.master yarn
spark.driver.memory 1g    
spark.history.fs.update.interval  30s
spark.history.ui.port  port
spark.core.connection.ack.wait.timeout 600s
spark.default.parallelism 2
spark.executor.memory 2g
spark.cores.max 2
spark.executor.cores 2
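
Given the symptom above (Connection refused on a port that changes on every run, during shuffle block fetches), one thing worth ruling out is a firewall or security-group rule that blocks the ephemeral ports Spark picks for its driver and block managers. The sketch below is only a guess at a diagnostic, not a confirmed fix: the port numbers are arbitrary examples, while spark.driver.port, spark.blockManager.port and spark.port.maxRetries are standard Spark settings that pin those ports to a predictable range so they can be opened between all nodes.

spark.driver.port        40000
spark.blockManager.port  40010
spark.port.maxRetries    16

With these values Spark tries 40000-40016 for the driver and 40010-40026 for the block managers, so opening roughly 40000-40030 between the nodes covers the retries. If the failures persist with the ports pinned, hostname resolution between the nodes (/etc/hosts vs DNS) is the other usual suspect behind "Failed to connect to hostname/ip:port" during shuffle.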