Apache Spark: running multiple Spark applications in parallel

I'm running a YARN + Spark 2.4.5 cluster, and I submit jobs in client mode from a Docker container. I configured the host and the Docker client as follows:

yarn-site.xml:
<property>
<name>yarn.resourcemanager.scheduler.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>
<property>
<name>yarn.scheduler.fair.preemption</name>
<value>true</value>
</property>
<property>
<name>yarn.scheduler.fair.assignmultiple</name>
<value>true</value>
</property>
Each application starts with the following output:
20/04/02 13:52:08 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://namenode:4040
20/04/02 13:52:08 INFO FairSchedulableBuilder: Creating Fair Scheduler pools from /opt/spark/conf/fairscheduler.xml
20/04/02 13:52:09 INFO FairSchedulableBuilder: Created pool: default, schedulingMode: FAIR, minShare: 2, weight: 2
20/04/02 13:52:09 INFO FairSchedulableBuilder: Created pool: today, schedulingMode: FAIR, minShare: 2, weight: 2
20/04/02 13:52:09 INFO FairSchedulableBuilder: Created pool: yesterday, schedulingMode: FAIR, minShare: 2, weight: 1
20/04/02 13:52:09 INFO Utils: Using initial executors = 0, max of spark.dynamicAllocation.initialExecutors, spark.dynamicAllocation.minExecutors and spark.executor.instances
20/04/02 13:52:09 INFO RMProxy: Connecting to ResourceManager at namenode/10.128.15.208:8032
20/04/02 13:52:09 INFO Client: Requesting a new application from cluster with 2 NodeManagers
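For reference, a fairscheduler.xml that would produce the pools shown in the log above might look roughly like this (a sketch reconstructed from the log lines, not the actual file):

```xml
<?xml version="1.0"?>
<allocations>
  <!-- Pools reconstructed from the FairSchedulableBuilder log lines above -->
  <pool name="default">
    <schedulingMode>FAIR</schedulingMode>
    <weight>2</weight>
    <minShare>2</minShare>
  </pool>
  <pool name="today">
    <schedulingMode>FAIR</schedulingMode>
    <weight>2</weight>
    <minShare>2</minShare>
  </pool>
  <pool name="yesterday">
    <schedulingMode>FAIR</schedulingMode>
    <weight>1</weight>
    <minShare>2</minShare>
  </pool>
</allocations>
```

Note that these Spark pools only balance jobs *within* a single SparkContext; how resources are shared between separate applications is decided by YARN's scheduler, not by this file.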
The first application takes all available resources, and the second one only starts after the first finishes.

What am I missing?

Thanks.

EDIT: spark-defaults.conf:
spark.master                      yarn
spark.dynamicAllocation.enabled   true
spark.shuffle.service.enabled     true
spark.yarn.shuffle.stopOnFailure  false
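Note that spark.shuffle.service.enabled=true only takes effect if the external shuffle service is actually running on each NodeManager. Per the Spark-on-YARN documentation, it is registered as a YARN auxiliary service in yarn-site.xml, roughly like this (a sketch; the aux-service is what lets YARN reclaim executors while preserving their shuffle files):

```xml
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle,spark_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
  <value>org.apache.spark.network.yarn.YarnShuffleService</value>
</property>
```

The spark-<version>-yarn-shuffle.jar must also be on the NodeManager classpath, and the NodeManagers restarted, for this to work.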
I can't find the source, but as I recall, Spark needs dynamic resource allocation enabled so that YARN can reclaim executors when another application needs them. Try enabling it with the settings below (you also need to set up the external shuffle service).

Already doing that — this is my spark-defaults.conf: spark.master yarn; spark.dynamicAllocation.enabled true; spark.shuffle.service.enabled true; spark.yarn.shuffle.stopOnFailure false
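One thing worth checking (an assumption, not a confirmed fix): with dynamic allocation enabled and no upper bound, spark.dynamicAllocation.maxExecutors defaults to infinity, so the first application can still claim every container before the second one even asks. Capping it in spark-defaults.conf leaves headroom for a second application without waiting on fair-share preemption (the value 4 here is illustrative; size it to your cluster):

```
spark.dynamicAllocation.maxExecutors   4
```

With FairScheduler preemption, YARN should eventually reclaim containers anyway, but preemption only fires after the configured timeouts in the allocation file, so a hard cap gives much faster startup for the second application.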