Apache Spark executor blocked in "UserGroupInformation.doAs"
Yesterday I built a Spark cluster, and today I want to run a WordCount program on it.
My environment: jdk1.8.0_121 + scala2.10.4 + hadoop2.6.5 + spark1.6.2
Cluster: master + slave01 + slave02
Client: client
Additional environment: master, slave01, slave02 and client are all in the same LAN [master, slave01 and slave02 can log in to one another without a password], and all of them log in as root.
The demo code is as follows:
import org.apache.spark.{SparkConf, SparkContext}

object Test01WordCount {
  def main(args: Array[String]) = {
    val inputPath = "hdfs://master/970655147/input/01WordCount/"
    // 1. local mode
    // val conf = new SparkConf().setMaster("local").setAppName("WordCount")
    // 2. standalone mode
    val conf = new SparkConf().setMaster("spark://master:7077").setAppName("WordCount")
      .set("spark.executor.memory", "64M")
      .set("spark.executor.cores", "1")
    val sc = new SparkContext(conf)
    val line = sc.textFile(inputPath)
    line.foreach(println)
    sc.stop
  }
}
root@slave02:~# jps
7984 CoarseGrainedExecutorBackend
6468 NodeManager
8037 Jps
955 Worker
7981 CoarseGrainedExecutorBackend
7982 CoarseGrainedExecutorBackend
6366 DataNode
7983 CoarseGrainedExecutorBackend
root@slave02:~# ps -ef | grep 7983
root 7983 955 14 06:21 ? 00:00:03 /usr/local/ProgramFiles/jdk1.8.0_121/bin/java -cp /usr/local/ProgramFiles/spark-1.6.2-bin-hadoop2.6/conf/:/usr/local/ProgramFiles/spark-1.6.2-bin-hadoop2.6/lib/spark-assembly-1.6.2-hadoop2.6.0.jar:/usr/local/ProgramFiles/spark-1.6.2-bin-hadoop2.6/lib/datanucleus-core-3.2.10.jar:/usr/local/ProgramFiles/spark-1.6.2-bin-hadoop2.6/lib/datanucleus-api-jdo-3.2.6.jar:/usr/local/ProgramFiles/spark-1.6.2-bin-hadoop2.6/lib/datanucleus-rdbms-3.2.9.jar:/usr/local/ProgramFiles/hadoop-2.6.5/etc/hadoop/ -Xms64M -Xmx64M -Dspark.driver.port=37230 org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url spark://CoarseGrainedScheduler@192.168.0.191:37230 --executor-id 1 --hostname 192.168.0.182 --cores 1 --app-id app-20170408062155-0015 --worker-url spark://Worker@192.168.0.182:46466
root 8050 4249 4 06:22 pts/1 00:00:00 grep --color=auto 7983
root@slave02:~#
root@slave02:/usr/local/ProgramFiles/spark-1.6.2-bin-hadoop2.6# cat work/app-20170408062155-0015/0/stderr
17/04/08 06:22:20 INFO executor.CoarseGrainedExecutorBackend: Registered signal handlers for [TERM, HUP, INT]
17/04/08 06:22:24 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/04/08 06:22:28 INFO spark.SecurityManager: Changing view acls to: root
17/04/08 06:22:28 INFO spark.SecurityManager: Changing modify acls to: root
17/04/08 06:22:28 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
17/04/08 06:23:06 INFO spark.SecurityManager: Changing view acls to: root
17/04/08 06:23:06 INFO spark.SecurityManager: Changing modify acls to: root
17/04/08 06:23:06 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
17/04/08 06:23:24 INFO slf4j.Slf4jLogger: Slf4jLogger started
17/04/08 06:23:29 INFO Remoting: Starting remoting
Exception in thread "main" 17/04/08 06:23:46 INFO remote.RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
17/04/08 06:23:47 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
java.lang.reflect.UndeclaredThrowableException
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1643)
at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:68)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:151)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:253)
at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [10000 milliseconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
at scala.concurrent.Await$.result(package.scala:107)
at akka.remote.Remoting.start(Remoting.scala:179)
at akka.remote.RemoteActorRefProvider.init(RemoteActorRefProvider.scala:184)
at akka.actor.ActorSystemImpl.liftedTree2$1(ActorSystem.scala:620)
at akka.actor.ActorSystemImpl._start$lzycompute(ActorSystem.scala:617)
at akka.actor.ActorSystemImpl._start(ActorSystem.scala:617)
at akka.actor.ActorSystemImpl.start(ActorSystem.scala:634)
at akka.actor.ActorSystem$.apply(ActorSystem.scala:142)
at akka.actor.ActorSystem$.apply(ActorSystem.scala:119)
at org.apache.spark.util.AkkaUtils$.org$apache$spark$util$AkkaUtils$$doCreateActorSystem(AkkaUtils.scala:121)
at org.apache.spark.util.AkkaUtils$$anonfun$1.apply(AkkaUtils.scala:53)
at org.apache.spark.util.AkkaUtils$$anonfun$1.apply(AkkaUtils.scala:52)
at org.apache.spark.util.Utils$$anonfun$startServiceOnPort$1.apply$mcVI$sp(Utils.scala:2024)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
at org.apache.spark.util.Utils$.startServiceOnPort(Utils.scala:2015)
at org.apache.spark.util.AkkaUtils$.createActorSystem(AkkaUtils.scala:55)
at org.apache.spark.SparkEnv$.create(SparkEnv.scala:266)
at org.apache.spark.SparkEnv$.createExecutorEnv(SparkEnv.scala:217)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:186)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:69)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:68)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
... 4 more
root@slave02:/usr/local/ProgramFiles/spark-1.6.2-bin-hadoop2.6# cat work/app-20170408062155-0015/0/stdout
root@slave02:/usr/local/ProgramFiles/spark-1.6.2-bin-hadoop2.6#
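The driver-side console output on the client: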
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
17/04/08 21:21:44 INFO SparkContext: Running Spark version 1.6.2
17/04/08 21:21:45 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/04/08 21:21:45 INFO SecurityManager: Changing view acls to: root
17/04/08 21:21:45 INFO SecurityManager: Changing modify acls to: root
17/04/08 21:21:45 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
17/04/08 21:21:46 INFO Utils: Successfully started service 'sparkDriver' on port 37230.
17/04/08 21:21:47 INFO Slf4jLogger: Slf4jLogger started
17/04/08 21:21:47 INFO Remoting: Starting remoting
17/04/08 21:21:48 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem@192.168.0.191:43974]
17/04/08 21:21:48 INFO Utils: Successfully started service 'sparkDriverActorSystem' on port 43974.
17/04/08 21:21:48 INFO SparkEnv: Registering MapOutputTracker
17/04/08 21:21:48 INFO SparkEnv: Registering BlockManagerMaster
17/04/08 21:21:48 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-ef79b656-b7f4-4cb3-be3e-0f8bb61baa9d
17/04/08 21:21:48 INFO MemoryStore: MemoryStore started with capacity 431.3 MB
17/04/08 21:21:48 INFO SparkEnv: Registering OutputCommitCoordinator
17/04/08 21:21:54 INFO Utils: Successfully started service 'SparkUI' on port 4040.
17/04/08 21:21:54 INFO SparkUI: Started SparkUI at http://192.168.0.191:4040
17/04/08 21:21:54 INFO AppClient$ClientEndpoint: Connecting to master spark://master:7077...
17/04/08 21:21:55 INFO SparkDeploySchedulerBackend: Connected to Spark cluster with app ID app-20170408062155-0015
17/04/08 21:21:55 INFO AppClient$ClientEndpoint: Executor added: app-20170408062155-0015/0 on worker-20170408024004-192.168.0.182-46466 (192.168.0.182:46466) with 1 cores
17/04/08 21:21:55 INFO SparkDeploySchedulerBackend: Granted executor ID app-20170408062155-0015/0 on hostPort 192.168.0.182:46466 with 1 cores, 64.0 MB RAM
17/04/08 21:21:55 INFO AppClient$ClientEndpoint: Executor added: app-20170408062155-0015/1 on worker-20170408024004-192.168.0.182-46466 (192.168.0.182:46466) with 1 cores
17/04/08 21:21:55 INFO SparkDeploySchedulerBackend: Granted executor ID app-20170408062155-0015/1 on hostPort 192.168.0.182:46466 with 1 cores, 64.0 MB RAM
17/04/08 21:21:55 INFO AppClient$ClientEndpoint: Executor added: app-20170408062155-0015/2 on worker-20170408024004-192.168.0.182-46466 (192.168.0.182:46466) with 1 cores
17/04/08 21:21:55 INFO SparkDeploySchedulerBackend: Granted executor ID app-20170408062155-0015/2 on hostPort 192.168.0.182:46466 with 1 cores, 64.0 MB RAM
17/04/08 21:21:55 INFO AppClient$ClientEndpoint: Executor added: app-20170408062155-0015/3 on worker-20170408024004-192.168.0.182-46466 (192.168.0.182:46466) with 1 cores
17/04/08 21:21:55 INFO SparkDeploySchedulerBackend: Granted executor ID app-20170408062155-0015/3 on hostPort 192.168.0.182:46466 with 1 cores, 64.0 MB RAM
17/04/08 21:21:55 INFO AppClient$ClientEndpoint: Executor added: app-20170408062155-0015/4 on worker-20170408024003-192.168.0.181-45183 (192.168.0.181:45183) with 1 cores
17/04/08 21:21:55 INFO SparkDeploySchedulerBackend: Granted executor ID app-20170408062155-0015/4 on hostPort 192.168.0.181:45183 with 1 cores, 64.0 MB RAM
17/04/08 21:21:55 INFO AppClient$ClientEndpoint: Executor added: app-20170408062155-0015/5 on worker-20170408024003-192.168.0.181-45183 (192.168.0.181:45183) with 1 cores
17/04/08 21:21:55 INFO SparkDeploySchedulerBackend: Granted executor ID app-20170408062155-0015/5 on hostPort 192.168.0.181:45183 with 1 cores, 64.0 MB RAM
17/04/08 21:21:55 INFO AppClient$ClientEndpoint: Executor added: app-20170408062155-0015/6 on worker-20170408024003-192.168.0.181-45183 (192.168.0.181:45183) with 1 cores
17/04/08 21:21:55 INFO SparkDeploySchedulerBackend: Granted executor ID app-20170408062155-0015/6 on hostPort 192.168.0.181:45183 with 1 cores, 64.0 MB RAM
17/04/08 21:21:55 INFO AppClient$ClientEndpoint: Executor added: app-20170408062155-0015/7 on worker-20170408024003-192.168.0.181-45183 (192.168.0.181:45183) with 1 cores
17/04/08 21:21:55 INFO SparkDeploySchedulerBackend: Granted executor ID app-20170408062155-0015/7 on hostPort 192.168.0.181:45183 with 1 cores, 64.0 MB RAM
17/04/08 21:21:55 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 42255.
17/04/08 21:21:55 INFO NettyBlockTransferService: Server created on 42255
17/04/08 21:21:56 INFO BlockManagerMaster: Trying to register BlockManager
17/04/08 21:21:57 INFO BlockManagerMasterEndpoint: Registering block manager 192.168.0.191:42255 with 431.3 MB RAM, BlockManagerId(driver, 192.168.0.191, 42255)
17/04/08 21:21:57 INFO BlockManagerMaster: Registered BlockManager
17/04/08 21:21:58 INFO AppClient$ClientEndpoint: Executor updated: app-20170408062155-0015/0 is now RUNNING
17/04/08 21:21:58 INFO AppClient$ClientEndpoint: Executor updated: app-20170408062155-0015/1 is now RUNNING
17/04/08 21:21:58 INFO AppClient$ClientEndpoint: Executor updated: app-20170408062155-0015/2 is now RUNNING
17/04/08 21:21:58 INFO AppClient$ClientEndpoint: Executor updated: app-20170408062155-0015/3 is now RUNNING
17/04/08 21:22:00 INFO AppClient$ClientEndpoint: Executor updated: app-20170408062155-0015/4 is now RUNNING
17/04/08 21:22:01 INFO AppClient$ClientEndpoint: Executor updated: app-20170408062155-0015/5 is now RUNNING
17/04/08 21:22:01 INFO AppClient$ClientEndpoint: Executor updated: app-20170408062155-0015/6 is now RUNNING
17/04/08 21:22:01 INFO AppClient$ClientEndpoint: Executor updated: app-20170408062155-0015/7 is now RUNNING
17/04/08 21:22:03 INFO SparkDeploySchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
17/04/08 21:22:05 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 107.7 KB, free 107.7 KB)
17/04/08 21:22:06 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 9.8 KB, free 117.5 KB)
17/04/08 21:22:06 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.0.191:42255 (size: 9.8 KB, free: 431.2 MB)
17/04/08 21:22:06 INFO SparkContext: Created broadcast 0 from textFile at Test01WordCount.scala:30
17/04/08 21:22:21 INFO FileInputFormat: Total input paths to process : 1
17/04/08 21:22:21 INFO SparkContext: Starting job: foreach at Test01WordCount.scala:33
17/04/08 21:22:21 INFO DAGScheduler: Got job 0 (foreach at Test01WordCount.scala:33) with 2 output partitions
17/04/08 21:22:21 INFO DAGScheduler: Final stage: ResultStage 0 (foreach at Test01WordCount.scala:33)
17/04/08 21:22:21 INFO DAGScheduler: Parents of final stage: List()
17/04/08 21:22:21 INFO DAGScheduler: Missing parents: List()
17/04/08 21:22:21 INFO DAGScheduler: Submitting ResultStage 0 (hdfs://master/970655147/input/01WordCount/ MapPartitionsRDD[1] at textFile at Test01WordCount.scala:30), which has no missing parents
17/04/08 21:22:21 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 3.0 KB, free 120.5 KB)
17/04/08 21:22:21 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 1842.0 B, free 122.3 KB)
17/04/08 21:22:21 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on 192.168.0.191:42255 (size: 1842.0 B, free: 431.2 MB)
17/04/08 21:22:21 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:1006
17/04/08 21:22:21 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 0 (hdfs://master/970655147/input/01WordCount/ MapPartitionsRDD[1] at textFile at Test01WordCount.scala:30)
17/04/08 21:22:21 INFO TaskSchedulerImpl: Adding task set 0.0 with 2 tasks
17/04/08 21:22:36 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
17/04/08 21:23:04 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
17/04/08 21:23:06 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
17/04/08 21:23:21 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
17/04/08 21:23:36 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
17/04/08 21:23:51 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
17/04/08 21:24:02 INFO AppClient$ClientEndpoint: Executor updated: app-20170408062155-0015/1 is now EXITED (Command exited with code 1)
17/04/08 21:24:02 INFO SparkDeploySchedulerBackend: Executor app-20170408062155-0015/1 removed: Command exited with code 1
17/04/08 21:24:06 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
17/04/08 21:24:21 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
17/04/08 21:24:36 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
17/04/08 21:24:51 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
17/04/08 21:25:06 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
17/04/08 21:25:21 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
17/04/08 21:25:36 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
17/04/08 21:25:51 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
17/04/08 21:26:02 WARN NettyRpcEndpointRef: Error sending message [message = RemoveExecutor(1,Command exited with code 1)] in 1 attempts
org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [120 seconds]. This timeout is controlled by spark.rpc.askTimeout
at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcTimeout.scala:48)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:63)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:33)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:76)
at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:101)
at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:77)
at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.removeExecutor(CoarseGrainedSchedulerBackend.scala:370)
at org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend.executorRemoved(SparkDeploySchedulerBackend.scala:144)
at org.apache.spark.deploy.client.AppClient$ClientEndpoint$$anonfun$receive$1.applyOrElse(AppClient.scala:184)
at org.apache.spark.rpc.netty.Inbox$$anonfun$process$1.apply$mcV$sp(Inbox.scala:116)
at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:204)
at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:100)
at org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:215)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [120 seconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
at scala.concurrent.Await$.result(package.scala:107)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
... 12 more
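Related posts I found: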
http://apache-spark-user-list.1001560.n3.nabble.com/Submitting-Spark-job-on-Unix-cluster-from-dev-environment-Windows-td16989.html
https://issues.streamsets.com/browse/SDC-4249
https://community.cloudera.com/t5/Advanced-Analytics-Apache-Spark/Unable-to-create-SparkContext-to-Spark-1-3-Standalone-service-in/td-p/29176
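Then I checked the firewall and the network between the nodes.
1. ufw status in master / slave01 / slave02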
root@master:/usr/local/ProgramFiles# ufw status
Status: inactive
root@master:/usr/local/ProgramFiles# ssh slave01
Welcome to Ubuntu 16.04.2 LTS (GNU/Linux 4.4.0-62-generic x86_64)
* Documentation: https://help.ubuntu.com
* Management: https://landscape.canonical.com
* Support: https://ubuntu.com/advantage
Last login: Sat Apr 8 21:33:44 2017 from 192.168.0.119
root@slave01:~# ufw status
Status: inactive
root@slave01:~# ssh slave02
Welcome to Ubuntu 16.04.2 LTS (GNU/Linux 4.4.0-62-generic x86_64)
* Documentation: https://help.ubuntu.com
* Management: https://landscape.canonical.com
* Support: https://ubuntu.com/advantage
Last login: Sat Apr 8 21:10:33 2017 from 192.168.0.119
root@slave02:~# ufw status
Status: inactive
root@slave02:~#
2.1. nc in master
root@master:/usr/local/ProgramFiles# netcat -l 12306
root@master:/usr/local/ProgramFiles# nc -l 12306
root@master:/usr/local/ProgramFiles# nc -l 12306
root@master:/usr/local/ProgramFiles# nc -l 12306
2.2. nc in slave01
root@slave01:~# nc -vz 192.168.0.180 12306
Connection to 192.168.0.180 12306 port [tcp/*] succeeded!
root@slave01:~# nc -vz master 12306
Connection to master 12306 port [tcp/*] succeeded!
2.3. nc in slave02
root@slave02:/usr/local/ProgramFiles# nc -vz 192.168.0.180 12306
Connection to 192.168.0.180 12306 port [tcp/*] succeeded!
root@slave02:/usr/local/ProgramFiles# nc -vz master 12306
Connection to master 12306 port [tcp/*] succeeded!
root@slave02:/usr/local/ProgramFiles#
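(Judging from the ps output above, the executors also have to connect back to the driver at spark.driver.port, 192.168.0.191:37230 in this run, so a check such as "nc -vz 192.168.0.191 37230" from slave01 and slave02 would exercise that direction as well.)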
I reinstalled other versions of Spark, but the same problem persisted, so there may be something wrong with the environment.
Please give me some advice, thanks.
Today I wanted to change the Java and Scala environment. I found a post whose author had built Spark with jdk1.7.0_80 and scala2.11.8, so I downloaded jdk1.7.0_40 and scala2.11.8 and installed them on my cluster [master, slave01, slave02].
I updated the environment variables in Hadoop's hadoop-env.sh and Spark's spark-env.sh, then stopped Spark and Hadoop and started Hadoop and Spark again.
Then I ran the Spark shell with "./bin/spark-shell --master spark://master:7077 --executor-memory 64M".
The Spark shell still did not run normally, but the log looked different from before, so I checked the executor log and saw:
Exception in thread "main" java.lang.IllegalArgumentException: System memory 64880640 must be at least 4.718592E8. Please use a larger heap size.
at org.apache.spark.memory.UnifiedMemoryManager$.getMaxMemory(UnifiedMemoryManager.scala:198)
at org.apache.spark.memory.UnifiedMemoryManager$.apply(UnifiedMemoryManager.scala:180)
at org.apache.spark.SparkEnv$.create(SparkEnv.scala:354)
at org.apache.spark.SparkEnv$.createExecutorEnv(SparkEnv.scala:217)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:186)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:69)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:68)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:68)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:151)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:253)
at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
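The 4.718592E8 in this message seems to come from Spark 1.6's UnifiedMemoryManager, which reserves 300 MB for the system and refuses to start with a heap smaller than 1.5x that reservation. A minimal sketch of the arithmetic (the val names are illustrative, not Spark's):

val reservedMemory = 300L * 1024 * 1024    // 314572800 bytes reserved by Spark
val minSystemMemory = reservedMemory * 1.5 // 4.718592E8 bytes, about 450 MB

So the 64 MB heap (64880640 bytes) set by "--executor-memory 64M" is rejected before the executor can even start.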
This time it failed at "CoarseGrainedExecutorBackend.scala:186", and judging from the error log it no longer seems to be blocked at "UserGroupInformation.doAs(UserGroupInformation.java:1643)".
I changed "--executor-memory" to "512M"
and re-ran the spark-shell command.
This time the shell started successfully, and then I tried to run WordCount on the cluster.
First I updated the '.set("spark.executor.memory", "64M")' in the demo code to "512M",
then rebuilt the jar, copied it to the cluster, and it ran normally.
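For reference, the change amounts to this (a minimal sketch; only the memory value differs from the demo code above):

import org.apache.spark.SparkConf

val conf = new SparkConf().setMaster("spark://master:7077").setAppName("WordCount")
  .set("spark.executor.memory", "512M") // was "64M", below the ~450 MB minimum
  .set("spark.executor.cores", "1")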
Got it, so the problem seemed to be solved. Then I wanted to find out why it had happened: was the cause in the JDK or in Scala? I tested switching the environment variables in Spark's spark-env.sh and Hadoop's hadoop-env.sh between 'jdk1.8.0_121 & scala2.10.4' and 'jdk1.7.0_40 & scala2.11.8', but now both environments worked for both spark-shell and WordCount. Even after I removed 'jdk1.7.0_40 & scala2.11.8' and restored every configuration to the state it was in when I first hit the problem, it still worked. Oh my god, what a mysterious problem... Even though I never found the original cause, I am still satisfied; at least the cluster works now.
Thanks, @Kaushal
WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
means that your cluster has no free resources to launch the application. Open the cluster UI and check whether any other application is running and using up all of the cluster's resources. @Kaushal Thanks for your reply. In my cluster there is only this one application [I am just testing], so that rules out the most obvious cause; the deeper cause lies somewhere in the following steps: 1. the master schedules executors for the driver; 2. the workers start the executors for the driver; 3. the executors register with the driver [at this stage the executors were blocked in 'CoarseGrainedExecutorBackend$.main']; 4. the driver schedules the executors to run the program, and so on. Are you sure your hdfs URL "hdfs://master/970655147/input/01WordCount/"
is correct, or do you need to specify the hdfs port? Well, yes, I use the default port; running locally, the path can be read by the Spark program, and by the Hadoop programs too. @Kaushal Hi, thanks. This problem is hard to solve, please see the comments below.
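If the NameNode port did need to be given explicitly, the path would look something like the line below (a sketch assuming the stock Hadoop 2.x fs.defaultFS port 9000; core-site.xml is not shown here, so the port is an assumption):

val inputPath = "hdfs://master:9000/970655147/input/01WordCount/"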