Apache Spark: unable to connect to the slaves in a Spark cluster
I want to create a Spark standalone cluster. I have two workstations and a laptop, all running Ubuntu. Each system has a different user name. I followed this blog. I edited the hosts file:
sudo gedit /etc/hosts
10.8.9.13 master
10.8.19.23 slave01
10.8.5.158 slave02
user-name of Master: lab
user-name of Slave01: lab-zero
user-name of Slave02: computer
I also generated a key pair with ssh-keygen -t rsa and added it to the .ssh/authorized_keys file.
So when I ssh into the two machines, I can log in without a password.
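As a rough sketch of the key setup described here, the snippet below prints the commands that would install the master's public key under each slave's own account. The user names and host aliases come from the question; the key path is an assumption (the usual ssh-keygen default for RSA keys), and `print_key_setup` is a hypothetical helper name. It only prints the commands so they can be reviewed before being run by hand.

```shell
#!/usr/bin/env bash
# Remote accounts taken from the user names and host aliases listed in the question.
TARGETS="lab-zero@slave01 computer@slave02"
# Assumed key location (the ssh-keygen default for RSA keys).
KEY_FILE="${KEY_FILE:-$HOME/.ssh/id_rsa}"

# Hypothetical helper: print the commands instead of running them.
print_key_setup() {
  echo "ssh-keygen -t rsa -f $KEY_FILE"
  for target in $TARGETS; do
    # ssh-copy-id appends the public key to the remote user's authorized_keys.
    echo "ssh-copy-id -i $KEY_FILE.pub $target"
  done
}

print_key_setup
```

The point of the sketch is that the key has to land in the authorized_keys file of the remote user (lab-zero, computer), not only in the local one.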
But when I run ./start-all.sh, I get:
lab@slave02's password: lab@slave01's password: localhost: starting org.apache.spark.deploy.worker.Worker, logging to /home/lab/Downloads/spark-2.1.1-bin-hadoop2.7/logs/spark-acs-lab-rg.apache.spark.deploy.worker.Worker-1-M1.out
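It hangs here, and for both slaves it uses my default user name lab rather than the remote hosts' user names (in this case, the slaves' user names lab-zero and computer).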
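One possible way to make ssh pick the right remote account, sketched under the assumption that the launch scripts reach the slaves through the ordinary ssh client (which reads ~/.ssh/config), is to map each slave to its user name. The function name `write_ssh_users` is hypothetical; the hosts, IPs and user names are the ones listed in the question.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append per-host user entries to an ssh client config file.
# Each Host line lists both the alias and the IP, so the entry matches
# whichever form is used in the slaves file.
write_ssh_users() {
  config_file="$1"
  cat >> "$config_file" <<'EOF'
Host slave01 10.8.19.23
    User lab-zero

Host slave02 10.8.5.158
    User computer
EOF
  # ssh expects the config file not to be writable by other users.
  chmod 600 "$config_file"
}

# Usage on the master (review before running):
#   write_ssh_users "$HOME/.ssh/config"
#   ssh slave01 whoami   # should now log in as lab-zero without naming the user
```

With entries like these, `ssh slave01` no longer falls back to the local user name lab. Whether the remote Spark directory is found is a separate matter, since the install path may differ per account.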
When I check the Spark Master UI, it gives me an error:
The requested URL could not be retrieved
When I run ./stop-slaves.sh, it also returns:
no org.apache.spark.deploy.worker.Worker to stop
Here is my worker log:
17/11/30 01:53:40 INFO Worker: Retrying connection to master (attempt # 16)
17/11/30 01:53:40 INFO Worker: Connecting to master 10.8.9.13:7077...
17/11/30 01:53:40 WARN Worker: Failed to connect to master 10.8.9.13:7077
org.apache.spark.SparkException: Exception thrown in awaitResult
at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:77)
at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:75)
at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
at scala.PartialFunction$OrElse.apply(PartialFunction.scala:167)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:83)
at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:100)
at org.apache.spark.rpc.RpcEnv.setupEndpointRef(RpcEnv.scala:108)
at org.apache.spark.deploy.worker.Worker$$anonfun$org$apache$spark$deploy$worker$Worker$$tryRegisterAllMasters$1$$anon$1.run(Worker.scala:218)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: java.io.EOFException
at java.io.DataInputStream.readFully(DataInputStream.java:197)
at java.io.DataInputStream.readUTF(DataInputStream.java:609)
at java.io.DataInputStream.readUTF(DataInputStream.java:564)
at org.apache.spark.rpc.netty.RequestMessage$.readRpcAddress(NettyRpcEnv.scala:582)
at org.apache.spark.rpc.netty.RequestMessage$.apply(NettyRpcEnv.scala:592)
at org.apache.spark.rpc.netty.NettyRpcHandler.internalReceive(NettyRpcEnv.scala:651)
at org.apache.spark.rpc.netty.NettyRpcHandler.receive(NettyRpcEnv.scala:636)
at org.apache.spark.network.server.TransportRequestHandler.processRpcRequest(TransportRequestHandler.java:157)
at org.apache.spark.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:105)
at org.apache.spark.network.server.TransportChannelHandler.channelRead(TransportChannelHandler.java:118)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:287)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:85)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:643)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:566)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:480)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:442)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
at java.lang.Thread.run(Thread.java:748)
at org.apache.spark.network.client.TransportResponseHandler.handle(TransportResponseHandler.java:189)
at org.apache.spark.network.server.TransportChannelHandler.channelRead(TransportChannelHandler.java:120)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:367)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:353)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:346)
at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:266)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:367)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:353)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:346)
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:367)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:353)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:346)
at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:85)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:367)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:353)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:346)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:367)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:353)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:652)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:575)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:489)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:451)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:140)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
... 1 more
17/11/30 01:54:43 ERROR Worker: All masters are unresponsive! Giving up.
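The paths quoted in this thread name two different Spark builds: spark-2.1.1-bin-hadoop2.7 in the start-all.sh output and spark-2.0.2-bin-hadoop2.7 in the spark-env.sh path. Whether that explains the EOFException above is only a guess, but a master and a worker from different releases may fail to talk to each other, so it may be worth ruling out. A minimal sketch that compares the release numbers embedded in two install directory names (both function names are hypothetical):

```shell
#!/usr/bin/env bash
# Extract the release number from a directory name such as spark-2.1.1-bin-hadoop2.7.
spark_version_from_path() {
  basename "$1" | sed -n 's/^spark-\([0-9][0-9.]*\)-bin-.*$/\1/p'
}

# Succeed only when both directories carry the same release number.
same_spark_version() {
  [ "$(spark_version_from_path "$1")" = "$(spark_version_from_path "$2")" ]
}

# Directory names as they appear in the thread; which machine holds which
# build is not stated there, so treat this pairing as an assumption.
dir_a="/home/lab/Downloads/spark-2.1.1-bin-hadoop2.7"
dir_b="$HOME/spark-2.0.2-bin-hadoop2.7"

if same_spark_version "$dir_a" "$dir_b"; then
  echo "same release"
else
  echo "different releases: $(spark_version_from_path "$dir_a") vs $(spark_version_from_path "$dir_b")"
fi
```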
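- Run ./sbin/start-master.sh on the master host and check the UI; the default port is 8080.
- Run ./sbin/start-slave.sh spark://10.8.9.13:7077 on each slave host.
For more information, see the slaves file. The slaves file contains the following: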
# A Spark Worker will be started on each of the machines listed below.
10.8.9.13
10.8.19.23
10.8.5.158
I also added the master IP address to the ~/spark-2.0.2-bin-hadoop2.7/conf/spark-env.sh file:
export SPARK_MASTER_HOST=10.8.9.13
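The manual start sequence suggested in the answer can be sketched as follows. The master IP is the one from the question and 7077 is the port shown in the worker log; the SPARK_HOME default is an assumption and should point at the real install directory on each machine. `master_url` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Assumed install location; adjust to the real directory on each machine.
SPARK_HOME="${SPARK_HOME:-$HOME/spark-2.1.1-bin-hadoop2.7}"
MASTER_IP="10.8.9.13"
MASTER_PORT="7077"

# Hypothetical helper: build the URL that workers register with.
master_url() {
  printf 'spark://%s:%s\n' "$MASTER_IP" "$MASTER_PORT"
}

# On the master host:
#   "$SPARK_HOME/sbin/start-master.sh"
# On each slave host, logged in as that machine's own user:
#   "$SPARK_HOME/sbin/start-slave.sh" "$(master_url)"
echo "workers should register with $(master_url)"
```

Starting each worker by hand on its own machine sidesteps the ssh user-name problem, because no remote login from the master is involved.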
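When I run ./sbin/start-master.sh and check availability in the UI, I see the master at spark://M1:7077/. After that, I run ./sbin/start-slave.sh spark://10.8.9.13:7077, but the master does not show any workers as alive. I also added SPARK_MASTER_HOST=your_host_ip and SPARK_LOCAL_IP=your_host_ip to my conf/spark-env.sh, but it does not seem to be working.

@abhinavchoudhury Have you tried running ./sbin/start-slave.sh spark://M1:7077 instead of ./sbin/start-slave.sh spark://10.8.9.13:7077?

Yes, I did, but it still shows nothing on the master web UI. I also checked the logs on my slave. It gives an error that connecting to master M1:7077 failed.

Does the /etc/hosts file contain the master IP and the slave IPs?

Should the hosts files on all slaves also be the same as the master's hosts file? Should the system's hostname (M1) and the name in the hosts file be the same? In my case the hostname is M1, and I have written 10.8.9.13 master in the hosts file.

Try running netstat -nlp | grep 7077 on the master node to see whether the master is listening on port 7077. If it is, run telnet M1 7077 from a slave node to see whether port 7077 on the master is reachable.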
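Where telnet is not installed, the same reachability check can be sketched with bash's built-in /dev/tcp redirection. This assumes bash and the coreutils `timeout` command are available on the slave; `port_open` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if a TCP connection to host:port can be opened
# within 3 seconds, fail otherwise.
port_open() {
  host="$1"
  port="$2"
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}

# Usage from a slave node (host names as used in the thread):
#   port_open M1 7077 && echo "master port reachable" || echo "cannot reach master port"
#   port_open 10.8.9.13 7077 && echo "reachable by IP" || echo "cannot reach by IP"
```

If the check succeeds by IP but fails by name, the /etc/hosts entries on the slave are a likely place to look.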