Invoking spark-ec2 from within an EC2 instance: SSH connection to host refused


To run the AMPLab training exercises, I created a key pair in us-east-1, installed the training scripts (git clone git://github.com/amplab/training-scripts.git -b ampcamp4), and set the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables following the instructions.

Running

 ./spark-ec2 -i ~/.ssh/myspark.pem -r us-east-1  -k myspark --copy launch try1
produces the following output:

 johndoe@ip-some-instance:~/projects/spark/training-scripts$ ./spark-ec2 -i ~/.ssh/myspark.pem -r us-east-1  -k myspark --copy launch try1
 Setting up security groups...
 Searching for existing cluster try1...
 Latest Spark AMI: ami-19474270
 Launching instances...
 Launched 5 slaves in us-east-1b, regid = r-0c5e5ee3
 Launched master in us-east-1b, regid = r-316060de
 Waiting for instances to start up...
 Waiting 120 more seconds...
 Copying SSH key /home/johndoe/.ssh/myspark.pem to master...
 ssh: connect to host ec2-54-90-57-174.compute-1.amazonaws.com port 22: Connection refused
 Error connecting to host Command 'ssh -t -o StrictHostKeyChecking=no -i /home/johndoe/.ssh/myspark.pem root@ec2-54-90-57-174.compute-1.amazonaws.com 'mkdir -p ~/.ssh'' returned  non-zero exit status 255, sleeping 30
 ssh: connect to host ec2-54-90-57-174.compute-1.amazonaws.com port 22: Connection refused
 Error connecting to host Command 'ssh -t -o StrictHostKeyChecking=no -i /home/johndoe/.ssh/myspark.pem root@ec2-54-90-57-174.compute-1.amazonaws.com 'mkdir -p ~/.ssh'' returned non-zero exit status 255, sleeping 30
 ...
 ...
 subprocess.CalledProcessError: Command 'ssh -t -o StrictHostKeyChecking=no -i /home/johndoe/.ssh/myspark.pem root@ec2-54-90-57-174.compute-1.amazonaws.com '/root/spark/bin/stop-all.sh'' returned non-zero exit status 127
where root@ec2-54-90-57-174.compute-1.amazonaws.com is the user and the master instance. I tried -u ec2-user and increased -w all the way up to 600, but I get the same error.

When I log in to the AWS console I can see the master and slave instances in us-east-1, and I can in fact ssh into the master from the 'local' ip-some-instance shell.

My understanding is that the spark-ec2 script takes care of defining the master/slave security groups (which ports are listened on, and so forth), and that I shouldn't have to tweak those settings. That said, both the master and the slaves do listen on port 22 (Port: 22, Protocol: tcp, Source: 0.0.0.0/0 in the ampcamp3-slaves/master groups).
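As a quick way to check that rule from the launching instance, a small TCP probe (a hypothetical helper, not part of the training scripts; substitute the real master hostname) can distinguish an active "connection refused" (the host is reachable but sshd is not listening yet) from a silent timeout (which would instead point at a security-group or firewall problem):

```python
import socket

def check_port(host, port=22, timeout=5):
    """Probe a TCP port and classify the result (hypothetical diagnostic helper)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"                 # something is accepting connections
    except ConnectionRefusedError:
        return "refused"                  # host reachable, nothing listening on the port
    except OSError:
        return "filtered/unreachable"     # timed out or dropped: check security groups
```

A "refused" result matches the log above: the security group is passing traffic, but sshd inside the instance has not started yet.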


I'm at a loss here, and would be grateful for any pointers before I burn through all my R&D money on EC2 instances... Thanks.

This is most likely caused by SSH taking a long time to start up on the instances, causing the 120-second timeout to expire before the machines could be logged into. You should be able to run

./spark-ec2 -i ~/.ssh/myspark.pem -r us-east-1  -k myspark --copy launch --resume try1

(with the --resume flag) to pick up from where the script left off, without relaunching new instances. This issue will be fixed in Spark 1.2.0, where we have a new mechanism that intelligently checks the SSH status rather than relying on a fixed timeout. We are also addressing the root cause of the long SSH startup delays by building new AMIs.
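That smarter wait can be sketched roughly like this (a simplified, hypothetical illustration of the idea, not the actual spark-ec2 code): keep retrying a TCP connection to the master's port 22 and only proceed once sshd accepts it, instead of sleeping for a fixed --wait interval:

```python
import socket
import time

def wait_for_ssh(host, port=22, retries=30, delay=10, timeout=5):
    """Poll until the host accepts TCP connections on the SSH port.

    Simplified sketch of an 'ssh-ready' check; parameter names are made up.
    """
    for attempt in range(1, retries + 1):
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True          # sshd is up; safe to run setup commands
        except OSError:
            time.sleep(delay)        # not ready yet; back off and retry
    return False                     # gave up after retries attempts
```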

Thanks. Still can't access the cluster; maybe I just need to wait longer (it's been 1.5 hours already, so I assume something else is going on).
Exception while opening url http://ec2-54-90-57-174.compute-1.amazonaws.com:8080/json Error connecting to host Command 'ssh -t -o StrictHostKeyChecking=no -i /home/johndoe/.ssh/myspark.pem root@ec2-54-90-57-174.compute-1.amazonaws.com '/root/spark/bin/stop-all.sh'' returned non-zero exit status 127, sleeping 30
Which version of Spark are you using? Spark 1.2.0 (released December 18, 2014) actually no longer supports the --wait flag, and automatically waits for instances to reach an "ssh-ready" state.