What is the correct setup to connect remotely to a Spark cluster on AWS?

Tags: amazon-web-services, apache-spark

I have a Spark cluster on AWS with all ports open, and my driver program runs locally on my laptop. I get the trace below and I don't know how to fix it. I believe my driver can connect to the master, thanks to the following settings:

export SPARK_PUBLIC_DNS="52.44.36.224"
export SPARK_WORKER_CORES=6
However, the driver running locally on my laptop probably cannot connect directly to the workers/executors, because the private IPs the AWS Spark cluster reports are not reachable from outside AWS, so I am not sure how to fix this.
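From what I read in the docs, the address and ports the driver advertises to the cluster are controlled by spark.driver.host, spark.driver.port and spark.blockManager.port. A minimal sketch of what I assume the submission would have to look like (MY_PUBLIC_ADDRESS, the class name and the jar are placeholders, and the port numbers are just examples that would also have to be reachable from the workers):

# Sketch only -- placeholder address, ports, class and jar
./bin/spark-submit \
  --master spark://52.44.36.224:7077 \
  --conf spark.driver.host=MY_PUBLIC_ADDRESS \
  --conf spark.driver.port=55091 \
  --conf spark.blockManager.port=55093 \
  --class Consumer \
  my-app.jar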

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
16/09/04 01:22:17 INFO SparkContext: Running Spark version 2.0.0
16/09/04 01:22:17 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/09/04 01:22:18 INFO SecurityManager: Changing view acls to: kantkodali
16/09/04 01:22:18 INFO SecurityManager: Changing modify acls to: kantkodali
16/09/04 01:22:18 INFO SecurityManager: Changing view acls groups to: 
16/09/04 01:22:18 INFO SecurityManager: Changing modify acls groups to: 
16/09/04 01:22:18 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(kantkodali); groups with view permissions: Set(); users  with modify permissions: Set(kantkodali); groups with modify permissions: Set()
16/09/04 01:22:18 INFO Utils: Successfully started service 'sparkDriver' on port 55091.
16/09/04 01:22:18 INFO SparkEnv: Registering MapOutputTracker
16/09/04 01:22:18 INFO SparkEnv: Registering BlockManagerMaster
16/09/04 01:22:18 INFO DiskBlockManager: Created local directory at /private/var/folders/_6/lfxt933j3bd_xhq0m7dwm8s00000gn/T/blockmgr-cc8a4985-f9c0-4c1a-b17f-876e146cbd87
16/09/04 01:22:18 INFO MemoryStore: MemoryStore started with capacity 2004.6 MB
16/09/04 01:22:18 INFO SparkEnv: Registering OutputCommitCoordinator
16/09/04 01:22:19 INFO Utils: Successfully started service 'SparkUI' on port 4040.
16/09/04 01:22:19 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://192.168.0.191:4040
16/09/04 01:22:19 INFO StandaloneAppClient$ClientEndpoint: Connecting to master spark://52.44.36.224:7077...
16/09/04 01:22:19 INFO TransportClientFactory: Successfully created connection to /52.44.36.224:7077 after 89 ms (0 ms spent in bootstraps)
16/09/04 01:22:19 INFO StandaloneSchedulerBackend: Connected to Spark cluster with app ID app-20160904082232-0001
16/09/04 01:22:19 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20160904082232-0001/0 on worker-20160904003146-172.31.3.246-40675 (172.31.3.246:40675) with 2 cores
16/09/04 01:22:19 INFO StandaloneSchedulerBackend: Granted executor ID app-20160904082232-0001/0 on hostPort 172.31.3.246:40675 with 2 cores, 1024.0 MB RAM
16/09/04 01:22:19 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20160904082232-0001/1 on worker-20160904003205-172.31.3.245-35631 (172.31.3.245:35631) with 2 cores
16/09/04 01:22:19 INFO StandaloneSchedulerBackend: Granted executor ID app-20160904082232-0001/1 on hostPort 172.31.3.245:35631 with 2 cores, 1024.0 MB RAM
16/09/04 01:22:19 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 55093.
16/09/04 01:22:19 INFO NettyBlockTransferService: Server created on 192.168.0.191:55093
16/09/04 01:22:19 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 192.168.0.191, 55093)
16/09/04 01:22:19 INFO BlockManagerMasterEndpoint: Registering block manager 192.168.0.191:55093 with 2004.6 MB RAM, BlockManagerId(driver, 192.168.0.191, 55093)
16/09/04 01:22:19 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20160904082232-0001/0 is now RUNNING
16/09/04 01:22:19 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 192.168.0.191, 55093)
16/09/04 01:22:19 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20160904082232-0001/1 is now RUNNING
16/09/04 01:22:19 INFO StandaloneSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
16/09/04 01:22:20 INFO SparkContext: Starting job: start at Consumer.java:41
16/09/04 01:22:20 INFO DAGScheduler: Registering RDD 1 (start at Consumer.java:41)
16/09/04 01:22:20 INFO DAGScheduler: Got job 0 (start at Consumer.java:41) with 20 output partitions
16/09/04 01:22:20 INFO DAGScheduler: Final stage: ResultStage 1 (start at Consumer.java:41)
16/09/04 01:22:20 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 0)
16/09/04 01:22:20 INFO DAGScheduler: Missing parents: List(ShuffleMapStage 0)
16/09/04 01:22:20 INFO DAGScheduler: Submitting ShuffleMapStage 0 (MapPartitionsRDD[1] at start at Consumer.java:41), which has no missing parents
16/09/04 01:22:20 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 3.1 KB, free 2004.6 MB)
16/09/04 01:22:20 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 2001.0 B, free 2004.6 MB)
16/09/04 01:22:20 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.0.191:55093 (size: 2001.0 B, free: 2004.6 MB)
16/09/04 01:22:20 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1012
16/09/04 01:22:20 INFO DAGScheduler: Submitting 50 missing tasks from ShuffleMapStage 0 (MapPartitionsRDD[1] at start at Consumer.java:41)
16/09/04 01:22:20 INFO TaskSchedulerImpl: Adding task set 0.0 with 50 tasks
16/09/04 01:22:35 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
16/09/04 01:22:50 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
16/09/04 01:23:05 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
16/09/04 01:23:20 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
16/09/04 01:23:35 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
16/09/04 01:23:50 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
16/09/04 01:24:05 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
16/09/04 01:24:20 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
16/09/04 01:24:20 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20160904082232-0001/1 is now EXITED (Command exited with code 1)
16/09/04 01:24:20 INFO StandaloneSchedulerBackend: Executor app-20160904082232-0001/1 removed: Command exited with code 1
16/09/04 01:24:20 INFO BlockManagerMaster: Removal of executor 1 requested
16/09/04 01:24:20 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Asked to remove non-existent executor 1
16/09/04 01:24:20 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20160904082232-0001/2 on worker-20160904003205-172.31.3.245-35631 (172.31.3.245:35631) with 1 cores
16/09/04 01:24:20 INFO StandaloneSchedulerBackend: Granted executor ID app-20160904082232-0001/2 on hostPort 172.31.3.245:35631 with 1 cores, 1024.0 MB RAM
16/09/04 01:24:20 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20160904082232-0001/3 on worker-20160904003146-172.31.3.246-40675 (172.31.3.246:40675) with 1 cores
16/09/04 01:24:20 INFO StandaloneSchedulerBackend: Granted executor ID app-20160904082232-0001/3 on hostPort 172.31.3.246:40675 with 1 cores, 1024.0 MB RAM

Comment: Where are the ports open? In the Spark cluster's security group?
Reply: All ports are open. I opened everything because I wanted to get things working first, and then set up an SSH tunnel to the master node with local port forwarding.
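If the SSH tunnel route is what ends up working, a minimal local-port-forwarding sketch would be something like the following (the key path and remote user are placeholders; 7077 is the master port from the logs and 8080 is the standalone master's default web UI port):

# Sketch only -- placeholder key file and user name
ssh -i ~/.ssh/aws-key.pem \
    -L 7077:localhost:7077 \
    -L 8080:localhost:8080 \
    ec2-user@52.44.36.224

With that in place the driver would point at spark://localhost:7077, although, as noted above, the executors would still need some way to reach the driver's ports back on the laptop.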