Apache Spark: Scala Spark application submitted to a YARN cluster is unregistered with SUCCEEDED without doing anything
Tags: apache-spark, cluster-computing, yarn, spark-dataframe

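Goal: run a Scala Spark application jar in yarn-cluster mode. It works in standalone cluster mode and in yarn-client mode, but for some reason it does not run to completion in yarn-cluster mode.

Details: the last part of the code that appears to execute is the assignment of the initial value to the DataFrame when the input file is read. After that it does not appear to do anything. None of the logs look abnormal, and there are no warnings or errors. The application is suddenly unregistered with status SUCCEEDED and everything is killed. In every other deployment mode (e.g. yarn-client, standalone cluster mode) everything runs to completion without problems.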

15/07/22 15:57:00 INFO yarn.ApplicationMaster: Unregistering ApplicationMaster with SUCCEEDED
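I have also run this job on Spark 1.3.x and 1.4.x, on a vanilla Spark/YARN cluster and on a CDH 5.4.3 cluster respectively. The results are all the same. What could be the problem?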

The job was run with the command below, and the input file is accessible via HDFS.

bin/spark-submit --master yarn-cluster --class AssocApp ../associationRulesScala/target/scala-2.10/AssociationRule_2.10.4-1.0.0.SNAPSHOT.jar hdfs://sparkMaster-hk:9000/user/root/BreastCancer.csv
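Code snippet: this is the code in the DataFrame loading area. The log message "Uploading Dataframe ..." appears, but no other messages follow it. See the driver log below.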

//...
  logger.info("Uploading Dataframe from %s".format(filename))
  sparkParams.sqlContext.csvFile(filename)

  MDC.put("jobID",jobID.takeRight(3))
  logger.info("Extracting Unique Vals from each of %d columns...".format(frame.columns.length))
  private val uniqueVals = frame.columns.zipWithIndex.map(colname => (colname._2, colname._1, frame.select(colname._1).distinct.cache)).
//...
Driver log
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/tmp/hadoop-root/nm-local-dir/usercache/root/filecache/60/spark-assembly-1.4.0-hadoop2.6.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/root/hadoop-2.6.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
15/07/22 15:56:52 INFO yarn.ApplicationMaster: Registered signal handlers for [TERM, HUP, INT]
15/07/22 15:56:54 INFO yarn.ApplicationMaster: ApplicationAttemptId: appattempt_1434116948302_0097_000001
15/07/22 15:56:55 INFO spark.SecurityManager: Changing view acls to: root
15/07/22 15:56:55 INFO spark.SecurityManager: Changing modify acls to: root
15/07/22 15:56:55 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
15/07/22 15:56:55 INFO yarn.ApplicationMaster: Starting the user application in a separate Thread
15/07/22 15:56:55 INFO yarn.ApplicationMaster: Waiting for spark context initialization
15/07/22 15:56:55 INFO yarn.ApplicationMaster: Waiting for spark context initialization ... 
15/07/22 15:56:56 INFO AssocApp$: Starting new Association Rules calculation. From File: hdfs://sparkMaster-hk:9000/user/root/BreastCancer.csv
15/07/22 15:56:56 INFO yarn.ApplicationMaster: Final app status: SUCCEEDED, exitCode: 0
15/07/22 15:56:57 INFO associationRules.primaryPackageSpark: Uploading Dataframe from hdfs://sparkMaster-hk:9000/user/root/BreastCancer.csv 
15/07/22 15:56:57 INFO spark.SparkContext: Running Spark version 1.4.0
15/07/22 15:56:57 INFO spark.SecurityManager: Changing view acls to: root
15/07/22 15:56:57 INFO spark.SecurityManager: Changing modify acls to: root
15/07/22 15:56:57 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
15/07/22 15:56:57 INFO slf4j.Slf4jLogger: Slf4jLogger started
15/07/22 15:56:57 INFO Remoting: Starting remoting
15/07/22 15:56:57 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@119.81.232.13:41459]
15/07/22 15:56:57 INFO util.Utils: Successfully started service 'sparkDriver' on port 41459.
15/07/22 15:56:57 INFO spark.SparkEnv: Registering MapOutputTracker
15/07/22 15:56:57 INFO spark.SparkEnv: Registering BlockManagerMaster
15/07/22 15:56:57 INFO storage.DiskBlockManager: Created local directory at /tmp/hadoop-root/nm-local-dir/usercache/root/appcache/application_1434116948302_0097/blockmgr-f0e66040-1fdb-4a05-87e1-160194829f84
15/07/22 15:56:57 INFO storage.MemoryStore: MemoryStore started with capacity 267.3 MB
15/07/22 15:56:58 INFO spark.HttpFileServer: HTTP File server directory is /tmp/hadoop-root/nm-local-dir/usercache/root/appcache/application_1434116948302_0097/httpd-79b304a1-3cf4-4951-9e22-bbdfac435824
15/07/22 15:56:58 INFO spark.HttpServer: Starting HTTP Server
15/07/22 15:56:58 INFO server.Server: jetty-8.y.z-SNAPSHOT
15/07/22 15:56:58 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:36021
15/07/22 15:56:58 INFO util.Utils: Successfully started service 'HTTP file server' on port 36021.
15/07/22 15:56:58 INFO spark.SparkEnv: Registering OutputCommitCoordinator
15/07/22 15:56:58 INFO ui.JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
15/07/22 15:56:58 INFO server.Server: jetty-8.y.z-SNAPSHOT
15/07/22 15:56:58 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:53274
15/07/22 15:56:58 INFO util.Utils: Successfully started service 'SparkUI' on port 53274.
15/07/22 15:56:58 INFO ui.SparkUI: Started SparkUI at http://119.XX.XXX.XX:53274
15/07/22 15:56:58 INFO cluster.YarnClusterScheduler: Created YarnClusterScheduler
15/07/22 15:56:59 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 34498.
15/07/22 15:56:59 INFO netty.NettyBlockTransferService: Server created on 34498
15/07/22 15:56:59 INFO storage.BlockManagerMaster: Trying to register BlockManager
15/07/22 15:56:59 INFO storage.BlockManagerMasterEndpoint: Registering block manager 119.81.232.13:34498 with 267.3 MB RAM, BlockManagerId(driver, 119.81.232.13, 34498)
15/07/22 15:56:59 INFO storage.BlockManagerMaster: Registered BlockManager
15/07/22 15:56:59 INFO cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as AkkaRpcEndpointRef(Actor[akka://sparkDriver/user/YarnAM#-819146876])
15/07/22 15:56:59 INFO client.RMProxy: Connecting to ResourceManager at sparkMaster-hk/119.81.232.24:8030
15/07/22 15:56:59 INFO yarn.YarnRMClient: Registering the ApplicationMaster
15/07/22 15:57:00 INFO yarn.YarnAllocator: Will request 2 executor containers, each with 1 cores and 1408 MB memory including 384 MB overhead
15/07/22 15:57:00 INFO yarn.YarnAllocator: Container request (host: Any, capability: <memory:1408, vCores:1>)
15/07/22 15:57:00 INFO yarn.YarnAllocator: Container request (host: Any, capability: <memory:1408, vCores:1>)
15/07/22 15:57:00 INFO yarn.ApplicationMaster: Started progress reporter thread - sleep time : 5000
15/07/22 15:57:00 INFO yarn.ApplicationMaster: Unregistering ApplicationMaster with SUCCEEDED
15/07/22 15:57:00 INFO impl.AMRMClientImpl: Waiting for application to be successfully unregistered.
15/07/22 15:57:00 INFO yarn.ApplicationMaster: Deleting staging directory .sparkStaging/application_1434116948302_0097
15/07/22 15:57:00 INFO storage.DiskBlockManager: Shutdown hook called
15/07/22 15:57:00 INFO util.Utils: Shutdown hook called
15/07/22 15:57:00 INFO util.Utils: Deleting directory /tmp/hadoop-root/nm-local-dir/usercache/root/appcache/application_1434116948302_0097/httpd-79b304a1-3cf4-4951-9e22-bbdfac435824
15/07/22 15:57:00 INFO util.Utils: Deleting directory /tmp/hadoop-root/nm-local-dir/usercache/root/appcache/application_1434116948302_0097/userFiles-e01b4dd2-681c-4108-aec6-879774652c7a
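One possible reading of the log above, offered as an assumption rather than a confirmed diagnosis: "Final app status: SUCCEEDED, exitCode: 0" is logged at 15:56:56, before "Running Spark version 1.4.0" at 15:56:57. In yarn-cluster mode the ApplicationMaster runs the user class in a separate thread and, as far as I know, reports success once its `main` method returns. If the application hands its work to another thread and `main` returns immediately, the job could be unregistered while it is still starting. The sketch below illustrates only that general pattern with plain Scala futures. `MainThreadSketch` and `runJob` are hypothetical stand-ins, not code from the question, and the question does not show whether the real application uses futures or threads at all.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object MainThreadSketch {
  // Hypothetical stand-in for the real work (loading the DataFrame,
  // computing the association rules). It only sleeps briefly and
  // returns the length of its argument.
  def runJob(input: String): Int = {
    Thread.sleep(200)
    input.length
  }

  def main(args: Array[String]): Unit = {
    val input = args.headOption.getOrElse("BreastCancer.csv")

    // The work is started on another thread.
    val job = Future(runJob(input))

    // Block until the work is done. Without this line, main would return
    // while runJob is still running; under the assumption described above,
    // that early return is the moment the application would be treated
    // as finished.
    val result = Await.result(job, 1.minute)
    println(s"job finished with result $result")
  }
}
```

If the real application does start its computation asynchronously, keeping `main` blocked until that computation completes (or running it directly on the main thread) would be the first thing to try. If it does not, this reading of the log does not apply.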