Apache Spark job shows "Unknown" for the active stage and gets stuck


I am running a Spark job that counts interactions. After the map, I group by the key I want, and Spark stays in a pending state, showing no errors and "Unknown" for the stage.

I would like to know what is causing this and how I can check it. Since I am running locally, is this normal?
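For orientation, here is a minimal sketch of the kind of map-then-group pipeline described above; the question does not include the actual code, so every name and record below is illustrative. The group-by introduces a shuffle, which is the usual point where a job like this appears to stall:

    import org.apache.spark.{SparkConf, SparkContext}

    object PipelineSketch {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("sketch").setMaster("local[4]"))

        // Stand-in records; the real job reads interaction data (from MongoDB, per the comments below).
        val interactions = sc.parallelize(Seq("u1 click", "u1 view", "u2 click"))

        val grouped = interactions
          .map(r => (r.split(" ")(0), r)) // hypothetical key extraction
          .groupByKey()                   // wide dependency: forces a shuffle, where the job appears to stall

        grouped.mapValues(_.size).foreach(println)
        sc.stop()
      }
    }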

Checking the logs, I see no error messages, just entries like these:

16/01/05 14:44:47 DEBUG AkkaRpcEnv$$anonfun$actorRef$lzycompute$1$1$$anon$1: [actor] received message AkkaMessage(ExpireDeadHosts,true) from Actor[akka://sparkDriver/temp/$Sm]
16/01/05 14:44:47 DEBUG AkkaRpcEnv$$anonfun$actorRef$lzycompute$1$1$$anon$1: Received RPC message: AkkaMessage(ExpireDeadHosts,true)
16/01/05 14:44:47 DEBUG AkkaRpcEnv$$anonfun$actorRef$lzycompute$1$1$$anon$1: [actor] handled message (0.262362 ms) AkkaMessage(ExpireDeadHosts,true) from Actor[akka://sparkDriver/temp/$Sm]

16/01/05 14:44:53 DEBUG AkkaRpcEnv$$anonfun$actorRef$lzycompute$1$1$$anon$1: [actor] received message AkkaMessage(Heartbeat(driver,[Lscala.Tuple2;@5757087f,BlockManagerId(driver, localhost, 56860)),true) from Actor[akka://sparkDriver/temp/$Tm]
16/01/05 14:45:03 DEBUG AkkaRpcEnv$$anonfun$actorRef$lzycompute$1$1$$anon$1: Received RPC message: AkkaMessage(BlockManagerHeartbeat(BlockManagerId(driver, localhost, 56860)),true)
16/01/05 14:45:03 DEBUG AkkaRpcEnv$$anonfun$actorRef$lzycompute$1$1$$anon$1: [actor] handled message (0.319169 ms) AkkaMessage(BlockManagerHeartbeat(BlockManagerId(driver, localhost, 56860)),true) from Actor[akka://sparkDriver/temp/$Wm]
16/01/05 14:45:13 DEBUG AkkaRpcEnv$$anonfun$actorRef$lzycompute$1$1$$anon$1: [actor] received message AkkaMessage(Heartbeat(driver,[Lscala.Tuple2;@682d459,BlockManagerId(driver, localhost, 56860)),true) from Actor[akka://sparkDriver/temp/$Xm]
I am using Spark 1.5.2 and have also tested on an Amazon instance.

I run the job with this command:

spark-submit --class com.knx.analytics.InteractionProcessor --files dev.conf --conf 'spark.executor.extraJavaOptions=-Dconfig.fuction.conf' --conf 'spark.driver.extraJavaOptions=-Dconfig.file=dev.conf' --jars fast-aggregate-assembly-1.0-deps.jar --driver-memory 5g fast-aggregate-1.jar -s 2015-11-02 -e 2015-11-06

UPDATE

ubuntu@adedge-bd-test:~ [23:20:53]$ jps -lm
10903 sun.tools.jps.Jps -lm
7834 org.apache.spark.deploy.SparkSubmit --conf spark.driver.memory=3g --conf spark.executor.extraJavaOptions=-Dconfig.fuction.conf --conf spark.driver.extraJavaOptions=-Dconfig.file=dev.conf --class com.knx.analytics.InteractionProcessor --files dev.conf --jars fast-aggregate-assembly-1.0-deps.jar fast-aggregate.jar -s 2015-11-02 -e 2015-11-02
The full jstack log is shared via Google Drive; here is some of it:

"main" prio=10 tid=0x00007f2bb8008000 nid=0x1ebd in Object.wait() [0x00007f2bc19d5000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x0000000744008a88> (a org.apache.spark.scheduler.JobWaiter)
    at java.lang.Object.wait(Object.java:503)
    at org.apache.spark.scheduler.JobWaiter.awaitResult(JobWaiter.scala:73)
    - locked <0x0000000744008a88> (a org.apache.spark.scheduler.JobWaiter)
    at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:559)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1824)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1837)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1914)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply$mcV$sp(PairRDDFunctions.scala:1055)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply(PairRDDFunctions.scala:998)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply(PairRDDFunctions.scala:998)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
    at org.apache.spark.rdd.RDD.withScope(RDD.scala:310)
    at org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopDataset(PairRDDFunctions.scala:998)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopFile$2.apply$mcV$sp(PairRDDFunctions.scala:938)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopFile$2.apply(PairRDDFunctions.scala:930)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopFile$2.apply(PairRDDFunctions.scala:930)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
    at org.apache.spark.rdd.RDD.withScope(RDD.scala:310)
    at org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopFile(PairRDDFunctions.scala:930)
    at com.knx.analytics.InteractionProcessor$.writeToMongo(InteractionProcessor.scala:150)
    at com.knx.analytics.InteractionProcessor$.main(InteractionProcessor.scala:138)
    at com.knx.analytics.InteractionProcessor.main(InteractionProcessor.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:674)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

   Locked ownable synchronizers:
    - None
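
This trace shows the driver's main thread parked in JobWaiter.awaitResult underneath writeToMongo's saveAsNewAPIHadoopFile call: the driver has submitted the save job and is simply waiting for its tasks to finish, so the stall is in the tasks themselves rather than in the driver. Below is a minimal sketch of what a writeToMongo built on saveAsNewAPIHadoopFile and the mongo-hadoop connector typically looks like; the actual implementation is not shown in the question, and the URI, document fields, and data are all placeholders:

    import org.apache.hadoop.conf.Configuration
    import org.apache.spark.{SparkConf, SparkContext}
    import com.mongodb.hadoop.MongoOutputFormat
    import org.bson.{BasicBSONObject, BSONObject}

    object WriteToMongoSketch {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("sketch").setMaster("local[4]"))

        // Stand-in for the grouped interaction counts the job produces.
        val counts = sc.parallelize(Seq(("banner-1", 42L), ("banner-2", 7L)))
        val docs = counts.map { case (key, n) =>
          val doc = new BasicBSONObject()
          doc.put("interaction", key)
          doc.put("count", n: java.lang.Long)
          (null: Object, doc: BSONObject) // null key lets Mongo assign the _id
        }

        val outputConfig = new Configuration()
        // Placeholder URI; the real database and collection are not given in the question.
        outputConfig.set("mongo.output.uri", "mongodb://localhost:27017/analytics.interactions")

        // MongoOutputFormat ignores the path argument, but saveAsNewAPIHadoopFile requires one.
        docs.saveAsNewAPIHadoopFile(
          "file:///tmp/unused",
          classOf[Object],
          classOf[BSONObject],
          classOf[MongoOutputFormat[Object, BSONObject]],
          outputConfig)

        sc.stop()
      }
    }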
After searching, I found some things related to

Can you share your code? Can you run jps -lm and then jstack the process? Edit the question with the output.

@JacekLaskowski Please help check the log. I shared it via Google Drive since it cannot fit here due to the character limit. Thanks.

What is the master URL? Are you using MongoDB? Where do you read the data from? Do you know why your job has 200 tasks, i.e. 200 partitions?

I run in local mode with local[4]. Yes, I am using MongoDB, and I read from localhost. I don't know why there are 200 tasks. I create the DataFrame from Mongo data.
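
On the 200-task question in the comments: 200 is the default value of spark.sql.shuffle.partitions, and it applies here because the data comes in as a DataFrame from Mongo, so each shuffle on the SQL/DataFrame path gets 200 partitions. A minimal sketch of lowering it for a small local run, assuming sc is the existing SparkContext (Spark 1.5 API):

    import org.apache.spark.sql.SQLContext

    val sqlContext = new SQLContext(sc)
    // Default is 200, far more than a local[4] run needs.
    sqlContext.setConf("spark.sql.shuffle.partitions", "4")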
"main" prio=10 tid=0x00007f2bb8008000 nid=0x1ebd in Object.wait() [0x00007f2bc19d5000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x0000000744008a88> (a org.apache.spark.scheduler.JobWaiter)
    at java.lang.Object.wait(Object.java:503)
    at org.apache.spark.scheduler.JobWaiter.awaitResult(JobWaiter.scala:73)
    - locked <0x0000000744008a88> (a org.apache.spark.scheduler.JobWaiter)
    at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:559)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1824)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1837)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1914)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply$mcV$sp(PairRDDFunctions.scala:1055)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply(PairRDDFunctions.scala:998)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply(PairRDDFunctions.scala:998)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
    at org.apache.spark.rdd.RDD.withScope(RDD.scala:310)
    at org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopDataset(PairRDDFunctions.scala:998)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopFile$2.apply$mcV$sp(PairRDDFunctions.scala:938)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopFile$2.apply(PairRDDFunctions.scala:930)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopFile$2.apply(PairRDDFunctions.scala:930)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
    at org.apache.spark.rdd.RDD.withScope(RDD.scala:310)
    at org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopFile(PairRDDFunctions.scala:930)
    at com.knx.analytics.InteractionProcessor$.writeToMongo(InteractionProcessor.scala:150)
    at com.knx.analytics.InteractionProcessor$.main(InteractionProcessor.scala:138)
    at com.knx.analytics.InteractionProcessor.main(InteractionProcessor.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:674)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

   Locked ownable synchronizers:
    - None