Hadoop: Why is the MapReduce progress report not monotonically increasing?
I submitted a MapReduce job to Hadoop and watched the progress report on the screen. For both map and reduce tasks, I expected the reported progress to increase monotonically (for example 0%, 10%, 25%, 60%, 78%, 95%, and 100%). In practice, however, the reported progress does not increase monotonically:
14/01/21 11:05:37 INFO mapred.JobClient: Running job: job_201401201555_0036
14/01/21 11:05:38 INFO mapred.JobClient: map 0% reduce 0%
14/01/21 11:06:07 INFO mapred.JobClient: map 11% reduce 0%
14/01/21 11:06:10 INFO mapred.JobClient: map 0% reduce 0%
14/01/21 11:06:19 INFO mapred.JobClient: map 9% reduce 0%
14/01/21 11:06:22 INFO mapred.JobClient: map 22% reduce 0%
14/01/21 11:06:25 INFO mapred.JobClient: map 31% reduce 0%
14/01/21 11:06:28 INFO mapred.JobClient: map 39% reduce 0%
14/01/21 11:06:29 INFO mapred.JobClient: map 53% reduce 0%
14/01/21 11:06:30 INFO mapred.JobClient: map 57% reduce 0%
14/01/21 11:06:32 INFO mapred.JobClient: map 50% reduce 0%
14/01/21 11:06:33 INFO mapred.JobClient: map 55% reduce 0%
14/01/21 11:06:34 INFO mapred.JobClient: map 43% reduce 0%
14/01/21 11:06:35 INFO mapred.JobClient: map 48% reduce 0%
14/01/21 11:06:36 INFO mapred.JobClient: map 40% reduce 0%
14/01/21 11:06:38 INFO mapred.JobClient: map 30% reduce 0%
14/01/21 11:06:40 INFO mapred.JobClient: map 40% reduce 0%
14/01/21 11:06:41 INFO mapred.JobClient: map 49% reduce 0%
14/01/21 11:06:43 INFO mapred.JobClient: map 57% reduce 0%
14/01/21 11:06:44 INFO mapred.JobClient: map 70% reduce 0%
14/01/21 11:06:46 INFO mapred.JobClient: map 73% reduce 0%
14/01/21 11:06:47 INFO mapred.JobClient: map 82% reduce 0%
14/01/21 11:06:48 INFO mapred.JobClient: map 93% reduce 0%
14/01/21 11:06:50 INFO mapred.JobClient: map 94% reduce 0%
14/01/21 11:06:52 INFO mapred.JobClient: map 95% reduce 0%
14/01/21 11:06:53 INFO mapred.JobClient: map 96% reduce 0%
14/01/21 11:06:56 INFO mapred.JobClient: map 98% reduce 0%
14/01/21 11:06:59 INFO mapred.JobClient: map 99% reduce 0%
14/01/21 11:07:00 INFO mapred.JobClient: map 100% reduce 0%
14/01/21 11:07:19 INFO mapred.JobClient: map 100% reduce 4%
14/01/21 11:07:22 INFO mapred.JobClient: map 100% reduce 8%
14/01/21 11:07:25 INFO mapred.JobClient: map 100% reduce 66%
14/01/21 11:07:29 INFO mapred.JobClient: map 100% reduce 67%
14/01/21 11:07:32 INFO mapred.JobClient: map 100% reduce 68%
14/01/21 11:07:35 INFO mapred.JobClient: map 100% reduce 69%
14/01/21 11:07:41 INFO mapred.JobClient: map 100% reduce 70%
14/01/21 11:07:47 INFO mapred.JobClient: map 100% reduce 71%
14/01/21 11:07:53 INFO mapred.JobClient: map 100% reduce 72%
14/01/21 11:07:59 INFO mapred.JobClient: map 100% reduce 73%
14/01/21 11:08:02 INFO mapred.JobClient: map 100% reduce 100%
14/01/21 11:08:03 INFO mapred.JobClient: Job complete: job_201401201555_0036
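Progress is reported as the percentage of input splits processed out of all input splits. So why does the progress report not increase monotonically?

Check the TaskTracker and JobTracker logs. Were there any failures in the map phase? If a machine fails to execute a task, or a host can no longer be reached, the task is re-executed from scratch on another machine, so the progress does not increase monotonically.

I checked the UI and found no failed or killed task attempts, which I take to mean that nothing failed in the map phase. Still, the job's progress did not increase monotonically.

Are you calling conf.setProgress(float progress) anywhere (I don't think so)? If you are, you might be setting it to a lower value somewhere. How many nodes are you using, one or several? Were any exceptions caught (check the logs of each TaskTracker)? Did you check the JobTracker logs? I have seen this many times. Please check the logs: an error may have occurred.

My cluster has 10 nodes: one is the namenode/jobtracker and the others are datanodes/tasktrackers. I am not using conf.setProgress(float progress). I checked the logs of every TaskTracker and found: a) no exceptions; b) 2 task attempts were killed; c) every TaskTracker received a "KillJobAction". In fact, the results are all correct.

This problem does not affect the results, since the job completed successfully. It is only a matter of redoing work that was already done, possibly because of some (communication) error. I don't think those two failed attempts explain the five drops in progress in your output (11% -> 0%, 57% -> 50%, 55% -> 43%, 48% -> 40%, 40% -> 30%). What does the JobTracker log show?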
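To line the progress drops up with entries in the JobTracker and TaskTracker logs, it may help to extract them mechanically along with their timestamps. The following is a minimal sketch, not part of Hadoop: the helper names `parse_progress` and `find_map_drops` are hypothetical, and the regular expression assumes the `mapred.JobClient` line format shown in the question. The sample lines are copied from the map phase of that output.

```python
import re

# Progress lines copied from the JobClient output in the question (map phase).
LOG = """\
14/01/21 11:05:38 INFO mapred.JobClient: map 0% reduce 0%
14/01/21 11:06:07 INFO mapred.JobClient: map 11% reduce 0%
14/01/21 11:06:10 INFO mapred.JobClient: map 0% reduce 0%
14/01/21 11:06:19 INFO mapred.JobClient: map 9% reduce 0%
14/01/21 11:06:22 INFO mapred.JobClient: map 22% reduce 0%
14/01/21 11:06:25 INFO mapred.JobClient: map 31% reduce 0%
14/01/21 11:06:28 INFO mapred.JobClient: map 39% reduce 0%
14/01/21 11:06:29 INFO mapred.JobClient: map 53% reduce 0%
14/01/21 11:06:30 INFO mapred.JobClient: map 57% reduce 0%
14/01/21 11:06:32 INFO mapred.JobClient: map 50% reduce 0%
14/01/21 11:06:33 INFO mapred.JobClient: map 55% reduce 0%
14/01/21 11:06:34 INFO mapred.JobClient: map 43% reduce 0%
14/01/21 11:06:35 INFO mapred.JobClient: map 48% reduce 0%
14/01/21 11:06:36 INFO mapred.JobClient: map 40% reduce 0%
14/01/21 11:06:38 INFO mapred.JobClient: map 30% reduce 0%
14/01/21 11:06:40 INFO mapred.JobClient: map 40% reduce 0%
14/01/21 11:06:41 INFO mapred.JobClient: map 49% reduce 0%
14/01/21 11:06:43 INFO mapred.JobClient: map 57% reduce 0%
14/01/21 11:06:44 INFO mapred.JobClient: map 70% reduce 0%
"""

# Assumed line format: "<date> <time> INFO mapred.JobClient: map N% reduce M%"
PROGRESS = re.compile(
    r"^(\S+ \S+) INFO mapred\.JobClient:\s+map\s+(\d+)%\s+reduce\s+(\d+)%"
)


def parse_progress(log_text):
    """Return (timestamp, map_percent, reduce_percent) for each progress line."""
    rows = []
    for line in log_text.splitlines():
        match = PROGRESS.match(line.strip())
        if match:
            rows.append((match.group(1), int(match.group(2)), int(match.group(3))))
    return rows


def find_map_drops(rows):
    """Return (timestamp, previous, current) wherever map progress decreased."""
    drops = []
    for (_, prev, _), (ts, cur, _) in zip(rows, rows[1:]):
        if cur < prev:
            drops.append((ts, prev, cur))
    return drops


if __name__ == "__main__":
    for ts, prev, cur in find_map_drops(parse_progress(LOG)):
        print(f"{ts}  map {prev}% -> {cur}%")
```

Each reported timestamp marks the first line on which the lower value appeared, so the corresponding event in the JobTracker or TaskTracker logs (for example a killed task attempt) would be expected shortly before it.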