
Hadoop: Why is the MapReduce progress report not monotonically increasing?

Tags: hadoop, mapreduce, report, task

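I submitted a MapReduce job to Hadoop and watched the progress report on screen. For both map tasks and reduce tasks, I expected the reported progress to increase monotonically (for example 0%, 10%, 25%, 60%, 78%, 95%, and 100%). In practice, however, the reported progress does not increase monotonically: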

14/01/21 11:05:37 INFO mapred.JobClient: Running job: job_201401201555_0036
14/01/21 11:05:38 INFO mapred.JobClient:  map 0%  reduce 0%
14/01/21 11:06:07 INFO mapred.JobClient:  map 11% reduce 0%
14/01/21 11:06:10 INFO mapred.JobClient:  map 0%  reduce 0%
14/01/21 11:06:19 INFO mapred.JobClient:  map 9%  reduce 0%
14/01/21 11:06:22 INFO mapred.JobClient:  map 22% reduce 0%
14/01/21 11:06:25 INFO mapred.JobClient:  map 31% reduce 0%
14/01/21 11:06:28 INFO mapred.JobClient:  map 39% reduce 0%
14/01/21 11:06:29 INFO mapred.JobClient:  map 53% reduce 0%
14/01/21 11:06:30 INFO mapred.JobClient:  map 57% reduce 0%
14/01/21 11:06:32 INFO mapred.JobClient:  map 50% reduce 0%
14/01/21 11:06:33 INFO mapred.JobClient:  map 55% reduce 0%
14/01/21 11:06:34 INFO mapred.JobClient:  map 43% reduce 0%
14/01/21 11:06:35 INFO mapred.JobClient:  map 48% reduce 0%
14/01/21 11:06:36 INFO mapred.JobClient:  map 40% reduce 0%
14/01/21 11:06:38 INFO mapred.JobClient:  map 30% reduce 0%
14/01/21 11:06:40 INFO mapred.JobClient:  map 40% reduce 0%
14/01/21 11:06:41 INFO mapred.JobClient:  map 49% reduce 0%
14/01/21 11:06:43 INFO mapred.JobClient:  map 57% reduce 0%
14/01/21 11:06:44 INFO mapred.JobClient:  map 70% reduce 0%
14/01/21 11:06:46 INFO mapred.JobClient:  map 73% reduce 0%
14/01/21 11:06:47 INFO mapred.JobClient:  map 82% reduce 0%
14/01/21 11:06:48 INFO mapred.JobClient:  map 93% reduce 0%
14/01/21 11:06:50 INFO mapred.JobClient:  map 94% reduce 0%
14/01/21 11:06:52 INFO mapred.JobClient:  map 95% reduce 0%
14/01/21 11:06:53 INFO mapred.JobClient:  map 96% reduce 0%
14/01/21 11:06:56 INFO mapred.JobClient:  map 98% reduce 0%
14/01/21 11:06:59 INFO mapred.JobClient:  map 99% reduce 0%
14/01/21 11:07:00 INFO mapred.JobClient:  map 100% reduce 0%
14/01/21 11:07:19 INFO mapred.JobClient:  map 100% reduce 4%
14/01/21 11:07:22 INFO mapred.JobClient:  map 100% reduce 8%
14/01/21 11:07:25 INFO mapred.JobClient:  map 100% reduce 66%
14/01/21 11:07:29 INFO mapred.JobClient:  map 100% reduce 67%
14/01/21 11:07:32 INFO mapred.JobClient:  map 100% reduce 68%
14/01/21 11:07:35 INFO mapred.JobClient:  map 100% reduce 69%
14/01/21 11:07:41 INFO mapred.JobClient:  map 100% reduce 70%
14/01/21 11:07:47 INFO mapred.JobClient:  map 100% reduce 71%
14/01/21 11:07:53 INFO mapred.JobClient:  map 100% reduce 72%
14/01/21 11:07:59 INFO mapred.JobClient:  map 100% reduce 73%
14/01/21 11:08:02 INFO mapred.JobClient:  map 100% reduce 100%
14/01/21 11:08:03 INFO mapred.JobClient: Job complete: job_201401201555_0036

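Progress is expressed as the percentage of the input splits that have been processed. So why does the progress report not increase monotonically?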

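Check the TaskTracker and JobTracker logs. Were there any failures in the map phase? If a machine fails to execute a task, or a host can no longer be reached, the task is re-executed from scratch on another machine; that is why the progress does not increase monotonically.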
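To illustrate why re-executing a task can pull the reported figure down, here is a toy model. It assumes the job-level map percentage is simply the mean of the per-task percentages; that is a simplification for illustration, not Hadoop's actual code. The function name and the task numbers are hypothetical.

```python
def job_map_progress(task_percentages):
    """Toy model: job-level map progress as the mean of per-task progress.

    Assumption (not taken from Hadoop source): every map task contributes
    equally, and progress is an integer percentage in [0, 100].
    """
    return sum(task_percentages) // len(task_percentages)


# Four hypothetical map tasks at different stages of completion.
tasks = [90, 60, 30, 0]
before = job_map_progress(tasks)

# The attempt running task 0 is killed and the task restarts from scratch
# on another machine, so its progress falls back to 0.
tasks[0] = 0
after = job_map_progress(tasks)

print(before, after)  # 45 22
```

In this model a single restarted task is enough to make the aggregate percentage drop, even though no work is lost permanently and the job still completes.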

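I checked the UI and found no failed or killed task attempts, which I take to mean that nothing failed in the map phase. Even so, the job's progress does not increase monotonically.

Are you using conf.setProgress(float progress) anywhere (I don't think so)? If you are, you may be setting it to a lower value somewhere. How many nodes are you using, one or several? Were any exceptions caught (check the logs of each TaskTracker)? Have you checked the JobTracker log? I have seen this many times. Please check the logs: an error may have occurred.

My cluster has 10 nodes: one is the namenode/jobtracker and the others are datanodes/tasktrackers. I am not using conf.setProgress(float progress). I checked the logs of each TaskTracker and found: a) no exceptions; b) 2 task attempts were killed; c) every TaskTracker received a "KillJobAction". The results are in fact all correct.

This problem does not affect the results, since the job completed successfully. It is only a matter of redoing work that had already been done, probably because of some (communication) error.

I don't think those two failed attempts explain the five drops in progress in your output (11% -> 0%, 57% -> 50%, 55% -> 43%, 48% -> 40%, 40% -> 30%). What does the JobTracker log show?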
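The drops listed in the last comment can be found mechanically instead of by eye. The following is a minimal sketch that scans JobClient output for lines of the form `map N% reduce M%` and reports every point where the map percentage decreases. The helper name `find_map_drops` is hypothetical, and the sample is a four-line excerpt of the log in the question.

```python
import re

# Matches JobClient progress lines such as "... map 57% reduce 0%".
PROGRESS = re.compile(r"map\s+(\d+)%\s+reduce\s+(\d+)%")


def find_map_drops(log_text):
    """Return (previous, current) pairs where the map percentage decreased."""
    drops = []
    previous = None
    for line in log_text.splitlines():
        match = PROGRESS.search(line)
        if match is None:
            continue  # not a progress line (e.g. "Running job: ...")
        current = int(match.group(1))
        if previous is not None and current < previous:
            drops.append((previous, current))
        previous = current
    return drops


# Excerpt of the log from the question.
sample = """\
14/01/21 11:06:07 INFO mapred.JobClient:  map 11% reduce 0%
14/01/21 11:06:10 INFO mapred.JobClient:  map 0%  reduce 0%
14/01/21 11:06:30 INFO mapred.JobClient:  map 57% reduce 0%
14/01/21 11:06:32 INFO mapred.JobClient:  map 50% reduce 0%
"""

print(find_map_drops(sample))  # [(11, 0), (57, 50)]
```

Running the same function over the full log and comparing the timestamps of each drop with kill events in the TaskTracker and JobTracker logs is one way to check whether killed attempts account for all of the decreases.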