Java Hadoop job hangs at map 100% reduce 0%

I am using Hadoop 2.7.2 together with Nutch 1.12.

17/10/03 14:01:52 INFO mapreduce.Job: Running job: job_1506573729189_0223
17/10/03 14:02:05 INFO mapreduce.Job: Job job_1506573729189_0223 running in uber mode : false
17/10/03 14:02:05 INFO mapreduce.Job:  map 0% reduce 0%
17/10/03 14:02:15 INFO mapreduce.Job:  map 1% reduce 0%
17/10/03 14:02:18 INFO mapreduce.Job:  map 2% reduce 0%
17/10/03 14:02:21 INFO mapreduce.Job:  map 3% reduce 0%
17/10/03 14:02:24 INFO mapreduce.Job:  map 4% reduce 0%
17/10/03 14:02:27 INFO mapreduce.Job:  map 8% reduce 0%
17/10/03 14:02:30 INFO mapreduce.Job:  map 12% reduce 0%
17/10/03 14:03:35 INFO mapreduce.Job: Task Id : attempt_1506573729189_0223_m_000000_0, Status : FAILED
Error: Java heap space
17/10/03 14:03:36 INFO mapreduce.Job:  map 11% reduce 0%
17/10/03 14:03:46 INFO mapreduce.Job:  map 14% reduce 0%
17/10/03 14:04:48 INFO mapreduce.Job: Task Id : attempt_1506573729189_0223_m_000000_1, Status : FAILED
Error: Java heap space
To get rid of the errors above, I added the following to Hadoop's mapred-site.xml:

<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx4096m</value>
</property>
The heap space errors went away once I set the following properties instead. I also removed the mapred.child.java.opts property shown above.

<property>
  <name>mapreduce.map.memory.mb</name>
  <value>4096</value>
</property>
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>8192</value>
</property>
<property>
  <name>mapreduce.map.java.opts</name>
  <value>-Xmx3072m</value>
</property>
<property>
  <name>mapreduce.reduce.java.opts</name>
  <value>-Xmx6144m</value>
</property>
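
As a sanity check on settings like the ones above, here is a minimal sketch that compares each -Xmx value with its container size. The JVM heap has to stay below the container limit because the JVM also needs non-heap memory. The class and method names (HeapFitCheck, parseXmxMb, heapFraction) are made up for this illustration and are not part of Hadoop.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HeapFitCheck {
    // Matches a JVM max-heap flag such as -Xmx3072m or -Xmx4g.
    private static final Pattern XMX = Pattern.compile("-Xmx(\\d+)([kKmMgG]?)");

    /** Returns the -Xmx value in MiB, or -1 if the options contain no -Xmx flag. */
    public static long parseXmxMb(String javaOpts) {
        Matcher m = XMX.matcher(javaOpts);
        if (!m.find()) {
            return -1;
        }
        long n = Long.parseLong(m.group(1));
        switch (m.group(2).toLowerCase()) {
            case "g": return n * 1024;
            case "m": return n;
            case "k": return n / 1024;
            default:  return n / (1024 * 1024); // no suffix means plain bytes
        }
    }

    /** Heap size as a fraction of the container size (both in MiB). */
    public static double heapFraction(int containerMb, String javaOpts) {
        return (double) parseXmxMb(javaOpts) / containerMb;
    }

    public static void main(String[] args) {
        // Values taken from the mapred-site.xml settings in the question.
        System.out.println("map heap fraction:    " + heapFraction(4096, "-Xmx3072m"));
        System.out.println("reduce heap fraction: " + heapFraction(8192, "-Xmx6144m"));
    }
}
```

With the values from the question, both map and reduce heaps come to three quarters of their container, which leaves headroom for non-heap memory.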

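Did you take a heap dump when the problem occurred? What does it tell you? Just a hypothesis: a large (and possibly malformed) document makes the parser hang and consume a lot of memory. What are the values of the properties http.content.limit, parser.skip.truncated and parser.timeout? Also look at the task log files of the hanging task (possibly after setting log4j.logger.org.apache.nutch.parse.ParserDebug=DEBUG). If you can log in to the machine running the task, try to get a stack trace, e.g. with sudo -u yarn jstack followed by the process ID of the task JVM.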
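
One way to answer the question about the three properties is to read them straight from the site configuration file. Below is a rough sketch using only the JDK's XML parser. It assumes the usual Hadoop-style `<configuration><property>` layout. The class name SitePropertyReader and the sample values in main are invented for illustration, not taken from the question. A property missing from the site file is normally picked up from nutch-default.xml, which this sketch does not read.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class SitePropertyReader {
    /** Parses Hadoop-style configuration XML into a name -> value map. */
    public static Map<String, String> read(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        Map<String, String> props = new LinkedHashMap<>();
        NodeList nodes = doc.getElementsByTagName("property");
        for (int i = 0; i < nodes.getLength(); i++) {
            Element p = (Element) nodes.item(i);
            NodeList names = p.getElementsByTagName("name");
            NodeList values = p.getElementsByTagName("value");
            if (names.getLength() == 0 || values.getLength() == 0) {
                continue; // skip incomplete entries
            }
            props.put(names.item(0).getTextContent().trim(),
                      values.item(0).getTextContent().trim());
        }
        return props;
    }

    public static void main(String[] args) throws Exception {
        // Placeholder content for illustration only; in practice, load the real site file.
        String xml = "<configuration>"
                + "<property><name>http.content.limit</name><value>65536</value></property>"
                + "<property><name>parser.timeout</name><value>30</value></property>"
                + "</configuration>";
        Map<String, String> props = read(xml);
        String[] keys = {"http.content.limit", "parser.skip.truncated", "parser.timeout"};
        for (String key : keys) {
            System.out.println(key + " = " + props.getOrDefault(key, "(not set in this file)"));
        }
    }
}
```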

Job output with the settings above, where progress stops at map 100% reduce 0%:

17/10/11 17:00:32 INFO mapreduce.Job: Running job: job_1507721357521_0001
17/10/11 17:00:56 INFO mapreduce.Job: Job job_1507721357521_0001 running in uber mode : false
17/10/11 17:00:56 INFO mapreduce.Job:  map 0% reduce 0%
17/10/11 17:01:08 INFO mapreduce.Job:  map 100% reduce 0%