
Java map task stuck at 50%


I have a mapper and a reducer class whose input and output values are set up as follows:

//Reducer
job.setOutputKeyClass(LongWritable.class);
job.setOutputValueClass(MapperOutput.class);

//Mapper
job.setMapOutputKeyClass(LongWritable.class);
job.setMapOutputValueClass(MapperOutput.class);
Here, MapperOutput is a custom class I defined that implements the Writable interface.
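
Note that implementing Writable means providing real bodies for write(DataOutput) and readFields(DataInput); empty stubs silently break serialization between the map and reduce phases. A minimal sketch of what such a class could look like (the field layout is an assumption based on the constructor call in the mapper below, not the asker's actual class):

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.Writable;

public class MapperOutput implements Writable {
    private long transactionCount;
    // The Hsynopsis payload is elided here; its fields would have to be
    // written and read back in the same order as well.

    public MapperOutput() { } // Hadoop needs a no-arg constructor for deserialization

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeLong(transactionCount);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        transactionCount = in.readLong();
    }
}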

Part of the mapper function is shown below:

public void map(LongWritable arg0, Text arg1,
        Context context)
        throws IOException 
{
    try
    {
        String tran = null;
        String ip = arg1.toString();
        System.out.println(ip);
        BufferedReader br = new BufferedReader(new StringReader(ip));
        Hsynopsis bdelta = null;
        Hsynopsis b = null, bnew = null;

        hashEntries = (int) Math.floor(calculateHashEntries()); //Hash table size
        System.out.println("Hash entries: "+hashEntries);

        //Initialize the main hash table and delta hashtable
        hashTable = new ArrayList<>(hashEntries);
        for(int i = 0; i < hashEntries; i++)
        {
            hashTable.add(i, null);
        }

        deltahashTable = new ArrayList<>(hashEntries);  
        for(int i = 0; i < hashEntries; i++)
        {
            deltahashTable.add(i, null);
        }

        while((tran = br.readLine())!=null)
        {
            createBinaryRep(tran);
            for(int i = 0; i < deltahashTable.size(); i++)
            {
                bdelta = deltahashTable.get(i);
                if(bdelta != null)
                {
                    if(bdelta.NLast_Access >= (alpha * transactionCount))
                    {
                        //Transmit bdelta to the coordinator
                        MapperOutput mp = new MapperOutput(transactionCount, bdelta);
                        context.write(new LongWritable(i), mp);

                        //Merge bdelta into b
                        b = hashTable.get(i);
                        bnew = merge(b,bdelta);
                        hashTable.set(i, bnew);

                        //Release bdelta
                        deltahashTable.set(i, null);
                    }
                }
            }
        }
    }
    catch(Exception e)
    {
        e.printStackTrace();
    }       
}
public void reduce(LongWritable index, Iterator<MapperOutput> mpValues, Context context)
{
    while(mpValues.hasNext())
    {
        /*Some code here */
    }

    context.write(index, mp);
}
The map task gets stuck at 50% and does not proceed.

When I run the map function standalone (not in Hadoop), I don't have any infinite-loop problems.

Can anyone help me with this?

Edit 1: My input files are on the order of KB in size. Could this cause a problem in distributing data to the mappers?

Edit 2: As suggested in the answer, I changed the Iterator to an Iterable. The map still gets stuck, now at 100%, and restarts after a while.

I can see the following in the JobTracker logs:

2015-04-29 13:26:28,026 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201504291300_0003_m_000000_0: Task attempt_201504291300_0003_m_000000_0 failed to report status for 600 seconds. Killing!
2015-04-29 13:26:28,026 INFO org.apache.hadoop.mapred.JobTracker: Removing task 'attempt_201504291300_0003_m_000000_0'
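
This log means the attempt ran for longer than mapred.task.timeout (600 seconds by default) without writing output, updating a counter, or otherwise signalling liveness, so the JobTracker killed it. The general pattern for keeping a long-running map task alive is to report progress from inside the loop; a minimal sketch (class name is hypothetical, and this is the general pattern rather than necessarily the asker's eventual fix):

import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class ProgressReportingMapper
        extends Mapper<LongWritable, Text, LongWritable, MapperOutput> {

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        BufferedReader br = new BufferedReader(new StringReader(value.toString()));
        String tran;
        while ((tran = br.readLine()) != null) {
            // ... per-transaction work elided ...
            context.progress(); // heartbeat: resets the task-timeout clock
        }
    }
}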

You are incorrectly using an Iterator in your reduce function instead of an Iterable.

With the new MapReduce API you need to use an Iterable, because the reduce(Object, Iterable, org.apache.hadoop.mapreduce.Reducer.Context) method is called once for each key in the sorted input.
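
For reference, a sketch of the new-API signature (the class name and loop body are placeholders, since the asker's reduce logic is elided in the question):

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.mapreduce.Reducer;

public class SynopsisReducer
        extends Reducer<LongWritable, MapperOutput, LongWritable, MapperOutput> {

    @Override
    protected void reduce(LongWritable index, Iterable<MapperOutput> mpValues, Context context)
            throws IOException, InterruptedException {
        for (MapperOutput mp : mpValues) {
            // ... aggregation logic elided in the question ...
            context.write(index, mp);
        }
    }
}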

This sometimes happens when the code is stuck in an infinite loop; try checking for that.

Test your code with MRUnit test cases; I think that will help you find the bug. If you can update the question with the complete map method, we can take a look.

Just to clarify, does the MapperOutput class implement the Writable interface?

@VenkataKarthik Yes, it does. I mentioned that in the question.

@VenkataKarthik I implemented the Writable interface, but I did not override the methods public void readFields(DataInput arg0) and public void write(DataOutput arg0). Should I have done that? Could that cause the problem?

If you don't override those methods, how will serialization and deserialization happen?

I changed it to Iterable, but the map is still stuck at 100%. I have updated the question with the JobTracker logs; please take a look.

Can you try increasing the timeout parameter in mapred-site.xml: mapred.task.timeout = 1800000?

I changed the timeout to 1200000 (1200 seconds), but I still got this log: Task attempt_201504291345_0001_m_000000_0 failed to report status for 1200 seconds. Do I need to reformat the namenode after changing a configuration value?

Did you restart the JobTracker and the TaskTracker?
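
Following up on the timeout discussion: the value is in milliseconds, and mapred.task.timeout can also be set in the job configuration from the driver instead of editing mapred-site.xml, which sidesteps the question of restarting daemons for a per-job setting. A sketch (the driver class and job name are hypothetical):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class SynopsisDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.setLong("mapred.task.timeout", 1800000L); // 1,800,000 ms = 30 minutes
        Job job = new Job(conf, "synopsis job"); // hypothetical job name
        // ... set mapper/reducer classes and input/output paths as in the question ...
    }
}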