Java Hadoop single-node silent freeze

java hadoop mapreduce


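I have a MapReduce tool that freezes on the first mapper with no obvious output. Because this is a single-node installation, I have no job tracker web interface to debug with. I get this behavior regardless of the input file size. I have spent a whole day on this problem and am about ready to pull my hair out. The output looks like this: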

13/09/12 15:12:14 INFO util.NativeCodeLoader: Loaded the native-hadoop library
13/09/12 15:12:14 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
13/09/12 15:12:14 INFO input.FileInputFormat: Total input paths to process : 1
13/09/12 15:12:14 INFO mapred.JobClient: Running job: job_local1132137425_0001
13/09/12 15:12:14 INFO mapred.LocalJobRunner: Waiting for map tasks
13/09/12 15:12:14 INFO mapred.LocalJobRunner: Starting task: attempt_local1132137425_0001_m_000000_0
13/09/12 15:12:14 INFO util.ProcessTree: setsid exited with exit code 0
13/09/12 15:12:14 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@339c98d3
13/09/12 15:12:14 INFO mapred.MapTask: Processing split: file:/home/axelmagn/EclipseWorkspace/AxelMagnusonCoursework/assign-2/data/in/input.csv:0+33554432
13/09/12 15:12:14 WARN snappy.LoadSnappy: Snappy native library not loaded
13/09/12 15:12:14 INFO mapred.MapTask: io.sort.mb = 100
13/09/12 15:12:14 INFO mapred.MapTask: data buffer = 79691776/99614720
13/09/12 15:12:14 INFO mapred.MapTask: record buffer = 262144/327680
13/09/12 15:12:15 INFO mapred.JobClient:  map 0% reduce 0%
13/09/12 15:12:15 INFO mapred.MapTask: Starting flush of map output
13/09/12 15:12:15 INFO mapred.MapTask: Starting flush of map output
13/09/12 15:12:20 INFO mapred.LocalJobRunner: 
13/09/12 15:12:21 INFO mapred.JobClient:  map 20% reduce 0%

Then it hangs indefinitely.

The tool's code (abridged):

Job:

public class VisitorCountJob extends Job {

    public static final String TAB = "\t";

    public VisitorCountJob(Path inputPath, Path outputPath)
            throws IOException {
        super();
        this.setJarByClass(VisitorCountJob.class);
        this.setJobName("Visitor Count");

        this.setInputFormatClass(VisitInputFormat.class);

        VisitInputFormat.setInputPaths(this, inputPath);
        FileOutputFormat.setOutputPath(this, outputPath);

        this.setMapperClass(VisitorCountMapper.class);
        this.setReducerClass(VisitorCountReducer.class);

        this.setOutputKeyClass(Person.class);
        this.setOutputValueClass(IntWritable.class);

        this.setOutputFormatClass(SequenceFileOutputFormat.class);
    }

}

Mapper:

public class VisitorCountMapper extends
        Mapper<LongWritable, Visit, Person, IntWritable> {

    @Override
    public void map(LongWritable key, Visit value, Context context)
            throws IOException, InterruptedException {

        try {
            Person visitor = value.getVisitor();
            context.write(visitor, new IntWritable(1));
        } catch (IOException e) {
            e.printStackTrace();
            throw e;
        } catch (InterruptedException e) {
            e.printStackTrace();
            throw e;
        }
    }
}


Reducer:

public class VisitorCountReducer extends
        Reducer<Person, IntWritable, Person, IntWritable> {

    @Override
    public void reduce(Person visitor, Iterable<IntWritable> values,
            Context context) throws IOException, InterruptedException {

        int count = 0;
        for (IntWritable value : values) {
            count += value.get();
        }
        context.write(visitor, new IntWritable(count));
    }

}


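I have also written an InputFormat and a RecordReader that build Visit objects from the raw text, but I'll leave them out for brevity unless someone thinks they are relevant.

I'm really at my wits' end, so any help is greatly appreciated.

Edit: since there was interest, here are some of my data type implementations:

Person: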

public class Person implements WritableComparable<Person>  {

    public Text firstName;
    public Text lastName;

    public Person() {}

    public Person(Text firstName, Text lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }

    public Person(String firstName, String lastName) {
        this(new Text(firstName), new Text(lastName));
    }

    public void readFields(DataInput in) throws IOException {
        firstName.readFields(in);
        lastName.readFields(in);

    }

    public void write(DataOutput out) throws IOException {
        firstName.write(out);
        lastName.write(out);
    }

    public int compareTo(Person other) {

        int out;

        // give sorting preference to first name
        out = firstName.compareTo(other.firstName);
        if(out != 0)
            return out;
        return lastName.compareTo(other.lastName);
    }

}
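One thing that may be worth checking in the Person class above, offered as a possibility rather than a confirmed diagnosis: Hadoop creates Writable keys through their no-argument constructor and then calls readFields on the fresh instance, and the empty Person() constructor leaves firstName and lastName null at that point. The sketch below reproduces that pattern with plain java.io only. Field, NullFieldKey and InitializedKey are hypothetical stand-ins (Field plays the role of Text), not Hadoop classes, and whether this explains the hang has not been verified.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class WritableFieldSketch {

    /** Hypothetical stand-in for Text: a mutable holder that deserializes into itself. */
    static class Field {
        private String value = "";
        Field() {}
        Field(String value) { this.value = value; }
        void write(DataOutput out) throws IOException { out.writeUTF(value); }
        void readFields(DataInput in) throws IOException { value = in.readUTF(); }
        String get() { return value; }
    }

    /** Same shape as Person: the no-arg constructor leaves both fields null. */
    static class NullFieldKey {
        Field first;
        Field last;
        NullFieldKey() {}
        NullFieldKey(String first, String last) {
            this.first = new Field(first);
            this.last = new Field(last);
        }
        void write(DataOutput out) throws IOException {
            first.write(out);
            last.write(out);
        }
        void readFields(DataInput in) throws IOException {
            first.readFields(in); // NullPointerException when first is null
            last.readFields(in);
        }
    }

    /** Variant whose fields are never null, so readFields is safe on a fresh instance. */
    static class InitializedKey {
        final Field first = new Field();
        final Field last = new Field();
        void readFields(DataInput in) throws IOException {
            first.readFields(in);
            last.readFields(in);
        }
    }

    /** Serializes one sample key into a byte array. */
    static byte[] sampleBytes() {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            new NullFieldKey("Ada", "Lovelace").write(new DataOutputStream(buffer));
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Mimics deserialization: create via the no-arg constructor, then call readFields. */
    public static boolean nullFieldsFail() {
        try {
            new NullFieldKey().readFields(
                    new DataInputStream(new ByteArrayInputStream(sampleBytes())));
            return false;
        } catch (NullPointerException e) {
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Same steps with pre-initialized fields; returns the values that were read back. */
    public static String initializedRoundTrip() {
        try {
            InitializedKey key = new InitializedKey();
            key.readFields(new DataInputStream(new ByteArrayInputStream(sampleBytes())));
            return key.first.get() + " " + key.last.get();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("null fields fail: " + nullFieldsFail());
        System.out.println("initialized fields read: " + initializedRoundTrip());
    }
}
```

If this turns out to apply, the corresponding change in Person would be to create empty Text objects in the no-argument constructor, so that readFields always has an object to read into.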


VisitInputFormat:

public class VisitInputFormat extends FileInputFormat<LongWritable, Visit> {

    public RecordReader<LongWritable, Visit> createRecordReader(
            InputSplit split, TaskAttemptContext context) 
            throws IOException, InterruptedException {

        VisitRecordReader reader = new VisitRecordReader();
        reader.initialize(split, context);
        return reader;
    }
}


VisitRecordReader:

public class VisitRecordReader extends RecordReader<LongWritable, Visit> {
    private LineRecordReader lineReader;
    private LongWritable lineKey;
    private Text lineValue;

    public VisitRecordReader() {
        lineReader = new LineRecordReader();
    }

    public void initialize(InputSplit genericSplit, TaskAttemptContext context)
            throws IOException {
        lineReader.initialize(genericSplit, context);
    }

    public boolean nextKeyValue() throws IOException {
        return lineReader.nextKeyValue();
    }

    public LongWritable getCurrentKey() {
        return lineReader.getCurrentKey();
    }

    public Visit getCurrentValue() {
        String raw = lineReader.getCurrentValue().toString();
        return new Visit(raw);
    }

    public float getProgress() throws IOException {
        return lineReader.getProgress();
    }

    public void close() throws IOException {
        lineReader.close();
    }

}



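How is Person implemented? Your input format and record reader would also be interesting.

I've updated the post with this information.

Why do you need your own input format? Just use a plain TextInputFormat and create your Visit objects in the map method. In your case, though, that should not be the problem. Can you run a profiler or debugger to see where it hangs? Usually this is a GC problem, so you should see heavy CPU usage or GC activity.

Mainly because I'm unfamiliar with the toolset and wanted to try writing my own InputFormat. I haven't quite mastered the art of Hadoop profiling yet, but a quick look at top shows CPU and memory usage both under 15%. The problem also persists if I instantiate Visit in the mapper and drop the InputFormat.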
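For the "see where it hangs" suggestion, a minimal sketch using only the JDK's java.lang.management API might look like the following. It relies on the fact that the log shows a job_local ID, meaning LocalJobRunner runs the map task inside the same JVM as the driver, so a thread dump taken from that JVM should include the task thread. The class name HangDiagnostics, the method names and the watchdog delay are all assumptions made for illustration; running the JDK's jstack tool against the process ID would give similar information without any code changes.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class HangDiagnostics {

    /** Returns the name, state and stack trace of every live thread in this JVM. */
    public static String dumpThreads() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        StringBuilder sb = new StringBuilder();
        for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
            sb.append('"').append(info.getThreadName()).append("\" ")
              .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }

    /** Sums collection counts and times over all collectors, skipping undefined (-1) values. */
    public static String gcSummary() {
        long count = 0;
        long millis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            if (gc.getCollectionCount() > 0) {
                count += gc.getCollectionCount();
            }
            if (gc.getCollectionTime() > 0) {
                millis += gc.getCollectionTime();
            }
        }
        return "gc collections=" + count + " timeMs=" + millis;
    }

    /**
     * Starts a daemon thread that prints GC totals and a thread dump to stderr
     * once delayMillis has elapsed. Interrupting the thread cancels the dump.
     */
    public static Thread startWatchdog(long delayMillis) {
        Thread watchdog = new Thread(() -> {
            try {
                Thread.sleep(delayMillis);
            } catch (InterruptedException e) {
                return; // cancelled before the delay elapsed
            }
            System.err.println(gcSummary());
            System.err.println(dumpThreads());
        }, "hang-watchdog");
        watchdog.setDaemon(true);
        watchdog.start();
        return watchdog;
    }

    public static void main(String[] args) {
        System.out.println(gcSummary());
        System.out.println(dumpThreads());
    }
}
```

In a driver, one might call HangDiagnostics.startWatchdog(60000) just before submitting the job and interrupt the returned thread once the job completes. Low GC totals together with a task thread parked in a wait or a blocking read would point away from the GC theory, which would match the low CPU usage seen in top.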
