Java: Hadoop MapReduce chain with ArrayWritable


I'm trying to create a MapReduce chain composed of two steps. The first reducer emits key-value pairs as (key, value), where value is a list of custom objects, and the second mapper should read the first reducer's output. The list is a custom ArrayWritable. Here is the relevant code:

The custom object:

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;

public class Custom implements Writable {
    private Text document;
    private IntWritable count;

    public Custom() {
        // Hadoop instantiates Writables reflectively, so a no-arg
        // constructor that initializes both fields is required
        setDocument("");
        setCount(0);
    }

    public Custom(String document, int count) {
        setDocument(document);
        setCount(count);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        document.readFields(in);
        count.readFields(in);
    }

    @Override
    public void write(DataOutput out) throws IOException {
        document.write(out);
        count.write(out);
    }

    @Override
    public String toString() {
        return this.document.toString() + "\t" + this.count.toString();
    }

    public int getCount() {
        return count.get();
    }

    public void setCount(int count) {
        this.count = new IntWritable(count);
    }

    public String getDocument() {
        return document.toString();
    }

    public void setDocument(String document) {
        this.document = new Text(document);
    }
}
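A Writable like this must round-trip through its own write/readFields pair. As a quick sanity check, here is a minimal, illustrative sketch (this harness is not from the original post):

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class CustomRoundTrip {
    public static void main(String[] args) throws IOException {
        Custom original = new Custom("doc1", 42);

        // Serialize with write()
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        original.write(new DataOutputStream(bytes));

        // Deserialize into a fresh instance with readFields()
        Custom copy = new Custom();
        copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));

        // Both lines should print "doc1	42"
        System.out.println(original);
        System.out.println(copy);
    }
}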
The custom ArrayWritable:

import java.util.Arrays;

import org.apache.hadoop.io.ArrayWritable;
import org.apache.hadoop.io.Writable;

class MyArrayWritable extends ArrayWritable {
    public MyArrayWritable(Writable[] values) {
        super(Custom.class, values);
    }

    public MyArrayWritable() {
        super(Custom.class);
    }

    @Override
    public Custom[] get() {
        // After deserialization the backing array is a Writable[], so a plain
        // (Custom[]) cast would fail; copy into a correctly typed array instead
        Writable[] raw = super.get();
        return Arrays.copyOf(raw, raw.length, Custom[].class);
    }

    @Override
    public String toString() {
        return Arrays.toString(get());
    }
}
The first reducer:

public static class NGramReducer extends Reducer<Text, Text, Text, MyArrayWritable> {
    @Override
    public void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        //other code (a hypothetical expansion follows below)
        context.write(key, mArrayWritable);
    }
}
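The elided part presumably assembles mArrayWritable from the incoming values. A hypothetical sketch, purely for illustration (the tab-separated parsing of each Text value is an assumption, not from the original post):

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.mapreduce.Reducer;

public static class NGramReducer extends Reducer<Text, Text, Text, MyArrayWritable> {
    @Override
    public void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        List<Custom> customs = new ArrayList<>();
        for (Text value : values) {
            // Assumption: each incoming value is formatted as "document<TAB>count"
            String[] parts = value.toString().split("\t");
            customs.add(new Custom(parts[0], Integer.parseInt(parts[1])));
        }
        // Wrap the collected objects in the custom ArrayWritable and emit them
        MyArrayWritable mArrayWritable =
                new MyArrayWritable(customs.toArray(new Writable[0]));
        context.write(key, mArrayWritable);
    }
}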
When I run it I get this error:

Error: java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to Detector$MyArrayWritable


What is the problem? Do I have to write a custom FileInputFormat? (Job 1 works fine.)

This looks like it is because your Job 2 InputFormat is KeyValueTextInputFormat.class, which expects both the key and the value to be Text objects. Since Job 1 outputs (Text, MyArrayWritable), there is a conflict with the value type.

Luckily you don't have to write a custom OutputFormat to accommodate your data! Simply write Job 1's output to sequence files, which keep the data in its binary form:

//...
job1.setOutputKeyClass(Text.class);
job1.setOutputValueClass(MyArrayWritable.class);
job1.setInputFormatClass(WholeFileInputFormat.class);
// Write the (Text, MyArrayWritable) pairs in binary form instead of plain text
job1.setOutputFormatClass(SequenceFileOutputFormat.class);

FileInputFormat.addInputPath(job1, new Path(args[2]));
SequenceFileOutputFormat.setOutputPath(job1, TEMP_PATH);
//...
// Job 2 reads the same binary pairs back, so no Text conversion happens
job2.setInputFormatClass(SequenceFileInputFormat.class);
SequenceFileInputFormat.addInputPath(job2, TEMP_PATH);
FileOutputFormat.setOutputPath(job2, new Path(args[3]));
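With sequence files in place, the Job 2 mapper receives the MyArrayWritable value directly, already deserialized. A minimal sketch of what such a mapper could look like (the class name and the per-record logic are assumptions, not from the original post):

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class SecondMapper extends Mapper<Text, MyArrayWritable, Text, IntWritable> {
    @Override
    protected void map(Text key, MyArrayWritable value, Context context)
            throws IOException, InterruptedException {
        // value.get() returns the Custom objects emitted by the first reducer
        for (Custom custom : value.get()) {
            context.write(new Text(custom.getDocument()),
                          new IntWritable(custom.getCount()));
        }
    }
}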

I modified the input and output, but now I get this error: 'Error: java.lang.RuntimeException: java.lang.NoSuchMethodException: Detector$MyArrayWritable.<init>()'.

A linked question seems to answer this: the exception means Hadoop could not instantiate MyArrayWritable through reflection. The mangled name Detector$MyArrayWritable shows it is a nested class, and a non-static inner class has no genuine no-arg constructor (every constructor takes a hidden reference to the enclosing Detector instance). Declaring it as a public static nested class, or moving it to a top-level file, fixes this.
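A minimal sketch of the corrected declaration (assuming the class stays nested inside Detector):

import org.apache.hadoop.io.ArrayWritable;
import org.apache.hadoop.io.Writable;

public class Detector {
    // Declared public static so Hadoop's reflection-based instantiation can
    // call the no-arg constructor without an enclosing Detector instance
    public static class MyArrayWritable extends ArrayWritable {
        public MyArrayWritable() {          // required by reflection
            super(Custom.class);
        }

        public MyArrayWritable(Writable[] values) {
            super(Custom.class, values);
        }
    }
}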
For reference, this was the original job configuration, where Job 2 read its input with KeyValueTextInputFormat:

//...
job1.setOutputKeyClass(Text.class);
job1.setOutputValueClass(MyArrayWritable.class);
job1.setInputFormatClass(WholeFileInputFormat.class);
FileInputFormat.addInputPath(job1, new Path(args[2]));
FileOutputFormat.setOutputPath(job1, TEMP_PATH);
//...
job2.setInputFormatClass(KeyValueTextInputFormat.class);
FileInputFormat.addInputPath(job2, TEMP_PATH);
FileOutputFormat.setOutputPath(job2, new Path(args[3]));