Java: Getting an exception in the WordCount program on Hadoop


java hadoop mapreduce


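I got this exception while running my first program on Hadoop. (I am using the new Hadoop API on version 0.20.2.) I searched the web, and it looks like most people hit this problem when they have not set the MapperClass and ReducerClass in the job configuration. But I checked, and my code looks fine. I would really appreciate it if someone could help me.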

java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.Text, received org.apache.hadoop.io.LongWritable
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:871)


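Your Map() method does not override Mapper's map() method, because you used an uppercase M instead of a lowercase m. As a result, the default identity map method is used, which emits the same key/value pair it received as input. Because your mapper is declared as extends Mapper<LongWritable, Text, Text, IntWritable>, the job expects Text, IntWritable as the map output, but the identity method emits LongWritable, Text, which causes the exception.

Changing your Map() method to map() and adding the @Override annotation should do the trick. If you are using an IDE, I strongly recommend using its built-in override-method feature to avoid mistakes like this.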
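The mechanism described above can be reproduced without Hadoop. The following is only a minimal sketch: BaseMapper is a hypothetical stand-in for Hadoop's Mapper (it is not the real class), and the output list is a hypothetical stand-in for Context. It shows that a method named Map() is never called by the framework, so the inherited identity map() runs instead, and that @Override on a correctly named map() fixes it.

```java
import java.util.ArrayList;
import java.util.List;

public class OverridePitfallDemo {

    // Hypothetical stand-in for Hadoop's Mapper: the default map() is an
    // identity function that emits its input pair unchanged.
    static class BaseMapper {
        protected void map(Object key, Object value, List<Object[]> out) {
            out.add(new Object[] {key, value});
        }

        // The "framework" side: it always calls map() with a lowercase m.
        final void run(Object key, Object value, List<Object[]> out) {
            map(key, value, out);
        }
    }

    // Capital M: this declares a new, unrelated method, so nothing is overridden.
    // Adding @Override here would turn the mistake into a compile error.
    static class BrokenMapper extends BaseMapper {
        public void Map(Object key, Object value, List<Object[]> out) {
            for (String word : value.toString().split("\\W+")) {
                if (word.length() > 0) {
                    out.add(new Object[] {word, 1});
                }
            }
        }
    }

    // Lowercase m plus @Override: the compiler confirms this replaces map().
    static class FixedMapper extends BaseMapper {
        @Override
        public void map(Object key, Object value, List<Object[]> out) {
            for (String word : value.toString().split("\\W+")) {
                if (word.length() > 0) {
                    out.add(new Object[] {word, 1});
                }
            }
        }
    }

    // Feeds one (offset, line) record to the mapper, as a text input format would.
    static List<Object[]> runMapper(BaseMapper mapper) {
        List<Object[]> out = new ArrayList<>();
        mapper.run(0L, "hello hadoop", out);
        return out;
    }

    public static void main(String[] args) {
        for (Object[] pair : runMapper(new BrokenMapper())) {
            System.out.println("broken key type: " + pair[0].getClass().getSimpleName());
        }
        for (Object[] pair : runMapper(new FixedMapper())) {
            System.out.println("fixed key type: " + pair[0].getClass().getSimpleName());
        }
    }
}
```

In this sketch the broken mapper emits a single pair whose key is still the Long offset, which mirrors the LongWritable key that Hadoop rejected, while the fixed mapper emits one String-keyed pair per word.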
Just change
public void Map(LongWritable key, Text value, Context ctx)
to
public void map(LongWritable key, Text value, Context ctx)
It worked for me.
Hadoop version: Hadoop 1.0.3
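Have you tried adding an @Override annotation? Your map() method has a capital M, which could cause the default map() to be used instead of your version.

@QuetzalCatl's comment describes the problem you are having: the default map method is an identity function that outputs the same input key/value pair. Change your map method name to lowercase and add an @Override annotation to the method.

Thanks, everyone... it was a silly mistake... somehow I failed to catch it... after fixing the name of the map method, it works fine now.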
The code from the question (WordCountMapper, WordCountReducer and WordCountJob):

package com.test.wc;
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

    public void Map(LongWritable key, Text value, Context ctx) throws IOException, InterruptedException {
        String line = value.toString();
        for (String word : line.split("\\W+")) {
            if (word.length() > 0) {
                ctx.write(new Text(word), new IntWritable(1));
            }
        }
    }
}


package com.test.wc;
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;
public class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

    public void reduce(Text key, Iterable<IntWritable> values, Context ctx) throws IOException, InterruptedException {
        int wordCount = 0;
        for (IntWritable value : values) {
            wordCount += value.get();
        }
        ctx.write(key, new IntWritable(wordCount));
    }
}


package com.test.wc;
import java.io.IOException;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
public class WordCountJob {

    public static void main(String[] args) throws IOException, InterruptedException, ClassNotFoundException {
        if (args.length != 2) {
            System.out.println("invalid usage");
            System.exit(-1);
        }

        Job job = new Job();
        job.setJarByClass(WordCountJob.class);
        job.setJobName("WordCountJob");

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        job.setMapperClass(WordCountMapper.class);
        job.setReducerClass(WordCountReducer.class);

        //job.setCombinerClass(WordCountReducer.class);

        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}