Word Count in Java MapReduce


I have seen threads addressing this error, but somehow the solutions did not help me. I am running the code below:

public class WordCount
{
    public static class MyMapper extends Mapper<LongWritable,Text,Text,IntWritable>
    {
        final static IntWritable one=new IntWritable(1);
        Text word=new Text();

        public void map(LongWritable key,Text Value,Context context) throws IOException,InterruptedException
        {
            String line=Value.toString();
            StringTokenizer itr=new StringTokenizer(line);
            while (itr.hasMoreTokens())
            {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }

    }

    public static class MyReducer extends Reducer<Text,IntWritable,Text,IntWritable>{

        public void reduce(Text Key,Iterable<IntWritable> Values,Context context) throws IOException,InterruptedException
        {
            int sum=0;
            for (IntWritable value:Values)
            {
                sum=sum+value.get();
            }
            context.write(Key, new IntWritable(sum));}



        public static void main(String[] args) throws IOException, InterruptedException, ClassNotFoundException
        {

            //Job job=new Job();
            Configuration conf=new Configuration();
            Job job=new Job(conf,"Word Count");

            job.setJarByClass(WordCount.class);
            job.setMapperClass(MyMapper.class);
            job.setReducerClass(MyReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job,new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }
}
It has the main method in the proper format. When I try to run it, it still gives me the error below:

 **hadoop jar MyTesting-0.0.1-SNAPSHOT.jar MyPackage.WordCount <Input> <Output>**
Exception in thread "main" java.lang.NoSuchMethodException: MyPackage.WordCount.main([Ljava.lang.String;)
        at java.lang.Class.getMethod(Class.java:1670)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:215)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:136)

How did you build the jar? Is the package name exactly `MyPackage`? — I used Maven to build the jar. The package name is correct; that was just a typo here.

Your `main()` method is inside the `MyReducer` class instead of the root `WordCount` class. — Ohh yes.. thanks a lot.. :) But I have one more doubt: when I run the above program, it gives me 215 output files, many of them empty. I understand this is because of the default hash partitioner I am using, and I could set the number of reducers to limit the number of output files. Instead, though, I want to use a lazy output format to eliminate the blank files. I just added `job.setOutputFormatClass(LazyOutputFormat.class);` to the above program, but it gives me a NullPointerException. Any clues?
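A likely cause of that NullPointerException: `LazyOutputFormat` is a wrapper around a real output format, not a concrete format itself, so it is configured through its static `setOutputFormatClass(job, baseFormat)` helper rather than `job.setOutputFormatClass(...)`. Passing `LazyOutputFormat.class` directly leaves it with no underlying format to delegate to. A minimal sketch of the driver change, assuming the new-API classes under `org.apache.hadoop.mapreduce.lib.output` (the `LazyDriver` class name here is just for illustration):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.LazyOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class LazyDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "Word Count");

        // Instead of: job.setOutputFormatClass(LazyOutputFormat.class);
        // wrap the real output format, so reducers that never call
        // context.write() produce no (empty) part files at all:
        LazyOutputFormat.setOutputFormatClass(job, TextOutputFormat.class);

        // ... mapper/reducer classes, key/value classes, and input/output
        // paths set up as in the question ...
    }
}
```

With this wrapping in place, a part file is only created the first time a record is actually written, so the empty outputs from the unused hash-partitioner buckets should disappear.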