Word Count in Java MapReduce
I have seen threads addressing this error, but somehow the solutions did not help me. I am running the code below:
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount
{
    public static class MyMapper extends Mapper<LongWritable,Text,Text,IntWritable>
    {
        final static IntWritable one = new IntWritable(1);
        Text word = new Text();

        public void map(LongWritable key, Text Value, Context context) throws IOException, InterruptedException
        {
            String line = Value.toString();
            StringTokenizer itr = new StringTokenizer(line);
            while (itr.hasMoreTokens())
            {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    public static class MyReducer extends Reducer<Text,IntWritable,Text,IntWritable> {
        public void reduce(Text Key, Iterable<IntWritable> Values, Context context) throws IOException, InterruptedException
        {
            int sum = 0;
            for (IntWritable value : Values)
            {
                sum = sum + value.get();
            }
            context.write(Key, new IntWritable(sum));
        }

        public static void main(String[] args) throws IOException, InterruptedException, ClassNotFoundException
        {
            //Job job=new Job();
            Configuration conf = new Configuration();
            Job job = new Job(conf, "Word Count");
            job.setJarByClass(WordCount.class);
            job.setMapperClass(MyMapper.class);
            job.setReducerClass(MyReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }
}
It has a main method in the proper format.
When I try to run it, it still gives me the error below:
**hadoop jar MyTesting-0.0.1-SNAPSHOT.jar MyPackage.WordCount <Input> <Output>**
Exception in thread "main" java.lang.NoSuchMethodException: MyPackage.WordCount.main([Ljava.lang.String;)
at java.lang.Class.getMethod(Class.java:1670)
at org.apache.hadoop.util.RunJar.run(RunJar.java:215)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
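For context on what this stack trace means: `hadoop jar` hands the named class to Hadoop's RunJar, which resolves the entry point reflectively via `Class.getMethod("main", String[].class)` and throws `NoSuchMethodException` when the class itself declares no `main`. A minimal, self-contained sketch of that lookup (the names `Outer`/`Inner` are hypothetical stand-ins for `WordCount`/`MyReducer`):

```java
// Demonstrates the reflective main() lookup that RunJar performs.
// If main(String[]) lives in a nested class, the lookup on the outer
// class fails with NoSuchMethodException.
public class MainLookupDemo {
    public static class Outer {              // stand-in for WordCount (no main here)
        public static class Inner {          // stand-in for MyReducer (main ended up here)
            public static void main(String[] args) { }
        }
    }

    // True if the class declares a public main(String[]) method.
    static boolean hasMain(Class<?> c) {
        try {
            c.getMethod("main", String[].class);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(hasMain(Outer.class));        // false: main is in Inner
        System.out.println(hasMain(Outer.Inner.class));  // true
    }
}
```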
How did you build the jar? Does the package name match "MyPackage" exactly?

I installed Maven to build the jar. The package name is correct; that was just a simple typo.

Your main() method is located in the MyReducer class instead of the root WordCount class.

Ohh yes.. thanks a lot.. :) But I have one more doubt: when I run the above program it gives me 215 output files, many of which are empty. I understand this is because of the default hash partitioner I am using, and I could set the number of reducers to limit the number of output files. Instead, though, I would like to use LazyOutputFormat to eliminate the empty files. I just added job.setOutputFormatClass(LazyOutputFormat.class); to the above program, but it gives me a NullPointerException. Any clue?
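On the LazyOutputFormat NullPointerException: LazyOutputFormat is a wrapper, so it is not meant to be passed to job.setOutputFormatClass() directly; it needs a real output format registered through its own static helper. A hedged sketch of the driver change, assuming the new `org.apache.hadoop.mapreduce` API used in the code above (this is not a standalone program, just the relevant lines):

```java
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.LazyOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

// Instead of:
//   job.setOutputFormatClass(LazyOutputFormat.class);   // NPE: no wrapped format configured
// register the real format through LazyOutputFormat's helper. It records
// TextOutputFormat in the job configuration and installs LazyOutputFormat
// as the job's output format, so reducers that emit no records create no file:
LazyOutputFormat.setOutputFormatClass(job, TextOutputFormat.class);

// Alternatively, limiting the number of reducers also limits the file count:
//   job.setNumReduceTasks(4);
```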