Word Count in Java MapReduce
I have seen threads addressing this error, but somehow the solutions did not help me. I am running the code below:
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount
{
    public static class MyMapper extends Mapper<LongWritable,Text,Text,IntWritable>
    {
        final static IntWritable one = new IntWritable(1);
        Text word = new Text();

        public void map(LongWritable key, Text Value, Context context) throws IOException, InterruptedException
        {
            String line = Value.toString();
            StringTokenizer itr = new StringTokenizer(line);
            while (itr.hasMoreTokens())
            {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    public static class MyReducer extends Reducer<Text,IntWritable,Text,IntWritable> {
        public void reduce(Text Key, Iterable<IntWritable> Values, Context context) throws IOException, InterruptedException
        {
            int sum = 0;
            for (IntWritable value : Values)
            {
                sum = sum + value.get();
            }
            context.write(Key, new IntWritable(sum));
        }

        public static void main(String[] args) throws IOException, InterruptedException, ClassNotFoundException
        {
            //Job job=new Job();
            Configuration conf = new Configuration();
            Job job = new Job(conf, "Word Count");
            job.setJarByClass(WordCount.class);
            job.setMapperClass(MyMapper.class);
            job.setReducerClass(MyReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }
}
It has a main method in the proper format.
When I try to run it, it still gives me the error below:
**hadoop jar MyTesting-0.0.1-SNAPSHOT.jar MyPackage.WordCount <Input> <Output>**
Exception in thread "main" java.lang.NoSuchMethodException: MyPackage.WordCount.main([Ljava.lang.String;)
at java.lang.Class.getMethod(Class.java:1670)
at org.apache.hadoop.util.RunJar.run(RunJar.java:215)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
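For context on what this stack trace means: `hadoop jar` hands the named class to Hadoop's RunJar, which resolves the entry point reflectively via `Class.getMethod("main", String[].class)` and throws `NoSuchMethodException` when the class itself declares no `main`. A minimal, self-contained sketch of that lookup (the names `Outer`/`Inner` are hypothetical stand-ins for `WordCount`/`MyReducer`):

```java
// Demonstrates the reflective main() lookup that RunJar performs.
// If main(String[]) lives in a nested class, the lookup on the outer
// class fails with NoSuchMethodException.
public class MainLookupDemo {
    public static class Outer {              // stand-in for WordCount (no main here)
        public static class Inner {          // stand-in for MyReducer (main ended up here)
            public static void main(String[] args) { }
        }
    }

    // True if the class declares a public main(String[]) method.
    static boolean hasMain(Class<?> c) {
        try {
            c.getMethod("main", String[].class);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(hasMain(Outer.class));        // false: main is in Inner
        System.out.println(hasMain(Outer.Inner.class));  // true
    }
}
```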
How did you build the jar? Does the package name match "MyPackage" exactly?

I installed Maven to build the jar. The package name is correct; that was just a simple typo.

Your main() method is located in the MyReducer class instead of the root WordCount class.

Ohh yes.. thanks a lot.. :) But I have one more doubt: when I run the above program it gives me 215 output files, many of which are empty. I understand this is because of the default hash partitioner I am using, and I could set the number of reducers to limit the number of output files. Instead, though, I would like to use LazyOutputFormat to eliminate the empty files. I just added job.setOutputFormatClass(LazyOutputFormat.class); to the above program, but it gives me a NullPointerException. Any clue?
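On the LazyOutputFormat NullPointerException: LazyOutputFormat is a wrapper, so it is not meant to be passed to job.setOutputFormatClass() directly; it needs a real output format registered through its own static helper. A hedged sketch of the driver change, assuming the new `org.apache.hadoop.mapreduce` API used in the code above (this is not a standalone program, just the relevant lines):

```java
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.LazyOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

// Instead of:
//   job.setOutputFormatClass(LazyOutputFormat.class);   // NPE: no wrapped format configured
// register the real format through LazyOutputFormat's helper. It records
// TextOutputFormat in the job configuration and installs LazyOutputFormat
// as the job's output format, so reducers that emit no records create no file:
LazyOutputFormat.setOutputFormatClass(job, TextOutputFormat.class);

// Alternatively, limiting the number of reducers also limits the file count:
//   job.setNumReduceTasks(4);
```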