Java Hadoop returns the mapper's output instead of the reducer's
I am writing this Hadoop code, but I cannot figure out why it does not produce the reducer's output and instead outputs exactly the mapper's result. I have played with the code for a long time, testing different outputs, but with no luck.

My custom mapper:
public static class UserMapper extends Mapper<Object, Text, Text, Text> {
    private final static IntWritable one = new IntWritable(1);
    private Text userid = new Text();
    private Text catid = new Text();

    /* map method */
    public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
        StringTokenizer itr = new StringTokenizer(value.toString(), ","); /* separated by "," */
        int count = 0;
        userid.set(itr.nextToken());
        while (itr.hasMoreTokens()) {
            if (++count == 4) {
                // catid.set(itr.nextToken());
                catid.set("This is a test");
                context.write(userid, catid);
            } else {
                itr.nextToken();
            }
        }
    }
}
/* Reducer Class */
public static class UserReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (Text val : values) {
            sum += 1; //val.get();
        }
        result.set(0);
        context.write(key, result);
    }
}
Job job = new Job(conf, "User Popular Categories");
job.setJarByClass(popularCategories.class);
job.setMapperClass(UserMapper.class);
job.setCombinerClass(UserReducer.class);
job.setReducerClass(UserReducer.class);
job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(Text.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);
job.setNumReduceTasks(2);
FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
System.exit(job.waitForCompletion(true) ? 0 : 1);
And the output file, /user/hduser/output/part-r-00000:
0001be1731ee7d1c519bc7e87110c9eb880cb396 This is a test
0001bfa0c494c01f9f8c141c476c11bb4625a746 This is a test
0002bd9c3d654698bb514194c4f4171ad6992266 This is a test
000433e0ef411c2cb8ee1727002d6ba15fe9426b This is a test
00051f5350f4d9f3f4f5ba181b0a66d749b161ee This is a test
00066c85bf96469b905e2fb148095448797b2368 This is a test
0007b1a0334de785b3189b67bb73276d602fb7d4 This is a test
0007d018861d588e99e834fc29ca76a523b20e35 This is a test
000992b67ed22d2707ba65046d523ce66dfcfcb8 This is a test
000ad93a0819e2cbd7f0193e1d1ec481a0241b44 This is a test
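I am surprised, though, that the code above ran for you at all. As with your other question about the mapper, you should be getting an exception here.

The output looks like it comes from the mapper rather than the reducer. Are you sure the file name is
/user/hduser/output/part-r-00000
and not
/user/hduser/output/part-m-00000?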
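The mapper's output types must match the reducer's input types. The mapper is declared as
public static class UserMapper extends Mapper<Object, Text, Text, Text> {
so it emits Text keys and Text values, while the reducer is declared as Reducer<Text, IntWritable, Text, IntWritable>. That means the reducer's input key is Text (correct), but its input value is wrongly set to IntWritable (it should be Text).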
Change the declaration to
public static class UserReducer extends Reducer<Text, Text, Text, IntWritable> {
and set the parameters in the driver program accordingly.
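To see why the job runs without error yet prints the mapper's pairs, here is a small sketch that needs no Hadoop at all. When the class-level type parameters and the method's parameter types disagree, the `reduce` method is only an overload of the inherited one, not an override, so the framework keeps calling the inherited default, which passes every pair through unchanged. `OverrideDemo`, `BaseReducer`, `BrokenReducer`, `FixedReducer`, `runBroken` and `runFixed` are all hypothetical stand-ins invented for this illustration; they are not part of the Hadoop API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class OverrideDemo {

    // Hypothetical stand-in for Hadoop's Reducer (NOT the real API).
    static class BaseReducer<KI, VI> {
        // Default behaviour: pass every (key, value) pair through unchanged.
        protected void reduce(KI key, Iterable<VI> values, List<String> out) {
            for (VI v : values) {
                out.add(key + "\t" + v);
            }
        }

        // Stand-in for the framework, which only knows the base signature.
        @SuppressWarnings("unchecked")
        final void run(KI key, Iterable<?> values, List<String> out) {
            reduce(key, (Iterable<VI>) values, out);
        }
    }

    // Mirrors the question: the class says the values are Integer,
    // but the method takes Iterable<String>. Without @Override this
    // compiles as a separate overload that run() never calls.
    static class BrokenReducer extends BaseReducer<String, Integer> {
        public void reduce(String key, Iterable<String> values, List<String> out) {
            out.add(key + "\t0");
        }
    }

    // Mirrors the fix: the declared value type matches the method, and
    // @Override makes the compiler reject any future mismatch.
    static class FixedReducer extends BaseReducer<String, String> {
        @Override
        protected void reduce(String key, Iterable<String> values, List<String> out) {
            int sum = 0;
            for (String ignored : values) {
                sum += 1;
            }
            out.add(key + "\t" + sum); // emit the count, not a constant
        }
    }

    public static List<String> runBroken(String key, List<String> values) {
        List<String> out = new ArrayList<>();
        new BrokenReducer().run(key, values, out);
        return out;
    }

    public static List<String> runFixed(String key, List<String> values) {
        List<String> out = new ArrayList<>();
        new FixedReducer().run(key, values, out);
        return out;
    }

    public static void main(String[] args) {
        List<String> mapped = Arrays.asList("This is a test", "This is a test");
        System.out.println(runBroken("user1", mapped)); // pairs pass through untouched
        System.out.println(runFixed("user1", mapped));  // one line with the count
    }
}
```

Applied to the real job, the same idea suggests three changes besides the new declaration: annotate `reduce` with `@Override` so the compiler catches a mismatch, write `result.set(sum)` rather than `result.set(0)`, and drop `job.setCombinerClass(UserReducer.class)`, because a combiner has to emit the map output types (Text, Text) while this reducer emits IntWritable values.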