Sorting in MapReduce generates extra values
I am trying to sort a series of integers given in the following form:
A 2
B 9
C 4
....
....
Z 42
Here is the mapper and reducer code:
public static class MapClass extends MapReduceBase implements Mapper<Text, Text, IntWritable, Text>
{
    public void map(Text key, Text value, OutputCollector<IntWritable, Text> output, Reporter reporter) throws IOException
    {
        output.collect(new IntWritable(Integer.parseInt(value.toString())), key);
    }
}

public static class Reduce extends MapReduceBase implements Reducer<IntWritable, Text, IntWritable, Text>
{
    public void reduce(IntWritable key, Iterator<Text> values, OutputCollector<IntWritable, Text> output, Reporter reporter) throws IOException
    {
        output.collect(key, new Text(""));
    }
}
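I tried your logic, but with the new API, and the result is correct. Note: in the new API the second parameter of the reduce(…) function is Iterable, not Iterator as in the old API.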
Output:
2
4
9
42
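The note about Iterable matters because a reduce method declared with the wrong parameter type is an overload, not an override. The sketch below illustrates this with plain Java. BaseReducer, IteratorReducer, IterableReducer and SignatureDemo are hypothetical stand-ins written for this illustration, not Hadoop classes. It assumes a base class whose default reduce passes every value through unchanged, which is how the new-API Reducer behaves by default. Whether this is what produced the extra values in the question is an assumption; the question's own code uses the old API, where Iterator is the expected type.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a new-API style reducer base class (not Hadoop's class).
class BaseReducer {
    final List<String> output = new ArrayList<>();

    // Default behaviour: identity, one output record per incoming value.
    void reduce(int key, Iterable<String> values) {
        for (String v : values) {
            output.add(key + "\t" + v);
        }
    }
}

// Declares reduce(..., Iterator): an overload, so the framework-style call
// in SignatureDemo.run never reaches it and the identity default runs instead.
class IteratorReducer extends BaseReducer {
    void reduce(int key, Iterator<String> values) {
        output.add(key + "\t");
    }
}

// Declares reduce(..., Iterable): this really overrides the default.
class IterableReducer extends BaseReducer {
    @Override
    void reduce(int key, Iterable<String> values) {
        output.add(key + "\t");
    }
}

class SignatureDemo {
    // Mimics a framework that always calls reduce with an Iterable.
    static List<String> run(BaseReducer reducer) {
        reducer.reduce(2, Arrays.asList("A", "D"));
        return reducer.output;
    }

    public static void main(String[] args) {
        System.out.println(run(new IteratorReducer())); // one record per value
        System.out.println(run(new IterableReducer())); // a single record for the key
    }
}
```

Adding @Override to reduce, as the answer's code does, turns this kind of mismatch into a compile error instead of a silent change in output.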
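Can you share your full code? We need to know which input format you are using and what your logic is. Are there no duplicate fields in your data? How many reducers did you use? Can you also say how large the data is?

It is a very small data set, at most 20 integers spread across 4 files (5 integers per file). I just wanted to test the program. I have edited the source code to include the input format. Please check.

Based on the code and data you shared, it looks fine to me. Check whether there are any extra files in the input path, or any loop in your code.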
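The last suggestion, checking the input path for extra files, can be done by listing the directory before running the job. This is a minimal sketch that assumes the input lives on the local filesystem; InputDirCheck, listInputFiles and the default directory name "input" are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class InputDirCheck {
    // Returns the names of all regular files directly inside dir, sorted.
    static List<String> listInputFiles(Path dir) throws IOException {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries
                    .filter(Files::isRegularFile)
                    .map(p -> p.getFileName().toString())
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // "input" is a hypothetical local directory; pass the real one as an argument.
        Path dir = Paths.get(args.length > 0 ? args[0] : "input");
        listInputFiles(dir).forEach(System.out::println);
    }
}
```

Any name in the listing that is not one of the expected data files (for example an editor backup copy) is worth inspecting, since its records may also be read by the job.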
package stackoverflow;

import java.io.IOException;

import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.KeyValueTextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class q18076708 extends Configured implements Tool {

    // Swaps each (name, number) pair so the number becomes the sort key.
    static class MapClass extends Mapper<Text, Text, IntWritable, Text> {
        @Override
        public void map(Text key, Text value, Context context)
                throws IOException, InterruptedException {
            context.write(new IntWritable(Integer.parseInt(value.toString())),
                    key);
        }
    }

    // Emits each key once and discards the grouped values.
    static class Reduce extends Reducer<IntWritable, Text, IntWritable, Text> {
        @Override
        public void reduce(IntWritable key, Iterable<Text> values,
                Context context) throws IOException, InterruptedException {
            context.write(key, new Text(""));
        }
    }

    public int run(String[] args) throws Exception {
        // Run locally against the local filesystem.
        getConf().set("fs.default.name", "file:///");
        getConf().set("mapred.job.tracker", "local");

        Job job = new Job(getConf(), "Logging job");
        job.setJarByClass(getClass());

        FileInputFormat.addInputPath(job, new Path("src/test/resources/testinput.txt"));
        // Remove any previous output so the job can be rerun.
        FileSystem.get(getConf()).delete(new Path("target/out"), true);
        FileOutputFormat.setOutputPath(job, new Path("target/out"));

        job.setMapperClass(MapClass.class);
        job.setMapOutputKeyClass(IntWritable.class);
        job.setMapOutputValueClass(Text.class);

        job.setCombinerClass(Reduce.class);
        job.setReducerClass(Reduce.class);

        job.setInputFormatClass(KeyValueTextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        job.setOutputKeyClass(IntWritable.class);
        job.setOutputValueClass(Text.class);

        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        int exitCode = ToolRunner.run(new q18076708(), args);
        System.exit(exitCode);
    }
}
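The job's logic can be checked without a cluster by simulating its three stages in plain Java. The sketch below is an illustration under stated assumptions, not part of the answer's code: LocalSortSketch and sortKeys are hypothetical names, the input lines are assumed to be tab-separated (the default separator of KeyValueTextInputFormat), and a TreeMap stands in for the shuffle, which hands keys to the reducer in sorted order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class LocalSortSketch {
    static List<Integer> sortKeys(List<String> lines) {
        // Map step: parse "name<TAB>number" and group names by number.
        // The TreeMap keeps the numbers in ascending order, like the shuffle.
        Map<Integer, List<String>> grouped = new TreeMap<>();
        for (String line : lines) {
            String[] parts = line.split("\t", 2);
            int number = Integer.parseInt(parts[1].trim());
            grouped.computeIfAbsent(number, k -> new ArrayList<>()).add(parts[0]);
        }
        // Reduce step: emit each key once and ignore the grouped values.
        return new ArrayList<>(grouped.keySet());
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("A\t2", "B\t9", "C\t4", "Z\t42");
        System.out.println(sortKeys(input)); // prints [2, 4, 9, 42]
    }
}
```

Because the reduce step writes one record per key, a number that appears several times in the input shows up only once in the output of this sketch.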
Test input:
A 2
B 9
C 4
Z 42