How can I access the values (or keys) output by the reducers in the Hadoop main program?

hadoop mapreduce

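Suppose each reducer outputs a single integer as its value (or key). Is there a way to access these values (or keys) in the Hadoop main program, for example to sum them up?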

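What is your output format? If you are using SequenceFileOutputFormat, you can open the part-r-xxxxx files from the main program with the SequenceFile.Reader class once the job has completed. For example, for a job whose output keys are Text and whose values are IntWritable, the values can be summed as follows: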

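// Runs in the driver after the job has finished; getConf() assumes the driver extends Configured.
FileSystem fs = FileSystem.get(getConf());
Text key = new Text();
IntWritable value = new IntWritable();
long total = 0;
// Visit every reducer output file in the job's output directory.
for (FileStatus fileStat : fs.globStatus(new Path("/user/jsmith/output/part-r-*"))) {
  SequenceFile.Reader reader = new SequenceFile.Reader(fs, fileStat.getPath(), getConf());
  // next() fills key and value, and returns false at end of file.
  while (reader.next(key, value)) {
    total += value.get();  // add each reducer's value to the running sum
  }
  reader.close();
}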

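Thanks, but what if I use TextOutputFormat as my output format?

I have added a simple solution for TextOutputFormat.

For TextOutputFormat, something like the following should work (it replaces the body of the for loop):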

// Each line written by TextOutputFormat is "key<TAB>value" by default.
BufferedReader reader = new BufferedReader(new InputStreamReader(fs.open(fileStat.getPath())));
String nextLine;
while ((nextLine = reader.readLine()) != null) {
  String[] tokens = nextLine.split("\t");
  total += Integer.parseInt(tokens[1]);  // tokens[0] is the key, tokens[1] is the value
}
reader.close();
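
The line-parsing step can be tried without a cluster. The following is only a minimal sketch in plain Java, assuming each line has the default tab-separated "key, value" layout that TextOutputFormat writes. The class name PartFileSum, the method sumValues, and the sample lines are hypothetical and not part of Hadoop or of the original answer.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, for illustration only: sums the integer found in the
// second tab-separated column of each line, as in the loop above.
class PartFileSum {
    static long sumValues(List<String> lines) {
        long total = 0;
        for (String line : lines) {
            if (line.isEmpty()) {
                continue; // skip blank lines instead of failing on them
            }
            String[] tokens = line.split("\t");
            // tokens[0] is the key, tokens[1] is the reducer's integer value
            total += Long.parseLong(tokens[1].trim());
        }
        return total;
    }

    public static void main(String[] args) {
        // Made-up sample lines standing in for the contents of a part-r-* file.
        List<String> sample = Arrays.asList("a\t1", "b\t2", "c\t3");
        System.out.println(sumValues(sample)); // prints 6
    }
}
```

In the driver, the lines would come from the BufferedReader opened on each part-r-* file rather than from a hard-coded list. Keeping the sum in a separate method makes the parsing easy to check on its own.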