Reducer not being called in a Java Hadoop MapReduce job
I have two mapper classes that simply create key-value pairs. My main logic is supposed to be in the reducer part. I am trying to compare data coming from two different text files.
My mapper class is:
public static class Map extends Mapper<LongWritable, Text, Text, Text> {

    private String ky, vl = "a";

    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString();
        String tokens[] = line.split("\t");
        vl = tokens[1].trim();
        ky = tokens[2].trim();
        // sending key-value pairs to the reducer
        context.write(new Text(ky), new Text(vl));
    }
}
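For a hypothetical tab-separated input line such as 101\tJohn\tIT, this map() would emit the pair (IT, John): the third field becomes the key and the second field the value, while the first field is ignored.

The reducer class is: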
public static class Reduce extends Reducer<Text, Text, Text, Text> {

    private String rslt = "l";

    public void reduce(Text key, Iterator<Text> values, Context context)
            throws IOException, InterruptedException {
        int count = 0;
        while (values.hasNext()) {
            count++;
        }
        rslt = Integer.toString(count);
        if (count > 1) {
            context.write(key, new Text(rslt));
        }
    }
}
Output:
File System Counters
FILE: Number of bytes read=361621
FILE: Number of bytes written=1501806
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=552085
HDFS: Number of bytes written=150962
HDFS: Number of read operations=28
HDFS: Number of large read operations=0
HDFS: Number of write operations=5
Map-Reduce Framework
Map input records=10783
Map output records=10783
Map output bytes=150962
Map output materialized bytes=172540
Input split bytes=507
Combine input records=0
Combine output records=0
Reduce input groups=7985
Reduce shuffle bytes=172540
Reduce input records=10783
Reduce output records=10783
Spilled Records=21566
Shuffled Maps =2
Failed Shuffles=0
Merged Map outputs=2
GC time elapsed (ms)=12
Total committed heap usage (bytes)=928514048
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=0
File Output Format Counters
Bytes Written=150962
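These counters suggest that the custom reducer above is never actually invoked: Reduce output records equals Reduce input records (10783), which is exactly what the framework's default pass-through reduce() produces. The most likely cause is the method signature: reduce(Text, Iterator<Text>, Context) takes an Iterator instead of an Iterable, so it does not override Reducer.reduce() and Hadoop silently falls back to the default implementation (and even if it were called, the while (values.hasNext()) loop never calls values.next(), so it could not terminate). A minimal corrected sketch, keeping the original count > 1 logic, could look like the following; the @Override annotation turns the wrong-signature mistake into a compile-time error:

public static class Reduce extends Reducer<Text, Text, Text, Text> {

    @Override
    public void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        // count how many records share this key
        int count = 0;
        for (Text ignored : values) {
            count++;
        }
        // only emit keys that occur more than once
        if (count > 1) {
            context.write(key, new Text(Integer.toString(count)));
        }
    }
}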
Comments:

Why do you need these two mapper classes? It looks like they both do the same thing. Can you describe in more detail what is going wrong? Is the reducer not starting at all? What is the exit status of the job?

I use two files because I take user input and store it in another file. The output is the same as the result after the map phase... the records just come out sorted (my guess is that the default reducer is being applied).

I mean that Map and Map2 do the same thing, so Map could be reused. But can you describe what is happening with the reducer? Can you see it on the job tracker?

So your reducer does some work, but the output is not what you expect, right? Can you post a sample of your input? Maybe removing the if clause (count>1) in the reducer would also produce this output.

@0309gunner it is being executed. It clearly shows Reduce input records=10783 and Reduce output records=10783.

The driver code is:
Configuration conf = new Configuration();
Job job = new Job(conf);
job.setJarByClass(CompareTwoFiles.class);
job.setJobName("Compare Two Files and Identify the Difference");
FileOutputFormat.setOutputPath(job, new Path(args[2]));
job.setReducerClass(Reduce.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(Text.class);
MultipleInputs.addInputPath(job, new Path(args[0]),
        TextInputFormat.class, Map.class);
MultipleInputs.addInputPath(job, new Path(args[1]),
        TextInputFormat.class, Map2.class);
job.waitForCompletion(true);
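As a side note from the comments: since Map and Map2 appear to do exactly the same thing, a single mapper class could be registered for both input paths. A minimal sketch of that variant (assuming Map2 really is identical to Map):

MultipleInputs.addInputPath(job, new Path(args[0]),
        TextInputFormat.class, Map.class);
MultipleInputs.addInputPath(job, new Path(args[1]),
        TextInputFormat.class, Map.class);

This does not change the job's behaviour; it only removes the duplicated mapper class.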