Hadoop MapReduce programming syntax error

hadoop mapreduce

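My input is a large number of text files. I want my MapReduce program to write every file name, together with the sentences associated with it, into a single output file. The mapper should emit only the file name (key) and the associated sentence (value). The reducer should collect each key with all of its values and write the file name and its sentences to the output.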

Here is the code for my mapper and reducer:

import java.io.IOException;
import java.util.*;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;

public class WordCount {

    public static class Map extends MapReduceBase
            implements Mapper<LongWritable, Text, Text, Text> {

        public void map(LongWritable key, Text value,
                OutputCollector<Text, Text> output, Reporter reporter)
                throws IOException {
            String filename = new String();
            FileSplit filesplit = (FileSplit) reporter.getInputSplit();
            filename = filesplit.getPath().getName();
            output.collect(new Text(filename), value);
        }
    }

    public static class Reduce extends MapReduceBase
            implements Reducer<Text, Text, Text, Text> {

        public void reduce(Text key, Iterable<Text> values,
                OutputCollector<Text, Text> output, Reporter reporter)
                throws IOException {
            StringBuilder builder = new StringBuilder();
            for (Text value : values) {
                String str = value.toString();
                builder.append(str);
            }
            String valueToWrite = builder.toString();
            output.collect(key, new Text(valueToWrite));
        }

        @Override
        public void reduce(Text arg0, Iterator<Text> arg1,
                OutputCollector<Text, Text> arg2, Reporter arg3)
                throws IOException {
        }
    }

    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(WordCount.class);
        conf.setJobName("wordcount");

        conf.setMapperClass(Map.class);
        conf.setReducerClass(Reduce.class);
        conf.setJarByClass(WordCount.class);
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(Text.class);

        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(TextOutputFormat.class);
        conf.setNumReduceTasks(1);
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        JobClient.runJob(conf);
    }
}

When I run the mapper and reducer above with the same input format configuration (KeyValueTextInputFormat.class), nothing is written to the output.

What should I change to achieve my goal?

When I look at the counters, I see that the reducer has no input records. That suggests something is going wrong on the reducer side.

Looking at your code, I see that the signature of your reduce() method is:

public void reduce(Text key, Iterable<Text> values, OutputCollector<Text, 
     Text> output, Reporter reporter) throws IOException

It declares the values parameter as an Iterable. The correct type is Iterator (according to the online Javadoc):

public void reduce(Text key, Iterator<Text> values, OutputCollector<Text, 
     Text> output, Reporter reporter) throws IOException

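So even though you wrote a reduce method, its signature is wrong, which means it is never used. The framework calls the other reduce method in your class, the one that takes an Iterator, and that method has an empty body, so no records are emitted.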
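The effect can be reproduced without Hadoop. The following is a minimal sketch in plain Java. `SimpleReducer`, `BrokenReduce`, and `OverloadPitfall` are hypothetical stand-ins for the Hadoop `Reducer` interface, the `Reduce` class in the question, and the framework that drives it. They are not Hadoop types. A caller that only knows the interface dispatches to the `Iterator` method, so the `Iterable` overload never runs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class OverloadPitfall {
    // Plays the role of the framework: it only knows the interface type.
    static List<String> runThroughInterface(SimpleReducer reducer) {
        List<String> output = new ArrayList<>();
        Iterator<String> values = Arrays.asList("first line. ", "second line.").iterator();
        reducer.reduce("a.txt", values, output);
        return output;
    }

    public static void main(String[] args) {
        List<String> output = runThroughInterface(new BrokenReduce());
        System.out.println("records emitted: " + output.size()); // prints "records emitted: 0"
    }
}

// Hypothetical stand-in for the old-API Reducer interface: values arrive as an Iterator.
interface SimpleReducer {
    void reduce(String key, Iterator<String> values, List<String> output);
}

// Mirrors the structure of the Reduce class in the question.
class BrokenReduce implements SimpleReducer {
    // An overload, not an override: the interface has no method taking an Iterable.
    // Putting @Override on this method would be rejected by the compiler.
    public void reduce(String key, Iterable<String> values, List<String> output) {
        StringBuilder builder = new StringBuilder();
        for (String value : values) {
            builder.append(value);
        }
        output.add(key + "\t" + builder);
    }

    // The method the caller actually invokes; it is empty, so nothing is emitted.
    @Override
    public void reduce(String key, Iterator<String> values, List<String> output) {
    }
}
```

The class compiles cleanly because Java treats the two `reduce` methods as unrelated overloads, which is why the mistake only shows up at run time as missing output.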

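Change Iterable to Iterator and it should work. You will also need to delete the empty reduce method, since the two would then have the same signature, and replace the for-each loop with a hasNext()/next() loop, because for-each does not accept an Iterator.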
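A for-each loop accepts an Iterable or an array but not an Iterator, so the loop has to change along with the parameter type. Here is a minimal sketch of the rewritten loop, using `String` in place of Hadoop's `Text` so that it runs on its own. `IteratorConcat` and `concatenate` are hypothetical names chosen for illustration.

```java
import java.util.Arrays;
import java.util.Iterator;

class IteratorConcat {
    // Drains the iterator and joins the values, as the reduce body in the question does.
    static String concatenate(Iterator<String> values) {
        StringBuilder builder = new StringBuilder();
        while (values.hasNext()) {
            builder.append(values.next());
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        Iterator<String> values = Arrays.asList("first line. ", "second line.").iterator();
        System.out.println(concatenate(values)); // prints "first line. second line."
    }
}
```

In the actual reducer, the same loop would append `values.next().toString()` and then pass the result to `output.collect(key, new Text(...))`, with the empty overriding method removed so that only one `reduce(Text, Iterator<Text>, ...)` remains.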

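This is why everyone should add the @Override annotation :)

Sir, if I want to use Iterable, what should I do?

With an Iterable, the only thing you can do is ask it for an Iterator. So even if you managed to pass an Iterable to the reduce method, you would still have to write code that accesses the values through an Iterator.

Sir, what is the difference between Iterator and Iterable?

An Iterator lets you step through a list from front to back without knowing the list's implementation or even where it lives. An Iterable is something that can be iterated over, but its only documented behavior is returning an Iterator that performs the actual iteration. You cannot do anything else with an Iterable unless you know the object's actual class or another interface that the class implements.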
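To make the distinction concrete: an Iterable is a factory for Iterators, and for-each is shorthand for calling `iterator()` and then looping on `hasNext()` and `next()`. The following is a minimal sketch in plain Java. `IterableVsIterator`, `asOneShotIterable`, and `join` are hypothetical names. The wrapper shows one way to use for-each over an Iterator you were handed. It can be traversed only once, because the underlying Iterator is consumed.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class IterableVsIterator {
    // Wraps an existing Iterator so that for-each can be used on it.
    // One-shot: every call to iterator() returns the same Iterator, which may already be drained.
    static <T> Iterable<T> asOneShotIterable(Iterator<T> iterator) {
        return () -> iterator;
    }

    static String join(Iterable<String> values) {
        StringBuilder builder = new StringBuilder();
        for (String value : values) { // the compiler calls values.iterator() here
            builder.append(value);
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("a", "b", "c");

        // A List is an Iterable: each for-each loop gets a fresh Iterator.
        System.out.println(join(list)); // prints "abc"
        System.out.println(join(list)); // prints "abc"

        // An Iterator is a cursor: once drained, it stays drained.
        Iterable<String> once = asOneShotIterable(list.iterator());
        System.out.println(join(once));           // prints "abc"
        System.out.println(join(once).isEmpty()); // prints "true"
    }
}
```

As far as I know, the newer `org.apache.hadoop.mapreduce.Reducer` class passes the values as an `Iterable`, so the for-each form is available there without a wrapper. With the old `org.apache.hadoop.mapred` API used in the question, the parameter has to stay an `Iterator`.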
