What is the output of the Hadoop mapper and reducer functions?

hadoop mapreduce

This is the mapper function:

public static class MapForWordCount extends Mapper<Object, Text, Text, IntWritable>{

private IntWritable saleValue = new IntWritable();
private Text rangeValue = new Text();

public void map(Object key, Text value, Context con) throws IOException, InterruptedException
{
    String line = value.toString();
    String[] words = line.split(",");
    for(String word: words )
    {
        if(words[3].equals("40")){  
            saleValue.set(Integer.parseInt(words[0]));
            rangeValue.set(words[3]);
            con.write( rangeValue , saleValue );
        }
    }
}   
}

This is the reducer function:

public static class ReduceForWordCount extends Reducer<Text, IntWritable, Text, IntWritable>  
{  
    private IntWritable result = new IntWritable();  
    public void reduce(Text word, Iterable<IntWritable> values, Context con) throws IOException, InterruptedException  
    {  
        for(IntWritable value : values)  
        {  
            result.set(value.get());  
            con.write(word, result);  
        }  
    }  
}

Edit 1: But the expected output is limited to:

40 102  
40 104  
40 105

What am I doing wrong?

What exactly is happening in the mapper and reducer functions?

What exactly is happening

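You are consuming lines of comma-delimited text, splitting on the commas, and filtering out some values. If all you are doing is extracting those values, con.write() should be called only once per line.

The mapper output is grouped by key: every value you wrote with the key "40" is collected into one list. That list is what the reducer reads.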
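
To make the grouping step concrete, here is a minimal plain-Java sketch with no Hadoop dependency. The class `ShuffleSketch`, the method `groupByKey`, and the sample values are made up for illustration; in a real job the framework performs this grouping between the map and reduce phases.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the grouping step; not part of the Hadoop API.
class ShuffleSketch {

    // Collects every value emitted under the same key into one list,
    // with keys kept in sorted order.
    static Map<String, List<Integer>> groupByKey(List<Map.Entry<String, Integer>> mapOutput) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : mapOutput) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    public static void main(String[] args) {
        // Pairs as a mapper might write them with con.write(rangeValue, saleValue).
        List<Map.Entry<String, Integer>> mapOutput = List.of(
                Map.entry("40", 102),
                Map.entry("40", 104),
                Map.entry("40", 105));
        // Each key ends up with a single list of values.
        System.out.println(groupByKey(mapOutput)); // {40=[102, 104, 105]}
    }
}
```

The reducer is then invoked once per key, with that key's list as its `values` argument.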

You should probably try this in the map function:

// Set the values to write 
saleValue.set(Integer.parseInt(words[0]));
rangeValue.set(words[3]);

// Filter out only the 40s
if(words[3].equals("40")) {
    // Write out "(40, saleValue)" words.length times 
    for(String word: words )
    {
        con.write( rangeValue , saleValue );
    }
}

If you do not want the value repeated once per element of the split string, get rid of the for loop.
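
The effect of the loop can be checked without a cluster. The following is a rough simulation, assuming a made-up four-field record; `MapLoopSketch` and `emit` are hypothetical names, and the returned list stands in for calls to con.write().

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simulation of the map() body; the list stands in for con.write().
class MapLoopSketch {

    static List<String> emit(String line, boolean loopOverFields) {
        List<String> written = new ArrayList<>();
        String[] words = line.split(",");
        if (words.length > 3 && words[3].equals("40")) {
            // With the loop, one write happens per field of the line.
            int writes = loopOverFields ? words.length : 1;
            for (int i = 0; i < writes; i++) {
                written.add(words[3] + "\t" + Integer.parseInt(words[0]));
            }
        }
        return written;
    }

    public static void main(String[] args) {
        String line = "102,x,y,40"; // made-up record with four fields
        System.out.println(emit(line, true).size());  // 4
        System.out.println(emit(line, false).size()); // 1
    }
}
```

With the loop, a matching line with four fields produces four identical pairs; without it, exactly one.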

All your reducer does is print out what it received from the mapper.

The mapper output looks like this:

<word,count>

The reducer output looks like this:

<unique word, its total count>

In your case, the reducer does nothing. The values/words found by the mapper are simply passed through as output.

Ideally, you should reduce and get an output such as "40150", found 5 times on the same line.
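
As an illustration of what an aggregating reduce step could look like, here is a plain-Java sketch that sums each key's values, in the spirit of word count. It is not Hadoop code; `SumReduceSketch`, `reduceToTotals`, and the input values are assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a summing reduce(); not Hadoop code.
class SumReduceSketch {

    // Collapses each key's list of values into a single total,
    // so the output has one entry per unique key.
    static Map<String, Integer> reduceToTotals(Map<String, List<Integer>> grouped) {
        Map<String, Integer> totals = new LinkedHashMap<>();
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            int sum = 0;
            for (int value : entry.getValue()) {
                sum += value;
            }
            totals.put(entry.getKey(), sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        // Made-up grouped input: key "40" with three sale values.
        Map<String, List<Integer>> grouped = Map.of("40", List.of(102, 104, 105));
        System.out.println(reduceToTotals(grouped)); // {40=311}
    }
}
```

In a real reducer the same idea would be a running sum over `values` followed by a single con.write() after the loop.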

In this context, regarding the duplicated entries: you do not need the loop in the mapper or the one in the reducer:

public static class MapForWordCount extends Mapper<Object, Text, Text, IntWritable>{

private IntWritable saleValue = new IntWritable();
private Text rangeValue = new Text();

public void map(Object key, Text value, Context con) throws IOException, InterruptedException
{
    String line = value.toString();
    String[] words = line.split(",");
    if(words[3].equals("40")){  
       saleValue.set(Integer.parseInt(words[0]));
       rangeValue.set(words[3]);
       con.write(rangeValue , saleValue );
    }
}   
}

In the reducer, as @Serhiy suggested in the original question, you need only one line of code:

public static class ReduceForWordCount extends Reducer<Text, IntWritable, Text, IntWritable>
{
    private IntWritable result = new IntWritable();
    public void reduce(Text word, Iterable<IntWritable> values, Context con) throws IOException, InterruptedException
    {
        // Called once per key: write the key a single time.
        // With the default TextOutputFormat, a null value means only the key is written.
        con.write(word, null);
    }
}

Regarding "Edit 1": I will leave that as a simple exercise :)

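You are writing key-value pairs... what else do you want to know?

Thanks @cricket_007 for the suggestion, I will definitely try it. I actually want to know exactly what the mapper returns and what the reducer accepts and prints.

Look at the type parameters of those two classes when you extend them. The mapper's output key/value types must match the reducer's input key/value types.

To give more information: the mapper uses the context object to write values to the reducer (it does not "return" them), and the reducer sends values to the output (again through the context, not by "returning"). The mapper "sends" all values with the same key to the same reducer (this actually happens in the shuffle phase), so each reducer "runs" on a set of values that share one key.

Thanks @It-Z, this is exactly what I wanted. For how to deal with the duplicated entries, you can refer to @cricket_007's answer.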
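
The flow described in these comments can be imitated in a single process. This is only a sketch under stated assumptions: `PipelineSketch` and its methods are hypothetical, the input lines are invented (field 0 is taken to be the sale value and field 3 the range), and a sorted map plays the role of the shuffle.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical single-process imitation of map -> shuffle -> reduce.
class PipelineSketch {

    // "Map": parse one line and write at most one (key, value) pair.
    static void map(String line, Map<String, List<Integer>> shuffle) {
        String[] words = line.split(",");
        if (words.length > 3 && words[3].equals("40")) {
            shuffle.computeIfAbsent(words[3], k -> new ArrayList<>())
                   .add(Integer.parseInt(words[0]));
        }
    }

    // "Reduce": called once per key with all of that key's values;
    // here it writes one output line per value.
    static void reduce(String key, List<Integer> values, List<String> output) {
        for (int value : values) {
            output.add(key + "\t" + value);
        }
    }

    static List<String> run(List<String> lines) {
        Map<String, List<Integer>> shuffle = new TreeMap<>();
        for (String line : lines) {
            map(line, shuffle);
        }
        List<String> output = new ArrayList<>();
        shuffle.forEach((key, values) -> reduce(key, values, output));
        return output;
    }

    public static void main(String[] args) {
        // Made-up input lines: field 0 is the sale value, field 3 the range.
        List<String> lines = List.of("102,a,b,40", "103,a,b,50", "104,a,b,40", "105,a,b,40");
        run(lines).forEach(System.out::println);
    }
}
```

The key and value types written by `map` (a string and an integer) are exactly what `reduce` accepts, which mirrors the requirement that the mapper's output types match the reducer's input types.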