What do the Hadoop mapper and reducer functions output?


These are the mapper and reducer functions:

public static class MapForWordCount extends Mapper<Object, Text, Text, IntWritable>{

private IntWritable saleValue = new IntWritable();
private Text rangeValue = new Text();

public void map(Object key, Text value, Context con) throws IOException, InterruptedException
{
    String line = value.toString();
    String[] words = line.split(",");
    for(String word: words )
    {
        if(words[3].equals("40")){  
            saleValue.set(Integer.parseInt(words[0]));
            rangeValue.set(words[3]);
            con.write( rangeValue , saleValue );
        }
    }
}   
}
public static class ReduceForWordCount extends Reducer<Text, IntWritable, Text, IntWritable>  
{  
    private IntWritable result = new IntWritable();  
    public void reduce(Text word, Iterable<IntWritable> values, Context con) throws IOException, InterruptedException  
    {  
        for(IntWritable value : values)  
        {  
            result.set(value.get());  
            con.write(word, result);  
        }  
    }  
}
The output I get is:

40 105
40 105
40 105
40 105

Edit 1: But the expected output is:

40 102  
40 104  
40 105
What am I doing wrong?

What exactly is happening in the mapper and reducer functions?

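"What exactly is happening"

You are consuming lines of comma-separated text, splitting on the commas, and filtering out some values. con.write() should only be called once per line if all you are doing is extracting those values.

The framework then groups all the "40" keys that the mapper emits (this happens in the shuffle phase) and forms a list of all the values that were written with that key. That list is what the reducer reads.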
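To make the grouping step concrete, here is a minimal plain-Java sketch. It uses no Hadoop classes, so it runs on its own. The class name ShuffleDemo, the method groupByKey, and the sample pairs are made up for illustration. The real framework does this across machines during the shuffle and does not promise any particular order for the values of a key.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java model of the grouping between map and reduce (not Hadoop code).
final class ShuffleDemo {

    // Collects every value written under the same key into one list,
    // with keys kept in sorted order.
    static Map<String, List<Integer>> groupByKey(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        return grouped;
    }

    public static void main(String[] args) {
        // Hypothetical mapper output: (range, sale value) pairs.
        List<Map.Entry<String, Integer>> mapOutput = List.of(
            Map.entry("40", 105),
            Map.entry("40", 102),
            Map.entry("30", 99),
            Map.entry("40", 105));

        // Each map entry corresponds to one reduce() call: a key plus its values.
        System.out.println(groupByKey(mapOutput));
    }
}
```

Each entry of the returned map stands for one reduce() invocation: the key is the first argument and the list plays the role of the Iterable of values.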

You should probably try this in your map function:

// Set the values to write 
saleValue.set(Integer.parseInt(words[0]));
rangeValue.set(words[3]);

// Filter out only the 40s
if(words[3].equals("40")) {
    // Write out "(40, saleValue)" words.length times 
    for(String word: words )
    {
        con.write( rangeValue , saleValue );
    }
}
If you do not want the value repeated once for every element of the split string, get rid of the for loop.
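The effect of the loop can be reproduced without a cluster. The sketch below is a plain-Java model of the map() body, not Hadoop code. The class MapEmitDemo, its two methods, and the sample record "105,a,b,40" are assumptions for illustration, since the question does not show the input file.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model of the map() body; a List stands in for con.write().
final class MapEmitDemo {

    // Mirrors the question's mapper: the write sits inside the loop over the fields.
    static List<String> emitWithLoop(String line) {
        List<String> out = new ArrayList<>();
        String[] words = line.split(",");
        for (String word : words) {
            if (words[3].equals("40")) {
                out.add(words[3] + " " + Integer.parseInt(words[0]));
            }
        }
        return out;
    }

    // Same filter, but at most one write per input line.
    static List<String> emitOnce(String line) {
        List<String> out = new ArrayList<>();
        String[] words = line.split(",");
        if (words[3].equals("40")) {
            out.add(words[3] + " " + Integer.parseInt(words[0]));
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical record: sale value in field 0, range in field 3.
        String line = "105,a,b,40";
        System.out.println(emitWithLoop(line)); // one pair per field, so four pairs
        System.out.println(emitOnce(line));     // a single pair
    }
}
```

With a four-field record the looped version emits the same pair four times, which would match the four identical "40 105" lines in the question if its input rows have four fields.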


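All your reducer does is print out what it receives from the mapper.

The mapper output looks like this:

<word, count>

and the reducer output is expected to look like this:

<unique word, its total count>

In your case, the reducer is not doing anything. The values/words found by the mapper are simply passed through as the output.

Ideally, you should reduce and get an output such as "40150", found 5 times in the same line.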
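As a rough illustration of the difference between passing values through and actually reducing them, here is a plain-Java sketch with no Hadoop dependency. ReduceDemo, passThrough, and sum are hypothetical names, and summing is only one possible aggregate, chosen here because it is what a word-count style reducer does.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model of two reduce() strategies; Strings stand in for output records.
final class ReduceDemo {

    // What the question's reducer does: one output record per incoming value.
    static List<String> passThrough(String key, List<Integer> values) {
        List<String> out = new ArrayList<>();
        for (int value : values) {
            out.add(key + " " + value);
        }
        return out;
    }

    // Word-count style: collapse all values of a key into a single total.
    static String sum(String key, List<Integer> values) {
        int total = 0;
        for (int value : values) {
            total += value;
        }
        return key + " " + total;
    }

    public static void main(String[] args) {
        List<Integer> values = List.of(105, 105, 102); // hypothetical values for key "40"
        System.out.println(passThrough("40", values)); // three records
        System.out.println(sum("40", values));         // one record
    }
}
```

The first method leaves the number of records unchanged, while the second produces exactly one record per key, which is the usual point of a reduce step.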

Regarding the duplicated entries: you need neither the loop in the mapper nor the one in the reducer:

public static class MapForWordCount extends Mapper<Object, Text, Text, IntWritable>{

private IntWritable saleValue = new IntWritable();
private Text rangeValue = new Text();

public void map(Object key, Text value, Context con) throws IOException, InterruptedException
{
    String line = value.toString();
    String[] words = line.split(",");
    if(words[3].equals("40")){  
       saleValue.set(Integer.parseInt(words[0]));
       rangeValue.set(words[3]);
       con.write(rangeValue , saleValue );
    }
}   
}
In the reducer, as @Serhiy suggested in a comment on the original question, you only need one line of code:

public static class ReduceForWordCount extends Reducer<Text, IntWritable, Text, IntWritable>  
{  
    public void reduce(Text word, Iterable<IntWritable> values, Context con) throws IOException, InterruptedException  
    {  
        // Write the key once per group; the grouped values are ignored
        con.write(word, null);  
    }
}

Regarding "Edit 1": I will leave that as a simple exercise :)
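For readers who want a hint, one possible approach is to keep only the distinct values of each key in the reducer. The sketch below models that idea in plain Java without Hadoop classes. DistinctReduceDemo and reduceDistinct are hypothetical names, and the sample values are invented to match the expected output in the question. In a real reducer, copy each number out with value.get() before storing it, because Hadoop reuses the same IntWritable object while iterating.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Plain-Java model of a reduce() that emits each distinct value once per key.
final class DistinctReduceDemo {

    static List<String> reduceDistinct(String key, Iterable<Integer> values) {
        // TreeSet drops duplicates and keeps the values in ascending order.
        Set<Integer> seen = new TreeSet<>();
        for (Integer value : values) {
            seen.add(value);
        }
        List<String> out = new ArrayList<>();
        for (Integer value : seen) {
            out.add(key + " " + value);
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical grouped values for key "40".
        List<Integer> values = List.of(105, 102, 105, 104, 105);
        for (String record : reduceDistinct("40", values)) {
            System.out.println(record);
        }
    }
}
```

Holding the set in memory is fine for a handful of values per key; for very large groups a different design would be needed.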

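You are writing key-value pairs... what else do you want to know?

Thanks @cricket_007 for the suggestion, I will definitely try it... What I actually want to know is what exactly the mapper returns and what the reducer accepts and prints.

When you extend them, the type parameters of both classes are ordered as input key, input value, output key, output value. The mapper's output key and value types must match the reducer's input key and value types.

To add some more information: the mapper uses the context object to write values to the reducer (it does not "return" them), and the reducer sends values to the output (again through the context, not by "returning"). The mapper "sends" all values with the same key to the same reducer (this actually happens in the shuffle phase), so each reduce call "runs" over the set of values that share one key.

Thanks @It-Z, this is exactly what I was looking for. Regarding the duplicated entries, you can refer to @cricket_007's answer.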