What is the output of the Hadoop mapper and reducer functions?
Mapper function
public static class MapForWordCount extends Mapper<Object, Text, Text, IntWritable>{
    private IntWritable saleValue = new IntWritable();
    private Text rangeValue = new Text();
    public void map(Object key, Text value, Context con) throws IOException, InterruptedException
    {
        String line = value.toString();
        String[] words = line.split(",");
        for(String word: words )
        {
            if(words[3].equals("40")){
                saleValue.set(Integer.parseInt(words[0]));
                rangeValue.set(words[3]);
                con.write( rangeValue , saleValue );
            }
        }
    }
}
Reducer function
public static class ReduceForWordCount extends Reducer<Text, IntWritable, Text, IntWritable>
{
    private IntWritable result = new IntWritable();
    public void reduce(Text word, Iterable<IntWritable> values, Context con) throws IOException, InterruptedException
    {
        for(IntWritable value : values)
        {
            result.set(value.get());
            con.write(word, result);
        }
    }
}
Output
40 105
40 105
40 105
40 105
Edit 1:
But the expected output is
40 102
40 104
40 105
What am I doing wrong?
What exactly is happening in the mapper and reducer functions?
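"What exactly is happening"
You are taking comma-separated lines of text, splitting on the commas, and filtering out some values. If all you are doing is extracting those values, con.write() should be called only once per line.
The framework groups all of the "40" keys that the mapper writes and forms a list of every value written with that key. That list is what the reducer reads.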
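To make the grouping concrete, here is a minimal plain-Java simulation. It uses no Hadoop classes; `ShuffleSketch`, `mapWithLoop`, `groupByKey`, and the sample record `105,a,b,40` are all hypothetical, since the question does not show its input. Assuming a four-field record, the loop in the question's mapper writes the same pair four times, which would be consistent with the four identical output lines.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java simulation of the grouping step; no Hadoop classes are used.
final class ShuffleSketch {

    // Mimics the question's map(): writes (words[3], words[0]) once per field.
    static void mapWithLoop(String line, List<String[]> emitted) {
        String[] words = line.split(",");
        for (String word : words) {
            if (words[3].equals("40")) {
                emitted.add(new String[] {words[3], words[0]});
            }
        }
    }

    // Groups the written pairs by key, as happens before reduce() is called.
    static Map<String, List<Integer>> groupByKey(List<String[]> emitted) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String[] pair : emitted) {
            grouped.computeIfAbsent(pair[0], k -> new ArrayList<>())
                   .add(Integer.parseInt(pair[1]));
        }
        return grouped;
    }

    public static void main(String[] args) {
        List<String[]> emitted = new ArrayList<>();
        mapWithLoop("105,a,b,40", emitted); // hypothetical four-field record
        System.out.println(groupByKey(emitted)); // {40=[105, 105, 105, 105]}
    }
}
```

The reducer in the question then writes one line per element of that list, so every duplicate written by the mapper shows up in the final output.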
You should probably try this in your map function
// Set the values to write
saleValue.set(Integer.parseInt(words[0]));
rangeValue.set(words[3]);
// Filter out only the 40s
if(words[3].equals("40")) {
    // Write out "(40, saleValue)" words.length times
    for(String word: words )
    {
        con.write( rangeValue , saleValue );
    }
}
If you do not want the values duplicated once per element of the split string, then get rid of the for loop.
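All your reducer is doing is printing out what it received from the mapper.
Mapper output would be something like this:
<word,count>
Reducer output would be something like this:
<unique word, its total count>
In your case, the reducer is not doing anything. The unique values/words found by the mapper are simply passed through as output.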
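For contrast, a reducer that actually reduces collapses the list of values for a key into a single record. The sketch below is plain Java rather than Hadoop code; `SumReduceSketch` and the sample values are hypothetical, and summing is only one possible aggregation (the word-count style described above).

```java
import java.util.Arrays;
import java.util.List;

// Plain-Java stand-in for an aggregating reduce(); no Hadoop classes are used.
final class SumReduceSketch {

    // Collapses all values for one key into a single total and returns
    // the one line a word-count style reducer would write for that key.
    static String reduce(String key, Iterable<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return key + "\t" + sum;
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(102, 104, 105); // hypothetical values
        System.out.println(reduce("40", values));
    }
}
```

However many values arrive for the key, this style of reducer writes exactly one line per key.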
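Ideally, you are supposed to reduce and get an output like "40150", found 5 times on the same line.
In this context, you need the loop neither in the mapper nor in the reducer, since the loops are what duplicate the entries: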
public static class MapForWordCount extends Mapper<Object, Text, Text, IntWritable>{
    private IntWritable saleValue = new IntWritable();
    private Text rangeValue = new Text();
    public void map(Object key, Text value, Context con) throws IOException, InterruptedException
    {
        String line = value.toString();
        String[] words = line.split(",");
        if(words[3].equals("40")){
            saleValue.set(Integer.parseInt(words[0]));
            rangeValue.set(words[3]);
            con.write(rangeValue , saleValue );
        }
    }
}
In the reducer, as @Serhiy suggested in the original question, you need only one line of code:
public static class ReduceForWordCount extends Reducer<Text, IntWritable, Text, IntWritable>
{
    private IntWritable result = new IntWritable();
    public void reduce(Text word, Iterable<IntWritable> values, Context con) throws IOException, InterruptedException
    {
        // Write each key once; the values are not read
        con.write(word, null);
    }
}
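If the goal is the output listed under "Edit 1" (each distinct value once per key), one possible approach is to deduplicate the values inside reduce. The following is a hedged plain-Java sketch of that idea, not the answer's own solution; `DistinctReduceSketch` is a hypothetical name, and the sorted order is an assumption read from the expected output.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Plain-Java stand-in for a deduplicating reduce(); no Hadoop classes are used.
final class DistinctReduceSketch {

    // Returns one "key value" line per distinct value, in ascending order.
    static List<String> reduce(String key, Iterable<Integer> values) {
        TreeSet<Integer> distinct = new TreeSet<>();
        for (Integer v : values) {
            distinct.add(v); // repeated values are dropped by the set
        }
        List<String> lines = new ArrayList<>();
        for (Integer v : distinct) {
            lines.add(key + " " + v);
        }
        return lines;
    }

    public static void main(String[] args) {
        // Hypothetical values as they might arrive for key "40"
        List<Integer> values = Arrays.asList(105, 102, 104, 105);
        for (String line : reduce("40", values)) {
            System.out.println(line);
        }
    }
}
```

In a real job the same idea would go inside `reduce`, with a `con.write(word, result)` call per distinct value; note that the set holds every distinct value for a key in memory at once.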
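Regarding "Edit 1": I will leave that as a simple exercise :)
You are writing key-value pairs... what else do you want to know?
Thanks for the suggestion, @cricket_007, I will definitely try it. What I actually want to know is exactly what the mapper returns and what the reducer accepts and prints.
When you extend those two classes, the type parameters are, in order: input key, input value, output key, output value. The mapper's output key and value types must match the reducer's input key and value types.
To give more information: the mapper uses the Context object to write values to the reducer (rather than "returning" them), and the reducer sends values to the output (again using the Context, not "returning"). All values the mapper writes with the same key are "sent" to the same reducer (this actually happens in the shuffle phase), so each reduce call "runs" on a set of values that share one key.
Thanks @It-Z, that is exactly what I was looking for. As for how the entries get duplicated, you can refer to @cricket_007's answer.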
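The flow described in these comments (write through a context, group by key in the shuffle, then one reduce call per key) can be sketched in plain Java. Everything here is a hypothetical stand-in: `MapFn`, `ReduceFn`, and `run` are not Hadoop APIs, and the `BiConsumer` plays the role of `Context.write`. The generic signature of `run` shows why the mapper's output types must equal the reducer's input types.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BiConsumer;

// Plain-Java sketch of map -> shuffle -> reduce; no Hadoop classes are used.
final class PipelineSketch {

    // Hypothetical stand-ins for Mapper and Reducer; "out" replaces Context.write.
    interface MapFn<KI, VI, KO, VO> {
        void map(KI key, VI value, BiConsumer<KO, VO> out);
    }

    interface ReduceFn<KI, VI, KO, VO> {
        void reduce(KI key, Iterable<VI> values, BiConsumer<KO, VO> out);
    }

    // The mapper's output types (K2, V2) are the reducer's input types.
    static <K1, V1, K2 extends Comparable<K2>, V2, K3, V3> List<String> run(
            Map<K1, V1> input,
            MapFn<K1, V1, K2, V2> mapper,
            ReduceFn<K2, V2, K3, V3> reducer) {
        // Map phase plus shuffle: collect every written pair, grouped by key.
        Map<K2, List<V2>> grouped = new TreeMap<>();
        for (Map.Entry<K1, V1> e : input.entrySet()) {
            mapper.map(e.getKey(), e.getValue(),
                    (k, v) -> grouped.computeIfAbsent(k, x -> new ArrayList<>()).add(v));
        }
        // Reduce phase: one call per distinct key, with all of its values.
        List<String> output = new ArrayList<>();
        for (Map.Entry<K2, List<V2>> e : grouped.entrySet()) {
            reducer.reduce(e.getKey(), e.getValue(),
                    (k, v) -> output.add(k + "\t" + v));
        }
        return output;
    }

    public static void main(String[] args) {
        // Hypothetical input: line offset -> comma-separated record
        Map<Long, String> lines = new TreeMap<>();
        lines.put(0L, "102,a,b,40");
        lines.put(1L, "104,a,b,40");
        lines.put(2L, "7,a,b,30");

        // Filter on the fourth field and write once per line.
        MapFn<Long, String, String, Integer> mapper = (offset, line, out) -> {
            String[] f = line.split(",");
            if (f[3].equals("40")) {
                out.accept(f[3], Integer.parseInt(f[0]));
            }
        };
        // Pass every value through unchanged, like the reducer in the question.
        ReduceFn<String, Integer, String, Integer> reducer = (key, values, out) -> {
            for (Integer v : values) {
                out.accept(key, v);
            }
        };
        run(lines, mapper, reducer).forEach(System.out::println);
    }
}
```

Declaring the reducer with input types other than `String, Integer` would make the call to `run` fail to compile, which mirrors the matching requirement between the `Mapper` and `Reducer` type parameters.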