Java MapReduce - How to get and output the top 10 from a Writable in the Reducer class
I'm having trouble writing reducer code that outputs the top 10 (key, value) pairs. My current output has the format ((year, market), total amount). What I'm looking for is the top 10 total amounts for each year. My current code outputs every amount for every market in every year. Any suggestions would be appreciated.
Mapper:
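import java.io.IOException;
import java.io.StringReader;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
// CSVReader comes from the opencsv library; its package name depends on the version in use.

public class FundingMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private Text Year = new Text();
    private Text Market = new Text();

    @Override
    public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        String line = value.toString();
        CSVReader reader = new CSVReader(new StringReader(line));
        String[] array = reader.readNext();
        reader.close();
        // Skip empty or short rows instead of failing on the column lookups below.
        if (array == null || array.length < 16) {
            return;
        }
        Year.set(array[14]);
        Market.set(array[3]);
        // Strip everything except digits from the amount column.
        String amountString = array[15].replaceAll("[^0-9]", "");
        int amount = 0;
        try {
            amount = Integer.parseInt(amountString);
        } catch (NumberFormatException nfe) {
            return; // skip rows without a numeric amount
        }
        IntWritable intW = new IntWritable(amount);
        // Composite key "year market " (trailing space kept from the original).
        String S = new StringBuilder().append(Year + " ").append(Market + " ").toString();
        context.write(new Text(S), intW);
    }
}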
Sample output:
2014 Biotechnology 16967648
2014 Social Media 300000
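You need to use the year as the key in the map output. That ensures all values for a given year arrive in a single reducer call. You can then filter the output down to 10 values. See below: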
// The mapper class must be declared as Mapper<LongWritable, Text, Text, Text> to match these output types.
public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
    String line = value.toString();
    CSVReader reader = new CSVReader(new StringReader(line));
    String[] array = reader.readNext();
    reader.close();
    Year.set(array[14]);
    Market.set(array[3]);
    String amountString = array[15].replaceAll("[^0-9]", "");
    int amount = 0;
    try {
        amount = Integer.parseInt(amountString);
    } catch (NumberFormatException nfe) {
        return; // skip rows without a numeric amount
    }
    // Key on the year alone; carry the amount and the market together in the value.
    context.write(Year, new Text(amount + " " + Market));
}

public void reduce(Text key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
    int count = 0;
    for (Text value : values) {
        // A limit of 2 keeps multi-word markets such as "Social Media" intact.
        String[] parts = value.toString().split(" ", 2);
        int amount = Integer.parseInt(parts[0]);
        String market = parts[1];
        // Note: values are not sorted by amount, so this emits the first 10 seen for the year.
        if (count < 10) {
            count++;
            context.write(key, value);
        } else {
            break;
        }
    }
}
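I should have clarified this in my first post, and I apologize for that, but I need the summation in the for loop because my data contains multiple entries for the same year and the same market. I need the total amount before outputting the top 10. I also need my output to contain the year, the market, and the total amount, so I need the concatenation I had in there as well.
I think this is a bit more complicated. Can you provide a sample dataset and sample output so we can discuss the implementation further?
Let me know if you want a larger sample. Again, it's my fault for not explaining this clearly, and I apologize. I'm looking for the 10 largest total amounts per year along with the corresponding markets. I appreciate your help so far.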
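The requirement described in the comments (sum the amounts per market within a year, then keep only the 10 largest totals) can be sketched in plain Java, independent of Hadoop. This is a minimal sketch, not the answerer's code: the class name `TopMarkets` and the method `topN` are hypothetical, and it assumes each value has the form "amount market", as written by the year-keyed mapper above. Inside a reducer, the same logic would run once per year over the `Text` values, converted with `toString()`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical helper: the name and signature are illustrative assumptions.
class TopMarkets {

    // Sums amounts per market, then returns the n largest totals, largest first.
    // Each value is assumed to look like "<amount> <market>".
    static List<Map.Entry<String, Long>> topN(Iterable<String> values, int n) {
        Map<String, Long> totals = new HashMap<>();
        for (String value : values) {
            // A limit of 2 keeps multi-word markets such as "Social Media" intact.
            String[] parts = value.split(" ", 2);
            // long avoids overflow when many int amounts are added together.
            totals.merge(parts[1], Long.parseLong(parts[0]), Long::sum);
        }

        Comparator<Map.Entry<String, Long>> byTotal = Map.Entry.comparingByValue();

        // Min-heap bounded to n entries: the smallest total is evicted first.
        PriorityQueue<Map.Entry<String, Long>> heap = new PriorityQueue<>(byTotal);
        for (Map.Entry<String, Long> entry : totals.entrySet()) {
            heap.offer(entry);
            if (heap.size() > n) {
                heap.poll();
            }
        }

        List<Map.Entry<String, Long>> result = new ArrayList<>(heap);
        result.sort(byTotal.reversed());
        return result;
    }

    public static void main(String[] args) {
        // Values one reducer call might receive for the key "2014".
        List<String> values = Arrays.asList(
                "16967648 Biotechnology",
                "300000 Social Media",
                "1000 Biotechnology");
        for (Map.Entry<String, Long> entry : topN(values, 10)) {
            System.out.println("2014 " + entry.getKey() + " " + entry.getValue());
        }
    }
}
```

In a reducer, each returned entry could be written out as `context.write(key, new Text(entry.getKey() + " " + entry.getValue()))`, which yields the year, the market, and the total on one line. Note that a combiner which truncates to 10 before the totals are complete would give wrong results, so the summing has to finish before the top 10 are selected.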