Stopping the reduce function in Hadoop based on a condition
I have a reduce function, and I want to stop it after it has processed some n keys. I have set up a counter that is incremented on each key, and I return from the reduce function once the condition is met. Here is the code:
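    import java.io.IOException;
    import java.util.StringTokenizer;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;

    public class wordcount {
        // Emits (word length, 1) for every token of the input line.
        public static class Map extends Mapper<LongWritable, Text, IntWritable, IntWritable> {
            private final static IntWritable one = new IntWritable(1);
            private Text word = new Text();
            private IntWritable leng = new IntWritable();

            public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
                String line = value.toString();
                StringTokenizer tokenizer = new StringTokenizer(line);
                while (tokenizer.hasMoreTokens()) {
                    String lword = tokenizer.nextToken();
                    leng.set(lword.length());
                    context.write(leng, one);
                }
            }
        }

        // Sums the counts for each word length.
        public static class Reduce extends Reducer<IntWritable, IntWritable, IntWritable, IntWritable> {
            int count = 0;

            public void reduce(IntWritable key, Iterable<IntWritable> values, Context context)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable val : values) {
                    sum += val.get();
                    count++; // incremented once per value, not once per key
                }
                context.write(key, new IntWritable(sum));
                // Only exits this call; reduce() is still invoked for the next key.
                if (count > 19) return;
            }
        }
    }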
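Is there any other way to achieve this?
You can achieve this by overriding the run() method of the Reducer class (new API). The default run() calls reduce() once for every key, so returning from reduce() only ends the call for the current key; the loop over keys has to be stopped in run() itself. The Reduce class with the overridden run() is listed further below.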
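Keep in mind that if you have more than one reducer, the limit on processed keys cannot be enforced internally. For example, if you want to stop after 10 keys but have 2 reducers, you will end up processing 20 keys in total. You need to control this limit externally, from wherever you launch the job.
I am using a single reducer to get the first n keys I need. Thanks for the tip.
Thanks Amar. Works fine!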
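To make the arithmetic in the caveat above concrete, here is a small plain-Java sketch that involves no Hadoop classes. It assumes keys are spread over reducers by a simple modulo rule, which is only a stand-in for whatever partitioner the job uses; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simulation: every reducer keeps its own counter, so a
// per-reducer limit is applied once per reducer, not once per job.
class PerReducerLimitDemo {

    static int totalProcessed(int totalKeys, int numReducers, int limitPerReducer) {
        // One key list per simulated reducer.
        List<List<Integer>> partitions = new ArrayList<>();
        for (int r = 0; r < numReducers; r++) {
            partitions.add(new ArrayList<>());
        }
        // Assumption for illustration: modulo partitioning of integer keys.
        for (int key = 0; key < totalKeys; key++) {
            partitions.get(key % numReducers).add(key);
        }
        int total = 0;
        for (List<Integer> partition : partitions) {
            int count = 0; // counter local to this reducer
            for (int key : partition) {
                if (count++ >= limitPerReducer) {
                    break; // this reducer stops; the others are unaffected
                }
                total++;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(totalProcessed(100, 2, 10)); // 20
        System.out.println(totalProcessed(100, 1, 10)); // 10
    }
}
```

With a single reducer the internal counter is enough. In a real job that would typically mean calling Job.setNumReduceTasks(1) in the driver, at the cost of losing reduce-side parallelism.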
    public static class Reduce extends Reducer<IntWritable, IntWritable, IntWritable, IntWritable> {
        // Maximum number of keys to reduce. 20 is an assumed value;
        // the original answer leaves n undefined.
        private static final int n = 20;

        // reduce method here

        // Override run() so that the loop over keys itself can stop early.
        @Override
        public void run(Context context) throws IOException, InterruptedException {
            setup(context);
            int count = 0;
            while (context.nextKey()) {
                if (count++ < n) {
                    reduce(context.getCurrentKey(), context.getValues(), context);
                } else {
                    // exit or do whatever you want
                    break;
                }
            }
            cleanup(context);
        }
    }
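The difference between returning from reduce() and stopping the loop in run() can be illustrated without a cluster. The following is a minimal plain-Java sketch, not Hadoop code: ReduceLoopDemo, defaultRun and limitedRun are hypothetical names, and the list of integers stands in for the sorted keys the framework would supply.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the reducer lifecycle; no Hadoop classes are used.
class ReduceLoopDemo {
    private int count = 0;
    final List<Integer> written = new ArrayList<>();

    // Stand-in for reduce(): emits the key, then returns once the limit is hit.
    void reduce(int key, int limit) {
        written.add(key);
        count++;
        if (count >= limit) {
            return; // ends only this call, not the loop that invoked it
        }
    }

    // Mimics a default-style run(): reduce() is called for every key.
    static List<Integer> defaultRun(List<Integer> keys, int limit) {
        ReduceLoopDemo reducer = new ReduceLoopDemo();
        for (int key : keys) {
            reducer.reduce(key, limit);
        }
        return reducer.written;
    }

    // Mimics the overridden run(): the loop stops after `limit` keys.
    static List<Integer> limitedRun(List<Integer> keys, int limit) {
        ReduceLoopDemo reducer = new ReduceLoopDemo();
        int handled = 0;
        for (int key : keys) {
            if (handled++ >= limit) {
                break;
            }
            reducer.reduce(key, limit);
        }
        return reducer.written;
    }

    public static void main(String[] args) {
        List<Integer> keys = Arrays.asList(1, 2, 3, 4, 5, 6);
        System.out.println(defaultRun(keys, 3)); // [1, 2, 3, 4, 5, 6]
        System.out.println(limitedRun(keys, 3)); // [1, 2, 3]
    }
}
```

In the first variant every key is still emitted, which matches the behaviour described in the question; only the second variant limits the output to the first three keys.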