Job in state DEFINE instead of RUNNING when using counters and ToolRunner in Hadoop
I am trying to iterate with MapReduce. I have 3 sequential jobs running:
static enum UpdateCounter {
INCOMING_ATTR
}
public static void main(String[] args) throws Exception {
Configuration conf = new Configuration();
int res = ToolRunner.run(conf, new Driver(), args);
System.exit(res);
}
@Override
public int run(String[] args) throws Exception {
while(counter >= 0){
Configuration conf = getConf();
/*
* Job 1:
*/
Job job1 = new Job(conf, "");
//other configuration
job1.setMapperClass(ID3ClsLabelMapper.class);
job1.setReducerClass(ID3ClsLabelReducer.class);
Path in = new Path(args[0]);
Path out1 = new Path(CL);
if(counter == 0){
FileInputFormat.addInputPath(job1, in);
}
else{
FileInputFormat.addInputPath(job1, out5);
}
FileInputFormat.addInputPath(job1, in);
FileOutputFormat.setOutputPath(job1,out1);
job1.waitForCompletion(true);
/*
* Job 2:
*
*/
Configuration conf2 = getConf();
Job job2 = new Job(conf2, "");
Path out2 = new Path(ANC);
FileInputFormat.addInputPath(job2, in);
FileOutputFormat.setOutputPath(job2,out2);
job2.waitForCompletion(true);
/*
* Job3
*/
Configuration conf3 = getConf();
Job job3 = new Job(conf3, "");
System.out.println("conf3");
Path out5 = new Path(args[1]);
if(fs.exists(out5)){
fs.delete(out5, true);
}
FileInputFormat.addInputPath(job3,out2);
FileOutputFormat.setOutputPath(job3,out5);
job3.waitForCompletion(true);
FileInputFormat.addInputPath(job3,new Path(args[0]));
FileOutputFormat.setOutputPath(job3,out5);
job3.waitForCompletion(true);
counter = job3.getCounters().findCounter(UpdateCounter.INCOMING_ATTR).getValue();
}
return 0;
Job 3 reducer:
public class ID3GSReducer extends Reducer<NullWritable, Text, NullWritable, Text>{
public static final String UpdateCounter = null;
NullWritable out = NullWritable.get();
public void reduce(NullWritable key,Iterable<Text> values ,Context context) throws IOException, InterruptedException{
for(Text val : values){
String v = val.toString();
context.getCounter(UpdateCounter.INCOMING_ATTR).increment(1);
context.write(out, new Text(v));
}
}
}
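Now, how do I iterate the above jobs? All 3 jobs should keep running until INCOMING_ATTR == 0, and the output of job3 (args[1]) should become the input of job1 for the second iteration. Please suggest.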
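Am I doing something wrong?

An IllegalStateException is thrown when:
1: you have a reducer that increments a counter, and you try to use job.setNumReduceTasks(<counter value>) in the driver code;
2: you try to set the current working directory for the default file system;
3: there is a problem with the InputFormat.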
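The iteration the question asks for (repeat the three jobs until INCOMING_ATTR is 0, feeding job3's output back into job1) can be sketched in plain Java without any Hadoop classes. This is only an illustration of the control flow: `Pipeline`, `iterate`, `maxIterations` and the `iter-N` directory naming are hypothetical names introduced here, and `Pipeline.run` stands in for "run job1, job2 and job3, then return job3's counter value". Since the reducer only ever increments the counter, it is never negative, so a `counter >= 0` loop condition would not end the loop; the sketch stops when the counter reaches 0 instead.

```java
import java.util.ArrayList;
import java.util.List;

/** Pure-Java sketch of the driver loop; no Hadoop classes are used. */
final class IterativeDriverSketch {

    /**
     * Hypothetical stand-in for one pass of job1 -> job2 -> job3.
     * Returns the value of job3's INCOMING_ATTR counter for that pass.
     */
    interface Pipeline {
        long run(String input, String output);
    }

    /**
     * Runs the pipeline until it reports a counter of 0 or maxIterations is hit.
     * Returns the output directory used by each pass, in order.
     */
    static List<String> iterate(String initialInput, String outputBase,
                                Pipeline pipeline, int maxIterations) {
        List<String> outputs = new ArrayList<>();
        String input = initialInput;
        for (int i = 0; i < maxIterations; i++) {
            // A fresh output directory per pass avoids deleting the data
            // that the next pass still has to read.
            String output = outputBase + "/iter-" + i;
            long incomingAttr = pipeline.run(input, output);
            outputs.add(output);
            if (incomingAttr == 0) {
                break; // nothing left to update, stop iterating
            }
            input = output; // this pass's output is the next pass's input
        }
        return outputs;
    }

    public static void main(String[] args) {
        long[] remaining = {2}; // stub counter that shrinks on every pass
        List<String> outputs = iterate("input", "output",
                (in, out) -> remaining[0]--, 10);
        System.out.println(outputs);
    }
}
```

In a real driver, the body of `Pipeline.run` would build and submit the three jobs for that pass and read the counter only after the third job has finished; the upper bound on iterations is a safety net in case the counter never reaches 0.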
14/06/12 10:12:30 INFO mapred.JobClient: Virtual memory (bytes) snapshot=0
14/06/12 10:12:30 INFO mapred.JobClient: Total committed heap usage (bytes)=1238630400
conf3
Exception in thread "main" java.lang.IllegalStateException: Job in state DEFINE instead of RUNNING
at org.apache.hadoop.mapreduce.Job.ensureState(Job.java:116)
at org.apache.hadoop.mapreduce.Job.getCounters(Job.java:491)
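The trace shows `Job.getCounters` calling `Job.ensureState`, which rejects the call because the job object is still in state DEFINE. The following is a simplified model of that kind of state check, not Hadoop's actual source: the class `JobStateSketch` and the methods `submit` and `getCounterValue` are hypothetical, and only the two state names and the message format are taken from the log above.

```java
/** Simplified model of a job object that guards calls by its state. */
final class JobStateSketch {

    enum JobState { DEFINE, RUNNING }

    private JobState state = JobState.DEFINE; // a new job starts in DEFINE
    private long incomingAttr = 0;

    /** Throws if the job is not in the expected state. */
    private void ensureState(JobState expected) {
        if (state != expected) {
            throw new IllegalStateException(
                    "Job in state " + state + " instead of " + expected);
        }
    }

    /** Hypothetical submit step: only allowed once, moves the job to RUNNING. */
    void submit() {
        ensureState(JobState.DEFINE);
        state = JobState.RUNNING;
    }

    /** Hypothetical counter read: only allowed after the job was submitted. */
    long getCounterValue() {
        ensureState(JobState.RUNNING);
        return incomingAttr;
    }

    public static void main(String[] args) {
        JobStateSketch job = new JobStateSketch();
        try {
            job.getCounterValue(); // read before submit
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
        job.submit();
        System.out.println(job.getCounterValue());
    }
}
```

Under this model, reading a counter from a job object that was never submitted reproduces the message in the log, which suggests checking that the counter is read from the same job instance that was actually submitted and completed.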