How to use JobControl in Hadoop

I want to merge two files into one. I wrote two mappers to read them and one reducer to join them.

    // Old (org.apache.hadoop.mapred) API; Job and JobControl come from
    // org.apache.hadoop.mapred.jobcontrol.

    // first mapper config
    JobConf classifiedConf = new JobConf(new Configuration());
    classifiedConf.setJarByClass(myjob.class);
    classifiedConf.setJobName("classifiedjob");
    FileInputFormat.setInputPaths(classifiedConf, classifiedInputPath);
    classifiedConf.setMapperClass(ClassifiedMapper.class);
    classifiedConf.setMapOutputKeyClass(TextPair.class);
    classifiedConf.setMapOutputValueClass(Text.class);
    Job classifiedJob = new Job(classifiedConf);

    // second mapper config
    JobConf featureConf = new JobConf(new Configuration());
    featureConf.setJobName("featureJob");
    featureConf.setJarByClass(myjob.class);
    FileInputFormat.setInputPaths(featureConf, featuresInputPath);
    featureConf.setMapperClass(FeatureMapper.class);
    featureConf.setMapOutputKeyClass(TextPair.class);
    featureConf.setMapOutputValueClass(Text.class);
    Job featureJob = new Job(featureConf);

    // reducer config
    JobConf joinConf = new JobConf(new Configuration());
    joinConf.setJobName("joinJob");
    joinConf.setJarByClass(myjob.class);
    joinConf.setReducerClass(JoinReducer.class);
    joinConf.setOutputKeyClass(Text.class);
    joinConf.setOutputValueClass(Text.class);
    Job joinJob = new Job(joinConf);

    // JobControl config: joinJob runs only after both mapper jobs succeed
    joinJob.addDependingJob(featureJob);
    joinJob.addDependingJob(classifiedJob);
    // secondJob is not defined anywhere in this snippet
    secondJob.addDependingJob(joinJob);
    JobControl jobControl = new JobControl("jobControl");
    jobControl.addJob(classifiedJob);
    jobControl.addJob(featureJob);
    jobControl.addJob(joinJob);   // was missing: a job that is never added is never run
    jobControl.addJob(secondJob);

    Thread thread = new Thread(jobControl);
    thread.start();
    // Wait until every job has finished, then stop the controller once.
    // (The original condition lacked the negation, so the loop never ran.)
    while (!jobControl.allFinished()) {
        Thread.sleep(1000);
    }
    jobControl.stop();
However, I get these messages:

    WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
    WARN mapred.JobClient: No job jar file set.  User classes may not be found. See JobConf(Class) or JobConf#setJar(String).

Can anyone help?

Which version of Hadoop are you using?

Do the warnings you are getting actually stop the program?

You don't need to use setJarByClass(). Take a look at my code snippet: I can run it without calling setJarByClass().


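You should run your job in the following way:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MyApp extends Configured implements Tool {

    public int run(String[] args) throws Exception {
      // Configuration processed by ToolRunner
      Configuration conf = getConf();

      // Create a JobConf using the processed conf
      JobConf job = new JobConf(conf, MyApp.class);

      // Process custom command-line options
      Path in = new Path(args[1]);
      Path out = new Path(args[2]);

      // Specify various job-specific parameters
      job.setJobName("my-app");
      // The documentation's job.setInputPath/setOutputPath calls are written
      // here with FileInputFormat/FileOutputFormat, as in the question's code
      FileInputFormat.setInputPaths(job, in);
      FileOutputFormat.setOutputPath(job, out);
      job.setMapperClass(MyMapper.class);
      job.setReducerClass(MyReducer.class);

      // Submit the job, then poll for progress until the job is complete
      JobClient.runJob(job);
      return 0;
    }

    public static void main(String[] args) throws Exception {
      // Let ToolRunner handle generic command-line options
      int res = ToolRunner.run(new Configuration(), new MyApp(), args);

      System.exit(res);
    }
}
Apart from the imports and the FileInputFormat/FileOutputFormat calls, this comes straight from the Hadoop documentation.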


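So basically your job class needs to extend Configured and implement Tool, which forces you to implement run(). Then launch the job from your main class with ToolRunner.run(), and the GenericOptionsParser warning should go away.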
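To make it clearer what "letting ToolRunner handle generic options" buys you, here is a rough plain-Java sketch of the idea: options such as -D key=value are pulled out into configuration properties, and only the remaining arguments reach run(). GenericOptionsSketch is a hypothetical stand-in written for this illustration, not Hadoop's GenericOptionsParser, and it only handles the -D form.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Simplified, hypothetical stand-in for generic-option handling; not Hadoop code. */
final class GenericOptionsSketch {
    /** Properties collected from -D options, in the order they were given. */
    final Map<String, String> properties = new LinkedHashMap<>();
    /** Arguments left over for the application's run(String[]) method. */
    final List<String> remainingArgs = new ArrayList<>();

    GenericOptionsSketch(String[] args) {
        for (int i = 0; i < args.length; i++) {
            String arg = args[i];
            String pair = null;
            if (arg.equals("-D") && i + 1 < args.length) {
                pair = args[++i];              // form: -D key=value
            } else if (arg.startsWith("-D") && arg.length() > 2) {
                pair = arg.substring(2);       // form: -Dkey=value
            }
            if (pair == null) {
                remainingArgs.add(arg);        // not a -D option: pass it through
                continue;
            }
            int eq = pair.indexOf('=');
            if (eq > 0) {                      // pairs without '=' are ignored in this sketch
                properties.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        }
    }

    public static void main(String[] args) {
        GenericOptionsSketch parsed = new GenericOptionsSketch(
                new String[] {"-D", "mapred.reduce.tasks=2", "in", "out"});
        System.out.println(parsed.properties);     // {mapred.reduce.tasks=2}
        System.out.println(parsed.remainingArgs);  // [in, out]
    }
}
```

With a real Tool, the properties end up in the Configuration returned by getConf(), so run() only has to deal with its own arguments.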

You need to include this line in your driver:

    job.setJarByClass(MapperClassName.class);
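Separately from the warnings: the driver in the question runs JobControl on its own thread and then polls allFinished(). Since JobControl's run() keeps looping until stop() is called, the usual shape is to wait while the jobs are not all finished and to call stop() once afterwards. The sketch below shows only that shape. ControllerSketch is a hypothetical stand-in that pretends to finish one job per tick; it is not Hadoop's JobControl, and the sleep intervals are arbitrary.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a job controller; it is not Hadoop's JobControl. */
final class ControllerSketch implements Runnable {
    private final AtomicInteger pending;        // jobs that have not completed yet
    private volatile boolean running = true;    // cleared by stop()

    ControllerSketch(int jobCount) {
        pending = new AtomicInteger(jobCount);
    }

    /** Keeps looping until stop() is called, even after all jobs are done. */
    @Override
    public void run() {
        while (running) {
            if (pending.get() > 0) {
                pending.decrementAndGet();      // pretend one job just finished
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    boolean allFinished() {
        return pending.get() == 0;
    }

    void stop() {
        running = false;
    }

    /** Starts the controller, waits for all jobs, then stops it exactly once. */
    static boolean runToCompletion(ControllerSketch controller) {
        Thread thread = new Thread(controller);
        thread.start();
        try {
            while (!controller.allFinished()) { // note the negation
                Thread.sleep(20);
            }
            controller.stop();                  // without this, run() never returns
            thread.join();
        } catch (InterruptedException e) {
            controller.stop();
            Thread.currentThread().interrupt();
        }
        return controller.allFinished();
    }

    public static void main(String[] args) {
        System.out.println("all finished: " + runToCompletion(new ControllerSketch(3)));
    }
}
```

If the condition is written without the negation, the loop body is skipped on the first check, stop() is never called, and the controller thread keeps the JVM alive.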
