Multiple ways to write the driver of a Hadoop program: which one to choose?

I have noticed that there are several ways to write the driver method of a Hadoop program.

The first method is the one given by Yahoo (its code is not reproduced here).

The second method is given in the O'Reilly book Hadoop: The Definitive Guide (2012):

public static void main(String[] args) throws Exception {
  if (args.length != 2) {
    System.err.println("Usage: MaxTemperature <input path> <output path>");
    System.exit(-1);
  }
  Job job = new Job();
  job.setJarByClass(MaxTemperature.class);
  job.setJobName("Max temperature");
  FileInputFormat.addInputPath(job, new Path(args[0]));
  FileOutputFormat.setOutputPath(job, new Path(args[1]));
  job.setMapperClass(MaxTemperatureMapper.class);
  job.setReducerClass(MaxTemperatureReducer.class);
  job.setOutputKeyClass(Text.class);
  job.setOutputValueClass(IntWritable.class);
  System.exit(job.waitForCompletion(true) ? 0 : 1);
}

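While trying the program given in the O'Reilly book, I found that the constructors of the Job class are deprecated. Since the O'Reilly book is based on Hadoop 2 (YARN), I was surprised to see it use a deprecated constructor.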

Which method does everyone use?

I use the former approach. If we override the run() method, we can use the hadoop jar options such as -D, -libjars and -files. These are needed in almost any Hadoop project.

I am not sure whether they can be used through the main() method.

A slight variation on your first (Yahoo) block: you should use the ToolRunner / Tool classes, which make use of the GenericOptionsParser (as noted in Eswara's answer).

A template pattern would look something like this:

import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class ToolExample extends Configured implements Tool {

    @Override
    public int run(String[] args) throws Exception {
        // old API
        JobConf jobConf = new JobConf(getConf());

        // new API (in Hadoop 2, Job.getInstance replaces the deprecated Job constructors)
        Job job = Job.getInstance(getConf());

        // rest of your config here

        // determine success / failure (depending on your choice of old / new api)
        // return 0 for success, non-zero for an error
        return 0;
    }

    public static void main(String args[]) throws Exception {
        System.exit(ToolRunner.run(new ToolExample(), args));
    }
}
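
To make concrete what the ToolRunner route adds over a bare main(), here is a rough, Hadoop-free sketch of the flow. MiniTool, MiniToolRunner and MaxTemperatureSketch are hypothetical names invented for this illustration, not Hadoop classes. The sketch understands only leading -D options, whereas the real GenericOptionsParser also handles options such as -libjars and -files.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical driver, loosely modelled on the MaxTemperature example above.
class MaxTemperatureSketch implements MiniTool {
    private Map<String, String> conf = new LinkedHashMap<>();

    @Override
    public void setConf(Map<String, String> conf) {
        this.conf = conf;
    }

    public Map<String, String> getConf() {
        return conf;
    }

    @Override
    public int run(String[] args) {
        // Only application arguments reach this point; -D pairs are already in conf.
        if (args.length != 2) {
            System.err.println(
                "Usage: MaxTemperatureSketch [-D key=value ...] <input path> <output path>");
            return -1;
        }
        System.out.println("input=" + args[0] + ", output=" + args[1] + ", conf=" + conf);
        return 0;
    }

    public static void main(String[] args) {
        System.exit(MiniToolRunner.run(new MaxTemperatureSketch(), args));
    }
}

// Hypothetical stand-in for org.apache.hadoop.util.Tool; not the real interface.
interface MiniTool {
    void setConf(Map<String, String> conf);

    int run(String[] args);
}

// Hypothetical stand-in for ToolRunner.
// It understands only leading "-D key=value" and "-Dkey=value" options.
final class MiniToolRunner {
    private MiniToolRunner() { }

    static int run(MiniTool tool, String[] args) {
        Map<String, String> conf = new LinkedHashMap<>();
        int i = 0;
        // Consume -D options until the first argument that is not one.
        while (i < args.length && args[i].startsWith("-D")) {
            String pair;
            if (args[i].equals("-D")) {
                if (i + 1 >= args.length) {
                    break; // dangling -D: pass it through as an application argument
                }
                pair = args[i + 1];
                i += 2;
            } else {
                pair = args[i].substring(2);
                i += 1;
            }
            int eq = pair.indexOf('=');
            if (eq > 0) {
                conf.put(pair.substring(0, eq), pair.substring(eq + 1));
            } // pairs without '=' are silently ignored in this sketch
        }
        // Hand the collected settings to the tool, then run it with what is left.
        tool.setConf(conf);
        return tool.run(Arrays.copyOfRange(args, i, args.length));
    }
}
```

Run with the arguments `-D mapreduce.job.reduces=2 in out`, the sketch passes only the two paths to run(). The book's main() would receive all four tokens unparsed and fail its `args.length != 2` check. As far as I know, real Hadoop likewise expects the generic options to come before the application arguments.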
