
Java: MapReduce job to HBase throws IOException: Pass a Delete or a Put


Using Hadoop 2.4.0 and HBase 0.94.18 on EMR, I am trying to write output directly from the mapper into an HBase table.

I am getting a nasty IOException: Pass a Delete or a Put when executing the code below.

public class TestHBase {
  static class ImportMapper 
            extends Mapper<MyKey, MyValue, ImmutableBytesWritable, Writable> {
    private byte[] family = Bytes.toBytes("f");

    @Override
    public void map(MyKey key, MyValue value, Context context) {
      MyItem item = //do some stuff with key/value and create item
      byte[] rowKey = Bytes.toBytes(item.getKey());
      Put put = new Put(rowKey);
      for (String attr : Arrays.asList("a1", "a2", "a3")) {
        byte[] qualifier = Bytes.toBytes(attr);
        put.add(family, qualifier, Bytes.toBytes(item.get(attr)));
      }
      context.write(new ImmutableBytesWritable(rowKey), put);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    String input = args[0];
    String table = "table";
    Job job = Job.getInstance(conf, "stuff");

    job.setJarByClass(ImportMapper.class);
    job.setInputFormatClass(SequenceFileInputFormat.class);
    FileInputFormat.setInputDirRecursive(job, true);
    FileInputFormat.addInputPath(job, new Path(input));

    TableMapReduceUtil.initTableReducerJob(
            table,                  // output table
            null,                   // reducer class
            job);
    job.setNumReduceTasks(0);
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
Does anyone know what I am doing wrong?

Stacktrace


Error: java.io.IOException: Pass a Delete or a Put
    at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:125)
    at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:84)
    at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:646)
    at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
    at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
    at org.apache.hadoop.mapreduce.Mapper.map(Mapper.java:124)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:775)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Container killed by the ApplicationMaster. Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143

If you can show the full stack trace it will be much easier to help you solve the problem. I have not executed your code, but from what I can see this may be the issue:
job.setNumReduceTasks(0);

With zero reduce tasks, the mapper is expected to write your Put objects directly to Apache HBase. You could increase setNumReduceTasks or, if you look at the API, find its default value and comment that line out.
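
For background on why the message names only those two types: it comes from TableOutputFormat's record writer, which accepts nothing but Put or Delete values. The snippet below is a rough paraphrase written for this answer, not the exact HBase 0.94 source, to illustrate that check.

import java.io.IOException;

import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.io.Writable;

// Rough paraphrase of the check performed in
// TableOutputFormat$TableRecordWriter.write(): any value that is neither a Put
// nor a Delete is rejected with exactly the IOException seen above.
class PassDeleteOrPutSketch {
  private final HTable table;

  PassDeleteOrPutSketch(HTable table) {
    this.table = table;
  }

  void write(Writable value) throws IOException {
    if (value instanceof Put) {
      table.put((Put) value);          // the row key travels inside the Put itself
    } else if (value instanceof Delete) {
      table.delete((Delete) value);
    } else {
      throw new IOException("Pass a Delete or a Put");
    }
  }
}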

Thanks for adding the stack trace. Unfortunately you did not include the code that throws the exception, so I could not trace it all the way through for you. Instead I did some searching and found a few things for you.

Your stack trace is similar to another one, so the problem is likely this:

That person got past it by commenting out
job.setNumReduceTasks(0);

There is a similar SO question with the same exception where that did not solve the problem; there, the issue turned out to be with the annotations instead:


Below are some good examples of working code both with setNumReduceTasks at 0 and at 1 or more.

"51.2. HBase MapReduce Read/Write Example. The following is an example of using HBase both as a source and as a sink with MapReduce. This example will simply copy data from one table to another.

Configuration config = HBaseConfiguration.create();
Job job = new Job(config,"ExampleReadWrite");
job.setJarByClass(MyReadWriteJob.class);    // class that contains mapper

Scan scan = new Scan();
scan.setCaching(500);        // 1 is the default in Scan, which will be bad for MapReduce jobs
scan.setCacheBlocks(false);  // don't set to true for MR jobs
// set other scan attrs

TableMapReduceUtil.initTableMapperJob(
  sourceTable,      // input table
  scan,             // Scan instance to control CF and attribute selection
  MyMapper.class,   // mapper class
  null,             // mapper output key
  null,             // mapper output value
  job);
TableMapReduceUtil.initTableReducerJob(
  targetTable,      // output table
  null,             // reducer class
  job);
job.setNumReduceTasks(0);

boolean b = job.waitForCompletion(true);
if (!b) {
    throw new IOException("error with job!");
}
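
The MyMapper class referenced in that driver is not included in the quote. A sketch of a TableMapper that would fit it, copying each scanned row into a Put for the target table, might look like this (written against the 0.94-era API, so treat it as an illustration rather than the guide's exact code):

import java.io.IOException;

import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapper;

// Copies every cell of the scanned row into a Put destined for the target table.
public class MyMapper extends TableMapper<ImmutableBytesWritable, Put> {

  @Override
  public void map(ImmutableBytesWritable row, Result value, Context context)
      throws IOException, InterruptedException {
    Put put = new Put(row.get());
    for (KeyValue kv : value.raw()) {  // raw() returns the row's cells in HBase 0.94
      put.add(kv);                     // re-use the existing KeyValue as-is
    }
    context.write(row, put);
  }
}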
And here is the example with one or more reduce tasks:

"51.4. HBase MapReduce Summary to HBase Example. The following example uses HBase as a MapReduce source and sink with a summarization step. This example will count the number of distinct instances of a value in a table and write those summarized counts to another table.

Configuration config = HBaseConfiguration.create();
Job job = new Job(config,"ExampleSummary");
job.setJarByClass(MySummaryJob.class);     // class that contains mapper and reducer

Scan scan = new Scan();
scan.setCaching(500);        // 1 is the default in Scan, which will be bad for MapReduce jobs
scan.setCacheBlocks(false);  // don't set to true for MR jobs
// set other scan attrs

TableMapReduceUtil.initTableMapperJob(
  sourceTable,        // input table
  scan,               // Scan instance to control CF and attribute selection
  MyMapper.class,     // mapper class
  Text.class,         // mapper output key
  IntWritable.class,  // mapper output value
  job);
TableMapReduceUtil.initTableReducerJob(
  targetTable,        // output table
  MyTableReducer.class,    // reducer class
  job);
job.setNumReduceTasks(1);   // at least one, adjust as required

boolean b = job.waitForCompletion(true);
if (!b) {
  throw new IOException("error with job!");
}
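
Here too, MyMapper and MyTableReducer are not part of the quote. A sketch of a reducer matching this driver, summing the counts per key and writing the total as a single cell, could look like the following; the column family "cf" and qualifier "count" are placeholders chosen for the sketch:

import java.io.IOException;

import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableReducer;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;

// Sums the counts emitted by the mapper for each value and writes the total
// to the target table as one cell per row.
public class MyTableReducer extends TableReducer<Text, IntWritable, ImmutableBytesWritable> {
  private static final byte[] CF = Bytes.toBytes("cf");       // placeholder family
  private static final byte[] COUNT = Bytes.toBytes("count"); // placeholder qualifier

  @Override
  public void reduce(Text key, Iterable<IntWritable> values, Context context)
      throws IOException, InterruptedException {
    int sum = 0;
    for (IntWritable val : values) {
      sum += val.get();
    }
    Put put = new Put(Bytes.toBytes(key.toString()));
    put.add(CF, COUNT, Bytes.toBytes(sum));
    // TableOutputFormat takes the row key from the Put, so the output key can be null.
    context.write(null, put);
  }
}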


You seem to be closer to the first example. I just wanted to show that there are sometimes reasons to set the number of reduce tasks to zero.

Comparing the stack trace with the code you provided:
context.write(new ImmutableBytesWritable(rowKey), put)
is outside the map method. Please fix that first, because it does not match what the traceback shows... Thanks for pointing that out Ruben, it was a copy/paste error. Added the stack trace.
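
For reference, a minimal skeleton of the mapper with the context.write call inside map(), the checked exceptions declared (Mapper.map declares IOException and InterruptedException, which context.write needs), and the output value type narrowed from Writable to Put. This is only a sketch: MyKey, MyValue and MyItem are the asker's own classes, createItem is a hypothetical stand-in for the elided construction logic, and it is not confirmed here that the narrower type by itself resolves the exception.

import java.io.IOException;
import java.util.Arrays;

import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.mapreduce.Mapper;

// Same structure as the question's ImportMapper (there a static nested class),
// but with the declared output value type narrowed to Put.
public class ImportMapper extends Mapper<MyKey, MyValue, ImmutableBytesWritable, Put> {
  private final byte[] family = Bytes.toBytes("f");

  @Override
  public void map(MyKey key, MyValue value, Context context)
      throws IOException, InterruptedException {
    MyItem item = createItem(key, value);
    byte[] rowKey = Bytes.toBytes(item.getKey());
    Put put = new Put(rowKey);
    for (String attr : Arrays.asList("a1", "a2", "a3")) {
      put.add(family, Bytes.toBytes(attr), Bytes.toBytes(item.get(attr)));
    }
    context.write(new ImmutableBytesWritable(rowKey), put);
  }

  // Hypothetical stand-in for the question's "do some stuff with key/value and create item".
  private MyItem createItem(MyKey key, MyValue value) {
    throw new UnsupportedOperationException("build the item as in the original job");
  }
}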