Hadoop: Inserting multiple rows into HBase using MapReduce

hadoop mapreduce hbase

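I want to insert N rows into an HBase table from each mapper, in batches. I currently know of two ways to do this:

1) Create a list of Put objects and call the put(List<Put>) method of the HTable instance, making sure the autoFlush parameter is disabled.

2) Use the TableOutputFormat class and the context.write(rowKey, put) method.

Which one is better?

In the first approach, context.write() is not needed, because hTable.put(putsList) writes the data to the table directly. My mapper class extends the Mapper class, so which classes should I use for KEYOUT and VALUEOUT?

In the second approach, I have to call context.write(rowKey, put) N times. Is there any way to call context.write() with a list of Put operations?

Is there any other way to do this with MapReduce?

Thanks in advance.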

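I prefer the second option, where batching comes naturally with MapReduce (no list of Puts is needed). For more detail, see my second point.

1) Your first option, List<Put>, is typically used with a standalone HBase Java client. Internally it is governed by hbase.client.write.buffer, set as below in one of your configuration XML files: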

<property>
    <name>hbase.client.write.buffer</name>
    <value>20971520</value> <!-- 20971520 bytes = 20 MB; the HBase default is 2097152 (2 MB) -->
</property>

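Conclusion: with TableOutputFormat, context.write(rowkey, putlist) is not possible; the API accepts a single mutation per call.

However, the write method hands each mutation to mutator.mutate (see the source excerpt at the end of this answer), and that mutator is a BufferedMutator, which buffers the mutations on the client side.

So with the second option your batching comes naturally (through BufferedMutator), as mentioned above.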

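Why a single mapper? Why not multiple mappers? How do you specify the number of mappers? Even if you specify it, that is only a hint; there is no guarantee that the number of mappers will be one.

You can change the number of mappers with setNumMapTasks or conf.set("mapred.map.tasks", "NumberOfMappersYouWantToSet"), but this is only a hint in the configuration, and there is no guarantee that this many mapper instances will be created. It also depends on the input splits.

Please see my detailed answer, and feel free to ask questions.

Your first approach can also be done in "public class HBasePutOrDeleteMapper extends TableMapper {". I don't know where you got the example from.

On "My mapper class extends the Mapper class, so which classes should I use for KEYOUT and VALUEOUT?": TableOutputFormat ignores the key, and the value must be a Put or a Delete, as its Javadoc and the source of its write method show: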

org.apache.hadoop.hbase.mapreduce
Class TableOutputFormat<KEY>

java.lang.Object
org.apache.hadoop.mapreduce.OutputFormat<KEY,Mutation>
org.apache.hadoop.hbase.mapreduce.TableOutputFormat<KEY>
All Implemented Interfaces:
org.apache.hadoop.conf.Configurable

@InterfaceAudience.Public
@InterfaceStability.Stable
public class TableOutputFormat<KEY>
extends org.apache.hadoop.mapreduce.OutputFormat<KEY,Mutation>
implements org.apache.hadoop.conf.Configurable
Convert Map/Reduce output and write it to an HBase table. The KEY is ignored

/**
 * Writes a key/value pair into the table.
 *
 * @param key  The key.
 * @param value  The value.
 * @throws IOException When writing fails.
 * @see RecordWriter#write(Object, Object)
 */
@Override
public void write(KEY key, Mutation value)
throws IOException {
  if (!(value instanceof Put) && !(value instanceof Delete)) {
    throw new IOException("Pass a Delete or a Put");
  }
  mutator.mutate(value);
}
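To illustrate why calling context.write(rowKey, put) N times still ends up batched, here is a minimal toy model in plain Java. It is a sketch under stated assumptions, not HBase code: ToyBufferedMutator, ToyRecordWriter and ToyBatchingDemo are hypothetical names, a row is reduced to a byte array, every row is assumed to be 1 KiB, and "sending" a batch only increments counters. The buffer size reuses the 20971520 bytes from the XML snippet in this answer.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a buffered mutator: queues rows, sends them in batches. */
final class ToyBufferedMutator {
    private final long writeBufferBytes;              // plays the role of hbase.client.write.buffer
    private final List<byte[]> pending = new ArrayList<>();
    private long pendingBytes = 0;
    private int batchesSent = 0;
    private int rowsSent = 0;

    ToyBufferedMutator(long writeBufferBytes) {
        this.writeBufferBytes = writeBufferBytes;
    }

    /** Queue one row; send the batch once the queued bytes exceed the buffer size. */
    void mutate(byte[] row) {
        pending.add(row);
        pendingBytes += row.length;
        if (pendingBytes > writeBufferBytes) {
            flush();
        }
    }

    /** "Send" everything queued so far (this toy only counts it). */
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        rowsSent += pending.size();
        batchesSent++;
        pending.clear();
        pendingBytes = 0;
    }

    int batchesSent() { return batchesSent; }
    int rowsSent() { return rowsSent; }
}

/** Hypothetical stand-in for the record writer: one write call per row, key ignored. */
final class ToyRecordWriter {
    private final ToyBufferedMutator mutator;

    ToyRecordWriter(ToyBufferedMutator mutator) {
        this.mutator = mutator;
    }

    void write(Object ignoredKey, byte[] row) {
        mutator.mutate(row);
    }

    void close() {
        mutator.flush(); // push out the last partial batch
    }
}

final class ToyBatchingDemo {
    public static void main(String[] args) {
        ToyBufferedMutator mutator = new ToyBufferedMutator(20_971_520L); // 20 MB buffer
        ToyRecordWriter writer = new ToyRecordWriter(mutator);
        byte[] row = new byte[1024]; // assumption: every row is 1 KiB
        for (int i = 0; i < 100_000; i++) {
            writer.write(null, row); // one call per row, like context.write(rowKey, put)
        }
        writer.close();
        System.out.println(mutator.rowsSent() + " rows sent in "
                + mutator.batchesSent() + " batches");
    }
}
```

In this toy run the 100,000 individual write calls are grouped into 5 batches, which is the sense in which the caller does not need to build a list of Puts by hand. Real row sizes and flush behavior in HBase will differ.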