How to create an external table for HBase

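I am able to create an external table in Hive over HBase. Now I need to create an external table with variable columns: the columns of a particular HBase table are not fixed (neither their number nor their names) and can be created dynamically when data is inserted. How do I handle this case?

Summary: how do I create an external table in Hive when the number of columns in the HBase table is not fixed?

Thanks in advance.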

  • Create the table in the HBase shell

    create 'hbase_2_hive_names', 'id', 'name', 'age'

  • Load data into HBase (the input file must be in HDFS)

    export HADOOP_CLASSPATH=$(/usr/local/hbase/bin/hbase classpath)
    $HADOOP_HOME/bin/hadoop jar /usr/local/hbase/hbase-0.94.1.jar importtsv -Dimporttsv.columns=HBASE_ROW_KEY,id:id,name:fn,name:ln,age:age hbase_2_hive_names /var/data/samples/names.tsv

  • Create the external table in the Hive shell

    CREATE EXTERNAL TABLE hbase_hive_names (hbid INT, id INT, fn STRING, ln STRING, age INT)
    STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,id:id,name:fn,name:ln,age:age")
    TBLPROPERTIES ("hbase.table.name" = "hbase_2_hive_names");
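
The Hive column list and the "hbase.columns.mapping" string are matched by position, so a mismatch in count or order is an easy mistake to make. Here is a minimal sketch, in Python, that derives both from one list. The helper names build_mapping and build_ddl are made up for illustration; the table and column names are the ones used in the steps above.

```python
# Sketch: derive the hbase.columns.mapping string and the Hive DDL from one
# list, so the Hive columns and the HBase columns cannot drift apart.
# build_mapping and build_ddl are hypothetical helpers, not part of Hive.

# (hive_column, hive_type, hbase_column); ":key" stands for the HBase row key.
COLUMNS = [
    ("hbid", "INT", ":key"),
    ("id", "INT", "id:id"),
    ("fn", "STRING", "name:fn"),
    ("ln", "STRING", "name:ln"),
    ("age", "INT", "age:age"),
]


def build_mapping(columns):
    """Join the HBase side of each entry with commas and no spaces."""
    return ",".join(hbase for _, _, hbase in columns)


def build_ddl(hive_table, hbase_table, columns):
    """Render a CREATE EXTERNAL TABLE statement for the HBase storage handler."""
    hive_cols = ", ".join(f"{name} {type_}" for name, type_, _ in columns)
    return (
        f"CREATE EXTERNAL TABLE {hive_table} ({hive_cols})\n"
        "STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'\n"
        f'WITH SERDEPROPERTIES ("hbase.columns.mapping" = "{build_mapping(columns)}")\n'
        f'TBLPROPERTIES ("hbase.table.name" = "{hbase_table}");'
    )


if __name__ == "__main__":
    print(build_ddl("hbase_hive_names", "hbase_2_hive_names", COLUMNS))
```

The printed statement can be pasted into the Hive shell. Adding a column then means adding one tuple, which keeps the two sides the same length.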


  • Step 1: Log in to the HBase shell

    hbase shell
    
    Step 2: Create the HBase table

    hbase(main):001:0> create 'hbase_emp_table', [{NAME => 'per', COMPRESSION => 'SNAPPY'}, {NAME => 'prof', COMPRESSION => 'SNAPPY'} ]
    Created table hbase_emp_table
    Took 1.5417 seconds
    => Hbase::Table - hbase_emp_table
    
    Step 3: Describe the HBase table

    hbase(main):002:0> describe 'hbase_emp_table'
    Table hbase_emp_table is ENABLED
    hbase_emp_table
    COLUMN FAMILIES DESCRIPTION
    {NAME => 'per', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING =>
     'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false',
    PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'SNAPPY', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}
    {NAME => 'prof', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING =
    > 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false',
     PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'SNAPPY', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}
    2 row(s)
    Took 0.1846 seconds
    
    Step 4: Insert data into the HBase table

    put 'hbase_emp_table','1','per:name','Ranga Reddy'
    put 'hbase_emp_table','1','per:age','32'
    put 'hbase_emp_table','1','prof:des','Senior Software Engineer'
    put 'hbase_emp_table','1','prof:sal','50000'
    
    put 'hbase_emp_table','2','per:name','Nishanth Reddy'
    put 'hbase_emp_table','2','per:age','3'
    put 'hbase_emp_table','2','prof:des','Software Engineer'
    put 'hbase_emp_table','2','prof:sal','80000'
    
    Step 5: Check the HBase table data

    hbase(main):012:0> scan 'hbase_emp_table'
    ROW                                             COLUMN+CELL
     1                                              column=per:age, timestamp=1606304606241, value=32
     1                                              column=per:name, timestamp=1606304606204, value=Ranga Reddy
     1                                              column=prof:des, timestamp=1606304606269, value=Senior Software Engineer
     1                                              column=prof:sal, timestamp=1606304606301, value=50000
     2                                              column=per:age, timestamp=1606304606362, value=3
     2                                              column=per:name, timestamp=1606304606338, value=Nishanth Reddy
     2                                              column=prof:des, timestamp=1606304606387, value=Software Engineer
     2                                              column=prof:sal, timestamp=1606304608374, value=80000
    2 row(s)
    Took 0.0513 seconds
    
    Step 6: Log in to the Hive shell using hive or beeline

    hive
    
    Step 7: Create the Hive table

    CREATE EXTERNAL TABLE IF NOT EXISTS hive_emp_table(id INT, name STRING, age SMALLINT, designation STRING, salary BIGINT) 
    STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' 
    WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,per:name,per:age,prof:des,prof:sal") 
    TBLPROPERTIES("hbase.table.name" = "hbase_emp_table");
    
    Step 8: Select data from the Hive table

    hive> select * from hive_emp_table;
    INFO  : OK
    +--------------------+----------------------+---------------------+-----------------------------+------------------------+
    | hive_emp_table.id  | hive_emp_table.name  | hive_emp_table.age  | hive_emp_table.designation  | hive_emp_table.salary  |
    +--------------------+----------------------+---------------------+-----------------------------+------------------------+
    | 1                  | Ranga Reddy          | 32                  | Senior Software Engineer    | 50000                  |
    | 2                  | Nishanth Reddy       | 3                   | Software Engineer           | 80000                  |
    +--------------------+----------------------+---------------------+-----------------------------+------------------------+
    2 rows selected (17.401 seconds)
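
To make the mapping concrete, here is a rough Python sketch of how ":key,per:name,per:age,prof:des,prof:sal" lines up with the Hive columns of hive_emp_table. It only imitates the idea and is not the storage handler's actual code. The function name to_hive_row is invented for this example, and Python's int and str stand in for the Hive types.

```python
# Sketch: emulate how the column mapping turns HBase cells (stored as strings
# in this walkthrough) into one typed Hive row.
# This is an illustration only, not the storage handler's real code.

MAPPING = ":key,per:name,per:age,prof:des,prof:sal"
# Hive column names, with Python stand-ins for INT, STRING, SMALLINT, STRING, BIGINT.
HIVE_COLUMNS = [("id", int), ("name", str), ("age", int),
                ("designation", str), ("salary", int)]

# The cells from the HBase scan, keyed by row key, then by family:qualifier.
HBASE_ROWS = {
    "1": {"per:name": "Ranga Reddy", "per:age": "32",
          "prof:des": "Senior Software Engineer", "prof:sal": "50000"},
    "2": {"per:name": "Nishanth Reddy", "per:age": "3",
          "prof:des": "Software Engineer", "prof:sal": "80000"},
}


def to_hive_row(row_key, cells):
    """Pair each mapping entry with its Hive column, by position."""
    row = {}
    for (name, cast), source in zip(HIVE_COLUMNS, MAPPING.split(",")):
        raw = row_key if source == ":key" else cells.get(source)
        # A cell that is absent in HBase is modelled as NULL (None here).
        row[name] = None if raw is None else cast(raw)
    return row


if __name__ == "__main__":
    for key, cells in HBASE_ROWS.items():
        print(to_hive_row(key, cells))
```

Note that a fixed mapping like this only exposes the qualifiers it lists; any other qualifier written to 'per' or 'prof' later stays invisible to this Hive table.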
    

    How do you think this works? I found a solution and am posting it here too, in case it helps others:

    create external table shashwat (key int, value map)
    STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,demo:")
    TBLPROPERTIES ("hbase.table.name" = "hbase_shashwat");
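
The idea behind a mapping entry like "demo:" (a column family with no qualifier) is that the whole family becomes one Hive MAP column whose keys are the qualifiers, so each row can carry a different set of columns. A small Python sketch of that behaviour follows. The sample cells and the helper names family_as_map and to_hive_rows are hypothetical, and the integer value type is an assumption, since the MAP type parameters are not shown in the statement above. As far as I know, Hive expects the MAP key type to be a string in this kind of mapping.

```python
# Sketch: emulate a mapping entry that names only a column family ("demo:").
# Every qualifier in that family becomes a key of one Hive MAP value, so rows
# may carry different sets of columns.

FAMILY = "demo"

# Hypothetical cells: each row has a different set of qualifiers.
HBASE_ROWS = {
    "1": {"demo:a": "10", "demo:b": "20"},
    "2": {"demo:c": "30"},
    "3": {"demo:a": "5", "other:x": "ignored"},
}


def family_as_map(cells, family):
    """Collect the family's qualifiers into a dict, dropping the family prefix."""
    prefix = family + ":"
    return {
        column[len(prefix):]: int(value)  # int is an assumed MAP value type
        for column, value in cells.items()
        if column.startswith(prefix)
    }


def to_hive_rows(rows, family):
    """Yield (key, value) pairs shaped like the two-column Hive table."""
    for row_key in sorted(rows, key=int):
        yield int(row_key), family_as_map(rows[row_key], family)


if __name__ == "__main__":
    for key, value in to_hive_rows(HBASE_ROWS, FAMILY):
        print(key, value)
```

On the Hive side, a single dynamic column is then read with the map subscript syntax, for example value['a'], which returns NULL for rows that do not have that qualifier.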