How do I get Pig to store rows in HBase as strings rather than bytes?

If I use the HBase shell and issue:

put 'test', 'rowkey1','cf:foo', 'bar'
scan 'test'

I see the result as strings, not bytes.

If I use happybase and issue:

import happybase
connection = happybase.Connection('<hostname>')
table = connection.table('test')
table.put('rowkey2', {'cf:foo': 'bar'})
for row in table.scan():
    print(row)

However, if I issue the following in Pig:

A = LOAD 'aggregation_test' USING PigStorage(',') as (device_id:chararray, device_name:chararray, device_sum:int);
STORE A INTO 'hbase://aggregation_test'
USING org.apache.pig.backend.hadoop.hbase.HBaseStorage(
    'cf:device_name, cf:device_sum');
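The scan results in both the HBase shell and happybase are bytes, not strings.

I can't even search for a row key as a string.

How can I use Pig and HBaseStorage to store data from HDFS into HBase as strings rather than bytes?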

I suspect the problem is in the source data rather than in the Pig process itself.

Why not copy the source data to local disk and inspect it? Something like:

hadoop fs -copyToLocal /<>/aggregation_test /tmp/aggregation_test
cat /tmp/aggregation_test/*
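
Reading the output of cat by eye can miss stray control characters, so the inspection could also be scripted. The following is only a sketch under assumptions: the function name find_suspect_lines is hypothetical, the three comma-separated fields are taken from the LOAD statement in the question, and the demo runs on a throwaway file rather than the real /tmp/aggregation_test data.

```python
import os
import tempfile

# Assumption: three comma-separated fields, as declared in the question's
# LOAD statement (device_id, device_name, device_sum).
EXPECTED_FIELDS = 3


def find_suspect_lines(path, delimiter=b",", expected_fields=EXPECTED_FIELDS):
    """Return (line_number, reason) pairs for lines that look malformed."""
    problems = []
    # Read raw bytes so that nothing is decoded or replaced silently.
    with open(path, "rb") as handle:
        for number, raw in enumerate(handle, start=1):
            line = raw.rstrip(b"\r\n")
            if len(line.split(delimiter)) != expected_fields:
                problems.append((number, "unexpected field count"))
            elif any(byte < 0x20 or byte > 0x7E for byte in line):
                problems.append((number, "non-printable or non-ASCII byte"))
    return problems


# Demo on a temporary file with one clean line and two deliberately bad ones.
with tempfile.NamedTemporaryFile("wb", suffix=".csv", delete=False) as tmp:
    tmp.write(b"d1,thermostat,42\n")
    tmp.write(b"d2,sensor\x00,7\n")
    tmp.write(b"d3,camera\n")
    demo_path = tmp.name

report = find_suspect_lines(demo_path)
os.remove(demo_path)
print(report)
```

An empty list for the real file would suggest the source data is plain printable text, which would point away from the input as the cause.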

Another check: does the number of rows in HBase match what you expect?
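
One possible way to do the row-count check from Python is to iterate a full scan and count the results. This is a rough sketch: count_rows and FakeTable are hypothetical names, and FakeTable only stands in for a happybase table so the example runs without a cluster.

```python
def count_rows(table):
    """Count rows by iterating a full scan.

    `table` is assumed to expose a happybase-style scan() that yields
    (row_key, data) pairs.
    """
    return sum(1 for _ in table.scan())


class FakeTable:
    """Hypothetical stand-in for a happybase table, used only for this demo."""

    def __init__(self, rows):
        self._rows = rows

    def scan(self):
        return iter(self._rows.items())


fake = FakeTable({
    b"d1": {b"cf:device_name": b"thermostat", b"cf:device_sum": b"42"},
    b"d2": {b"cf:device_name": b"sensor", b"cf:device_sum": b"7"},
})

source_line_count = 2  # assumed number of lines in the source file
hbase_row_count = count_rows(fake)
print(hbase_row_count == source_line_count)
```

Against a real cluster, the same function could be called with connection.table('aggregation_test'). Keep in mind that source lines sharing a device_id end up in a single HBase row, so a lower count in HBase is not necessarily an error.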

Have you tried the HBaseBinaryConverter option? Something like:

store CompleteCases_f into 'hbase://user_test' using
    org.apache.pig.backend.hadoop.hbase.HBaseStorage(
        'id:DEFAULT id:last_modified birth:year gender:female gender:male','-caster HBaseBinaryConverter'
);
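
To make the effect of a caster choice concrete, here is a small illustration in plain Python that needs no HBase at all. It rests on two assumptions: that a text-oriented caster writes an int as its decimal digits in UTF-8, and that a binary caster writes it as a 4-byte big-endian value in the style of HBase's Bytes.toBytes(int). The helper names are hypothetical.

```python
import struct


def as_text_bytes(value):
    """Encode a value as its decimal text in UTF-8 (assumed text-caster behaviour)."""
    return str(value).encode("utf-8")


def as_binary_int_bytes(value):
    """Encode an int as 4 big-endian bytes (assumed binary-caster behaviour)."""
    return struct.pack(">i", value)


device_sum = 42
text_form = as_text_bytes(device_sum)
binary_form = as_binary_int_bytes(device_sum)

print(text_form)    # b'42'
print(binary_form)  # b'\x00\x00\x00*'
```

A cell holding the second form shows up in a scan as escaped bytes rather than readable digits, so comparing a scanned value against both forms can hint at which encoding was actually written.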