Problems loading data into HBase with a Pig script on Hadoop


I have written a Pig script as follows:

REGISTER /home/hduser/hbase/lib/zookeeper-*.jar
REGISTER /home/hduser/hbase/lib/hbase-*.jar
REGISTER /home/hduser/hbase/lib/hadoop*.jar
REGISTER /home/hduser/pig/lib/hbase-0.94.1.jar
REGISTER /home/hduser/pig/lib/zookeeper-3.4.5.jar
REGISTER /home/hduser/pig/lib/piggybank.jar


STOCK_2008 = LOAD 'hdfs:/user/file.txt' using PigStorage(',') AS (no:int, name:chararray, digit:int);
DUMP STOCK_2008 ;
STORE STOCK_2008 INTO 'hbase:/hi' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('hi_data:name, hi_data:digit');
I am using Pig 0.13.0, HBase 0.98.8, and Hadoop 2.5.1.

The problem I am facing is:

2014-12-25 15:54:54,028 [main] INFO  org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, GroupByConstParallelSetter, LimitOptimizer, LoadTypeCastInserter, MergeFilter, MergeForEach, PartitionFilterOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter], RULES_DISABLED=[FilterLogicExpressionSimplifier]}
2014-12-25 15:54:56,637 [main] INFO  org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2014-12-25 15:54:56,744 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2014-12-25 15:54:56,753 [main] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2014-12-25 15:54:56,757 [main] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2014-12-25 15:54:57,585 [main] INFO  org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - Saved output of task 'attempt__0001_m_000001_1' to hdfs://masternode/tmp/temp-1402587382/tmp-1688408384/_temporary/0/task__0001_m_000001
2014-12-25 15:54:57,720 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2014-12-25 15:54:57,725 [main] WARN  org.apache.pig.data.SchemaTupleBackend - SchemaTupleBackend has already been initialized
2014-12-25 15:54:57,731 [main] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2014-12-25 15:54:57,731 [main] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
(1,one,1)
(2,two,2)
(3,three,3)
2014-12-25 15:54:57,802 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS

2014-12-25 15:54:57,902 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. org.apache.hadoop.hbase.util.Bytes.equals([BLjava/nio/ByteBuffer;)Z

Details at logfile: /home/hduser/pig_1419503089087.log

Remove all the REGISTER statements, and use 'hbase://hi' instead of 'hbase:/hi',


because you are running HBase 0.98.8 while importing hbase-0.94.1.jar.
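Putting that advice together, the load-and-store part of the script would look roughly like the sketch below. This assumes an HBase table named hi with a column family hi_data already exists (e.g. created with create 'hi', 'hi_data' in the HBase shell); the input path is the one from the question:

```pig
-- No REGISTER statements: with a matching Pig/HBase install, letting Pig
-- pick up the jars from HBASE_HOME / PIG_CLASSPATH avoids mixing versions.

-- Load the comma-separated input from HDFS.
STOCK_2008 = LOAD 'hdfs:/user/file.txt' USING PigStorage(',')
             AS (no:int, name:chararray, digit:int);

-- Note the double slash: 'hbase://hi', not 'hbase:/hi'.
-- HBaseStorage uses the first field (no) as the row key and maps the
-- remaining fields to the listed columns in order.
STORE STOCK_2008 INTO 'hbase://hi'
      USING org.apache.pig.backend.hadoop.hbase.HBaseStorage(
          'hi_data:name hi_data:digit');
```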

So I am still facing the same problem. Could this be caused by some classpath variable issue?

Register only the HBase and ZooKeeper jars for the specific versions you are actually using.

I am now using Pig 0.14, and I no longer face this problem, but I am hitting another one:

2014-12-31 11:59:18,838 [main] INFO  org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper - Process identifier=hconnection-0x53c4ecfd connecting to ZooKeeper ensemble=localhost:2181
2014-12-31 11:59:18,842 [main-SendThread(localhost:2181)] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)

Please post your hbase-site.xml, or check the value of the hbase.zookeeper.quorum property in hbase-site.xml to see where ZooKeeper is running. You must list the ZooKeeper server hosts in the hbase.zookeeper.quorum property.
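For reference, the relevant property in hbase-site.xml looks like the fragment below. The hostnames are placeholders; replace them with the hosts actually running your ZooKeeper quorum:

```xml
<configuration>
  <!-- Comma-separated list of ZooKeeper quorum hosts. The default is
       localhost, which only works for a single-node local setup. -->
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>zk1.example.com,zk2.example.com,zk3.example.com</value>
  </property>
  <!-- Client port ZooKeeper listens on (default 2181). -->
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
</configuration>
```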