
Hadoop: Error when storing into HBase using Pig


HDFS input data directory:

[ituser1@genome-dev3 ~]$ hadoop fs -cat FOR_COPY/COMPETITOR_BROKERING/part-r-00000 | head -1
returns:

836646827,1000.0,2016-02-20,34,CAPITAL BOOK,POS/CAPITAL BOOK/NEW DELHI/200216/14:18,BOOKS AND STATIONERY,5497519004453567/41043516,MARRIED,M,SALARIED,D,5942,1
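
A quick sanity check on the input (my own addition, using the same path as above): the record should split into 14 comma-separated fields, matching the 14-field schema in the LOAD statement below.

# Count comma-separated fields in the first record; should print 14.
hadoop fs -cat FOR_COPY/COMPETITOR_BROKERING/part-r-00000 | head -1 | awk -F',' '{print NF}'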
My Pig code:

DATA = LOAD 'FOR_COPY/COMPETITOR_BROKERING' USING PigStorage(',') AS (CUST_ID:chararray,TXN_AMT:chararray,TXN_DATE:chararray,AGE_CASA:chararray,MERCH_NAME:chararray,TXN_PARTICULARS:chararray,MCC_CATEGORY:chararray,TXN_REMARKS:chararray,MARITAL_STATUS_CASA:chararray,GENDER_CASA:chararray,OCCUPATION_CAT_V2_NEW:chararray,DR_CR:chararray,MCC_CODE:chararray,OCCURANCE:int);

DATA_FIL = FOREACH DATA GENERATE                
                (chararray)CUST_ID AS CUST_ID,
                (chararray)TXN_AMT AS TXN_AMT,
                (chararray)TXN_DATE AS TXN_DATE,
                (chararray)AGE_CASA AS AGE_CASA,
                (chararray)MERCH_NAME AS MERCH_NAME,
                (chararray)TXN_PARTICULARS AS TXN_PARTICULARS,
                (chararray)MCC_CATEGORY AS MCC_CATEGORY,
                (chararray)TXN_REMARKS AS TXN_REMARKS,
                (chararray)MARITAL_STATUS_CASA AS MARITAL_STATUS_CASA,
                (chararray)GENDER_CASA AS GENDER_CASA,
                (chararray)OCCUPATION_CAT_V2_NEW AS OCCUPATION_CAT_V2_NEW,
                (chararray)DR_CR AS DR_CR,
                (chararray)MCC_CODE AS MCC_CODE;

STORE DATA_FIL INTO 'hbase://TXN_EVENTS' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage ('DETAILS:CUST_ID DETAILS:TXN_AMT DETAILS:TXN_DATE DETAILS:AGE_CASA DETAILS:MERCH_NAME DETAILS:TXN_PARTICULARS DETAILS:MCC_CATEGORY DETAILS:TXN_REMARKS DETAILS:MARITAL_STATUS_CASA DETAILS:GENDER_CASA DETAILS:OCCUPATION_CAT_V2_NEW DETAILS:DR_CR DETAILS:MCC_CODE');
But it fails with this error:

ERROR org.apache.pig.tools.grunt.GruntParser - ERROR 2244: Job job_1457792710587_0100 failed, hadoop does not return any error message
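
ERROR 2244 is Pig's generic wrapper; the real exception usually sits in the YARN container logs. One way to retrieve them, assuming log aggregation is enabled (the application ID is the job ID with the job_ prefix swapped for application_):

# Pull the aggregated container logs for the failed job.
yarn logs -applicationId application_1457792710587_0100 | less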
But my LOAD works fine:

HDATA = LOAD 'hbase://TXN_EVENTS'
       USING org.apache.pig.backend.hadoop.hbase.HBaseStorage(
       'DETAILS:CUST_ID DETAILS:TXN_AMT DETAILS:TXN_DATE DETAILS:AGE_CASA DETAILS:MERCH_NAME DETAILS:TXN_PARTICULARS DETAILS:MCC_CATEGORY DETAILS:TXN_REMARKS DETAILS:MARITAL_STATUS_CASA DETAILS:GENDER_CASA DETAILS:OCCUPATION_CAT_V2_NEW DETAILS:DR_CR DETAILS:MCC_CODE','-loadKey true' )
       AS (ROWKEY:chararray,CUST_ID:chararray,TXN_AMT:chararray,TXN_DATE:chararray,AGE_CASA:chararray,MERCH_NAME:chararray,TXN_PARTICULARS:chararray,MCC_CATEGORY:chararray,TXN_REMARKS:chararray,MARITAL_STATUS_CASA:chararray,GENDER_CASA:chararray,OCCUPATION_CAT_V2_NEW:chararray,DR_CR:chararray,MCC_CODE:chararray);
DUMP HDATA; (this gives perfect results)

Thanks for your help.

I am using the Hortonworks stack in distributed mode:

HDP 2.3, Apache Pig 0.15.0, HBase 1.1.1
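
One thing worth ruling out at this point: for the STORE, Pig needs the HBase client jars and hbase-site.xml on its classpath. A sketch of how to supply them on an HDP node (the config path and script name are assumptions for illustration):

# Put the HBase config and client jars on Pig's classpath before launching.
export PIG_CLASSPATH=/etc/hbase/conf:$(hbase classpath)
pig -x mapreduce store_txn_events.pig   # store_txn_events.pig is a hypothetical script name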


Also, all the jars are installed through Ambari.

Solved the data upload issue:

Since I was not supplying a row key field, I RANK the relation so that the rank becomes the HBase row key:

DATA_FIL_1 = RANK DATA_FIL_2;

Note: this generates arbitrary row keys.
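
Pieced together with the relations above, the fix looks like this (a minimal sketch; DATA_RANKED is a name I chose, and it assumes the TXN_EVENTS table with its DETAILS column family already exists):

-- RANK prepends a rank field to every tuple; HBaseStorage then consumes
-- that first field as the row key, so the remaining 13 fields line up
-- with the 13 DETAILS:* column mappings.
DATA_RANKED = RANK DATA_FIL;
STORE DATA_RANKED INTO 'hbase://TXN_EVENTS' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('DETAILS:CUST_ID DETAILS:TXN_AMT DETAILS:TXN_DATE DETAILS:AGE_CASA DETAILS:MERCH_NAME DETAILS:TXN_PARTICULARS DETAILS:MCC_CATEGORY DETAILS:TXN_REMARKS DETAILS:MARITAL_STATUS_CASA DETAILS:GENDER_CASA DETAILS:OCCUPATION_CAT_V2_NEW DETAILS:DR_CR DETAILS:MCC_CODE');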

But if you want to define the row key yourself, do it like this:

You have to create another relation; the STORE function alone will not do it. HBaseStorage takes the first field of each tuple as the row key (which you have now defined yourself):


STORE DATA INTO 'hbase://genome:event_sink' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('event_data:CUST_ID event_data:EVENT transaction_data:TXN_AMT transaction_data:TXN_DATE transaction_data:AGE_CASA transaction_data:MERCH_NAME transaction_data:TXN_PARTICULARS transaction_data:MCC_CATEGORY transaction_data:TXN_REMARKS transaction_data:MARITAL_STATUS_CASA transaction_data:GENDER_CASA transaction_data:OCCUPATION_CAT_V2_NEW transaction_data:DR_CR transaction_data:MCC_CODE');
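
As a sanity check (namespace and column families taken from the statement above): the target table must exist before the STORE runs, and a scan confirms the write. From the HBase shell:

# Run once: create the table with both column families.
create 'genome:event_sink', 'event_data', 'transaction_data'
# After the Pig job completes, inspect a single row.
scan 'genome:event_sink', {LIMIT => 1}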

Can you check the logs and post them?

org.apache.pig.backend.executionengine.ExecException: ERROR 2244: Job job_1457792710587_0105 failed, hadoop does not return any error message
    at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:179)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:234)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205)
    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
    at org.apache.pig.Main.run(Main.java:631)
    at org.apache.pig.Main.main(Main.java:177)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
2016-03-01,1,20.0,2016-03-22,27,test_merch,test/particulars,test_category,test/remarks,married,M,service,D,1234