How to store Pig output in a Hive table?


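I have an HDInsight cluster on Azure, with .csv files in HDFS (Azure Storage).

I want to process these files with Apache Pig and store the output in a Hive table. To do this, I wrote the following script: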

A = LOAD '/test/input/t12007.csv' USING PigStorage(',') AS (year:chararray,ArrTime:chararray,DeptTime:chararray);
describe A;
dump A;
store A into 'testdb.tbl3' using org.apache.hive.hcatalog.pig.HCatStorer();
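The script loads the file successfully, describes the schema, and displays the data with dump, but it throws the following error when it reaches the store command: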

2017-05-02 06:18:41,476 [main] ERROR org.apache.pig.PigServer - exception during parsing: Error during parsing. Could not resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.]
Failed to parse: Pig script failed to parse: <file script.pig, line 4, column 33> pig script failed to validate: org.apache.pig.backend.executionengine.ExecException: ERROR 1070: Could not resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.] 
Caused by: <file script.pig, line 4, column 33> pig script failed to validate: org.apache.pig.backend.executionengine.ExecException: ERROR 1070: Could not resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.] 
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 1070: Could not resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.] 
2017-05-02 06:18:41,484 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1070: Could not resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.] 

pig -useHCatalog

From the documentation:

Running Pig with HCatalog

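Pig does not automatically pick up HCatalog jars. To bring in the necessary jars, you can either use a flag in the pig command or set the environment variables PIG_CLASSPATH and PIG_OPTS, as described below. To bring in the appropriate jars for working with HCatalog, simply include the following flag when invoking pig: -useHCatalog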
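As a rough sketch of what passing the flag on the command line looks like: `run_pig_hcat` and the `DRY_RUN` variable are hypothetical names introduced here for illustration, and `script.pig` is the script name taken from the error log. Setting `DRY_RUN=1` only prints the command, which can help when checking what a scheduler or UI is actually invoking.

```shell
#!/bin/sh
# Sketch only: run a Pig script with the -useHCatalog flag so that the
# HCatalog jars are put on Pig's classpath.
# run_pig_hcat and DRY_RUN are hypothetical names, not part of Pig.
run_pig_hcat() {
    script="$1"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Print the command instead of executing it.
        echo "pig -useHCatalog $script"
    else
        pig -useHCatalog "$script"
    fi
}

# Show the command that would be run for the script from the question.
DRY_RUN=1 run_pig_hcat script.pig
```

Without `DRY_RUN=1`, the function runs Pig directly and therefore needs the `pig` executable on the PATH.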

Alternative:

Find the location of the HCatalog jar and add a REGISTER statement with the jar path at the top of the script, as follows:

REGISTER /usr/username/client/lib/hive-hcatalog-core-1.2.1.2.3.0.0-2557.jar;
Your path may differ depending on the installation on your cluster. You can find the jar location with the following command:
locate *hcatalog-core*
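If `locate` is not installed or its database is out of date, a search with `find` is one possible fallback. This is a minimal sketch: `find_hcat_jar` is a hypothetical helper name, and the default search root `/usr` is an assumption that may not match your cluster layout.

```shell
#!/bin/sh
# Sketch only: list hcatalog-core jars below a directory.
# find_hcat_jar is a hypothetical helper; /usr is an assumed default root.
find_hcat_jar() {
    root="${1:-/usr}"
    # Permission errors are discarded so that only matching paths are printed.
    find "$root" -type f -name '*hcatalog-core*.jar' 2>/dev/null
}
```

Calling `find_hcat_jar /usr` prints every matching jar path, one per line; the path you pick can then be used in the REGISTER statement.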

HCatStorer

HCatStorer is used with Pig scripts to write data to HCatalog-managed tables.

Usage

HCatStorer is accessed via a Pig store statement.

STORE A INTO 'tablename'
   USING org.apache.hive.hcatalog.pig.HCatStorer();

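Try running the script with: pig -useHCatalog your_scriptname.pig

I am executing the script in Ambari. I added -useHCatalog as an argument when executing the script. It still gives the error.

Does your Hive table exist? If not, you need to create the table first.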
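Following up on the last comment, here is one way the target table might be created before the store runs. The column names and STRING types are assumptions derived from the Pig schema in the question (three chararray fields), the file name `create_tbl3.hql` is made up for illustration, and the sketch assumes the `hive` command-line client is available on the node.

```shell
#!/bin/sh
# Sketch only: create testdb.tbl3 before running STORE ... HCatStorer().
# Column names and types are assumptions based on the Pig schema in the
# question (year, ArrTime, DeptTime as chararray), written in lower case.
hql_file="create_tbl3.hql"   # hypothetical file name

cat > "$hql_file" <<'EOF'
CREATE DATABASE IF NOT EXISTS testdb;
CREATE TABLE IF NOT EXISTS testdb.tbl3 (
  year     STRING,
  arrtime  STRING,
  depttime STRING
);
EOF

# Run the DDL only where the hive CLI is actually installed.
if command -v hive >/dev/null 2>&1; then
    hive -f "$hql_file"
else
    echo "hive CLI not found; run $hql_file on the cluster" >&2
fi
```

Once the table exists, the store statement from the question can be retried with the HCatalog jars on the classpath.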