How do I store Pig output in a Hive table?
I have an HDInsight cluster on Azure, with .csv files in HDFS (Azure Storage).
I want to process these files with Apache Pig and store the output in a Hive table. To do this, I wrote the following script:
A = LOAD '/test/input/t12007.csv' USING PigStorage(',') AS (year:chararray,ArrTime:chararray,DeptTime:chararray);
describe A;
dump A;
store A into 'testdb.tbl3' using org.apache.hive.hcatalog.pig.HCatStorer();
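This script loads the file successfully, describes the schema, and displays the data with dump, but it throws the following error when it executes the store command: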
2017-05-02 06:18:41,476 [main] ERROR org.apache.pig.PigServer - exception during parsing: Error during parsing. Could not resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.]
Failed to parse: Pig script failed to parse: <file script.pig, line 4, column 33> pig script failed to validate: org.apache.pig.backend.executionengine.ExecException: ERROR 1070: Could not resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.]
Caused by: <file script.pig, line 4, column 33> pig script failed to validate: org.apache.pig.backend.executionengine.ExecException: ERROR 1070: Could not resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.]
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 1070: Could not resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.]
2017-05-02 06:18:41,484 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1070: Could not resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.]
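Using HCatalog from Pig
Running Pig with HCatalog
Pig does not automatically pick up the HCatalog jars. To bring in the necessary jars, you can either use a flag on the pig command line or set the environment variables PIG_CLASSPATH and PIG_OPTS. To bring in the appropriate jars for working with HCatalog, simply pass the -useHCatalog flag when you invoke pig.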
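As a rough sketch of the flag-based approach, the commands below save a trimmed version of the question's script to a file and launch it with -useHCatalog. The file name script.pig is only illustrative, and the field aliases are written in lowercase on the assumption that they should line up with Hive's lowercase column names. The pig call is guarded so the snippet does nothing harmful on a machine without Pig.

```shell
# Save the Pig script to a file (the file name is illustrative).
# Aliases are lowercase to match Hive's lowercase column names (an assumption
# about the target table, not something stated in the question).
cat > script.pig <<'EOF'
A = LOAD '/test/input/t12007.csv' USING PigStorage(',') AS (year:chararray, arrtime:chararray, depttime:chararray);
STORE A INTO 'testdb.tbl3' USING org.apache.hive.hcatalog.pig.HCatStorer();
EOF

# Launch Pig with the HCatalog jars on its classpath, but only where pig exists.
if command -v pig >/dev/null 2>&1; then
  pig -useHCatalog script.pig
else
  echo "pig not found on PATH; would run: pig -useHCatalog script.pig"
fi
```

Note that -useHCatalog is an option of the pig launcher itself, so it belongs on the command line rather than inside the script.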
Alternative:
Find the location of the HCatalog jar and add a REGISTER statement with the jar path at the top of your script, as follows:
REGISTER /usr/username/client/lib/hive-hcatalog-core-1.2.1.2.3.0.0-2557.jar;
Depending on how your cluster is installed, your path may differ. You can find the jar's location with the command: locate *hcatalog-core*
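If locate is not installed on the node, find can be swapped in to do the same job. The sketch below searches for the jar and prints a ready-to-paste REGISTER line. SEARCH_ROOT and the register_line helper are hypothetical names introduced here, and /usr is only a guess at where the jar lives on your cluster.

```shell
# Build the REGISTER statement for a given jar path (hypothetical helper).
register_line() {
  printf 'REGISTER %s;\n' "$1"
}

# SEARCH_ROOT is an assumed starting directory; adjust it for your cluster.
SEARCH_ROOT="${SEARCH_ROOT:-/usr}"

# Take the first matching jar; permission errors from find are ignored.
JAR="$(find "$SEARCH_ROOT" -name 'hive-hcatalog-core*.jar' 2>/dev/null | head -n 1)"

if [ -n "$JAR" ]; then
  register_line "$JAR"
else
  echo "no hive-hcatalog-core jar found under $SEARCH_ROOT" >&2
fi
```

The printed line can then be pasted at the top of the Pig script, before the LOAD statement.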
HCatStorer
HCatStorer is used with Pig scripts to write data to HCatalog-managed tables.
Usage
HCatStorer is accessed via a Pig store statement:
STORE A INTO 'tablename'
USING org.apache.hive.hcatalog.pig.HCatStorer();
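Comments:
Try running the script with: pig -useHCatalog your_scriptname.pig
I am executing the script in Ambari. I added -useHCatalog as an argument when executing the script, and it still gives the error.
Does your Hive table exist? If not, you need to create the table first.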
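Following up on the last comment, here is a minimal sketch of creating the target table before running the Pig script. The database and table names come from the question; the lowercase column names and STRING types are assumptions chosen to mirror the chararray fields in the LOAD statement. The hive call is guarded so the snippet only prints the DDL where the Hive CLI is unavailable.

```shell
# DDL for the target table. Column names and STRING types are assumptions
# that mirror the three chararray fields loaded by the Pig script.
DDL='CREATE DATABASE IF NOT EXISTS testdb;
CREATE TABLE IF NOT EXISTS testdb.tbl3 (year STRING, arrtime STRING, depttime STRING);'

# Run the DDL through the Hive CLI when it is available; otherwise just show it.
if command -v hive >/dev/null 2>&1; then
  hive -e "$DDL"
else
  echo "hive not found on PATH; DDL to run:"
  echo "$DDL"
fi
```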