Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask


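I am getting a strange error while writing data to an S3 bucket. It does not happen often, so we have not been able to figure out what the problem is. FYI, I keep the EMR configuration the same on every run. Also, the folder in the S3 bucket is not write-protected.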

    insert overwrite directory 's3://logs/apr'
select f.cookie,sum(f.pgvw) as pageview, count(distinct(f.cookie)) as visits from
(
SELECT a.cookie,a.session,count(distinct(a.date_time)) as pgvw from
(
SELECT extcookie(cs_Cookie) as cookie,extsession(cs_cookie) as session,concat(logdate,' ',time) as date_time FROM apr_1 where (uri like '%.aspx%' or uri like '%.html%') and (not(uri like '/lts%')) 
and (extcookie(cs_Cookie)!='-' and extcookie(cs_Cookie)!='  ')
and (extsession(cs_Cookie)!='-' and extsession(cs_Cookie)!='  ')
group by extcookie(cs_cookie),extsession(cs_cookie),logdate,time
)a
group by a.cookie,a.session
)f
where f.pgvw>1
group by f.cookie;

Log of the failed job:

FINISH_TIME="1373886754825" HOSTNAME="10\.144\.95\.241" ERROR="java\.lang\.RuntimeException: org\.apache\.hadoop\.hive\.ql\.metadata\.HiveException: Hive Runtime Error while processing row {\"_col0\":\"CWC\=4LD8UplOIb7rD5X3uinVAWd7H\",\"_col1\":7,\"_col2\":1}
        at org\.apache\.hadoop\.hive\.ql\.exec\.ExecMapper\.map(ExecMapper\.java:166)
        at org\.apache\.hadoop\.mapred\.MapRunner\.run(MapRunner\.java:50)
        at org\.apache\.hadoop\.mapred\.MapTask\.runOldMapper(MapTask\.java:441)
        at org\.apache\.hadoop\.mapred\.MapTask\.run(MapTask\.java:377)
        at org\.apache\.hadoop\.mapred\.Child$4\.run(Child\.java:255)
        at java\.security\.AccessController\.doPrivileged(Native Method)
        at javax\.security\.auth\.Subject\.doAs(Subject\.java:396)
        at org\.apache\.hadoop\.security\.UserGroupInformation\.doAs(UserGroupInformation\.java:1132)
        at org\.apache\.hadoop\.mapred\.Child\.main(Child\.java:249)
    Caused by: org\.apache\.hadoop\.hive\.ql\.metadata\.HiveException: Hive Runtime Error while processing row {\"_col0\":\"CWC\=4LD8UplOIb7rD5X3uinVAWd7H\",\"_col1\":7,\"_col2\":1}
        at org\.apache\.hadoop\.hive\.ql\.exec\.MapOperator\.process(MapOperator\.java:550)
        at org\.apache\.hadoop\.hive\.ql\.exec\.ExecMapper\.map(ExecMapper\.java:148)
        \.\.\. 8 more
    Caused by: java\.lang\.IndexOutOfBoundsException: Index: 1, Size: 1
        at java\.util\.ArrayList\.RangeCheck(ArrayList\.java:547)
        at java\.util\.ArrayList\.get(ArrayList\.java:322)
        at org\.apache\.hadoop\.hive\.serde2\.objectinspector\.StandardStructObjectInspector\.init(StandardStructObjectInspector\.java:118)
        at org\.apache\.hadoop\.hive\.serde2\.objectinspector\.StandardStructObjectInspector\.<init>(StandardStructObjectInspector\.java:106)
        at org\.apache\.hadoop\.hive\.serde2\.objectinspector\.ObjectInspectorFactory\.getStandardStructObjectInspector(ObjectInspectorFactory\.java:274)
        at org\.apache\.hadoop\.hive\.serde2\.objectinspector\.ObjectInspectorFactory\.getStandardStructObjectInspector(ObjectInspectorFactory\.java:259)
        at org\.apache\.hadoop\.hive\.ql\.exec\.ReduceSinkOperator\.initEvaluatorsAndReturnStruct(ReduceSinkOperator\.java:188)
        at org\.apache\.hadoop\.hive\.ql\.exec\.ReduceSinkOperator\.processOp(ReduceSinkOperator\.java:197)
        at org\.apache\.hadoop\.hive\.ql\.exec\.Operator\.process(Operator\.java:471)
        at org\.apache\.hadoop\.hive\.ql\.exec\.Operator\.forward(Operator\.java:762)
        at org\.apache\.hadoop\.hive\.ql\.exec\.TableScanOperator\.processOp(TableScanOperator\.java:83)
        at org\.apache\.hadoop\.hive\.ql\.exec\.Operator\.process(Operator\.java:471)
        at org\.apache\.hadoop\.hive\.ql\.exec\.Operator\.forward(Operator\.java:762)
        at org\.apache\.hadoop\.hive\.ql\.exec\.MapOperator\.process(MapOperator\.java:531)
        \.\.\. 9 more

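If the SELECT or GROUP BY clause contains two or more fields or aggregates that differ only in case (for example, extsession(cs_Cookie) and extsession(cs_cookie)), the optimizer will try to merge the fields even though ExecMapper does not. This causes the error you are seeing.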


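You can confirm or rule this out as the cause by converting all instances of "cs_Cookie" to "cs_cookie" (or vice versa) and running the same query again. If the error no longer occurs, it was caused by this optimization issue.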
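
Before rerunning the job, it may help to list every name in the query that is spelled with more than one casing. The following is only a rough sketch in Python, not part of Hive or EMR: the helper name `mixed_case_identifiers` and the shortened sample query are made up for illustration, and the check is a plain regular-expression scan rather than a real HiveQL parser, so it will also flag keywords if they are written as both `select` and `SELECT`.

```python
import re
from collections import defaultdict


def mixed_case_identifiers(query):
    """Return names that appear in the query with more than one casing.

    Hypothetical helper for illustration; it tokenizes with a regex and
    does not understand HiveQL grammar.
    """
    # Blank out single-quoted string literals so their contents are ignored.
    without_literals = re.sub(r"'[^']*'", "''", query)
    variants = defaultdict(set)
    for token in re.findall(r"[A-Za-z_][A-Za-z0-9_]*", without_literals):
        variants[token.lower()].add(token)
    # Keep only the names that were seen with at least two spellings.
    return {name: spellings for name, spellings in variants.items()
            if len(spellings) > 1}


# Shortened stand-in for the query in the question (assumption: same
# column and UDF names, most clauses omitted).
query = """
SELECT extcookie(cs_Cookie) as cookie, extsession(cs_cookie) as session
FROM apr_1
GROUP BY extcookie(cs_cookie), extsession(cs_cookie)
"""

for name, spellings in mixed_case_identifiers(query).items():
    print(name, sorted(spellings))
```

For the sample above, the only name reported is `cs_cookie`, with the spellings `cs_Cookie` and `cs_cookie`. Picking one spelling and using it everywhere in the query is the change suggested in the answer.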
