Hadoop: Hive query fails on Tez but succeeds on MapReduce when connecting from Beeline
I've run into a strange error. I am running a simple select * query with a where clause, and here is a summary of the query's execution status:

1. Connecting to Hive from EMR (Tez engine): succeeds
2. Connecting to Hive from EMR (MR engine): succeeds
3. Connecting to Hive from Beeline (Tez engine): fails
4. Connecting to Hive from Beeline (MR engine): succeeds

I need to resolve the third case. Below is the error trace I am getting; I cannot find the root cause of this failure or work out what the log is trying to convey.
```
at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:380)
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:257)
at org.apache.hive.service.cli.operation.SQLOperation.access$800(SQLOperation.java:91)
at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:348)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1840)
at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:362)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)' SQL<select `ID`, `ISDELETED`, `ACCOUNTID`, `CREATEDBYID`, `CREATEDDATE`, `FIELD`, `OLDVALUE`, `NEWVALUE`, `AUDIT_UPD_TS`, `SRC_OP_TYP`, `GG_INGEST_TS` from `t4i_ent_sfdc_b2b_psa`.`sf_accounthistory` x WHERE SRC_OP_TYP='NA'>
```
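I was able to resolve this issue. The problem was that I was connecting the application to Hive through JDBC without specifying a user. Queries that only needed a simple data fetch succeeded, but when a Map Reduce job was triggered to write to HDFS, the write operation failed with this error: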
Failed to execute tez graph.
org.apache.hadoop.security.AccessControlException: Permission denied: user=anonymous, access=WRITE, inode="/user":hdfs:hadoop:drwxr-xr-x
To fix this, I added user=hadoop; to the JDBC URL, and the query now runs fine.

Try invoking beeline (Tez engine) as shown below, then run the query:
beeline -u "jdbc:hive2://<host>:<port>,/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2-batch?tez.queue.name=<yarn-queue-name>"
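Hope this helps.

Can you share the beeline command you use to invoke the Hive (beeline) session? That would help in understanding whether anything is missing.

Hi @AjayAhuja, I have resolved the issue and have now posted the answer.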
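The fix from the first answer, adding `user=hadoop;` to the JDBC URL, can be sketched for applications that build their connection string in code. This is only a minimal sketch, not the answerer's actual code: the helper name `with_hive_user` and the host `example-host` are hypothetical, and it assumes the usual Hive JDBC URL layout in which `;`-separated session variables come before any `?` or `#` section.

```python
def with_hive_user(jdbc_url: str, user: str) -> str:
    """Return jdbc_url with a `user=<user>` session variable added.

    Assumed URL layout (hypothetical helper, not part of any Hive library):
        jdbc:hive2://<hosts>/<db>;<session vars>?<hive conf>#<hive vars>
    """
    # Find where the optional '?hive conf' / '#hive vars' tail begins.
    cut = len(jdbc_url)
    for marker in ("?", "#"):
        pos = jdbc_url.find(marker)
        if pos != -1:
            cut = min(cut, pos)
    head, tail = jdbc_url[:cut], jdbc_url[cut:]

    # Leave the URL untouched if a user is already set as a session variable.
    session_vars = head.split(";")[1:]
    if any(var.startswith("user=") for var in session_vars):
        return jdbc_url

    # Append the user to the session variables, ahead of the tail.
    return f"{head.rstrip(';')};user={user}{tail}"


if __name__ == "__main__":
    # 'example-host' and port 10000 are placeholder values for illustration.
    print(with_hive_user("jdbc:hive2://example-host:10000/default", "hadoop"))
```

Placing `user=` ahead of the `?` keeps it among the session variables rather than the Hive configuration overrides. From Beeline, passing the user name with the `-n` option should have a similar effect.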