Hive: execution error when the "where" condition contains a subquery

Tags: hive, subquery, hiveql, hue

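I have two tables. Table1 is large and Table2 is small. I want to extract the rows from Table1 whose Table1.column1 value matches a value in Table2.column1. Both Table1 and Table2 have the column column1. Here is my code: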

select *
from Table1
where condition1
and condition2
and column1 in (select column1 from Table2)
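condition1 and condition2 are meant to limit how much of the table has to be read. I am not sure whether this actually works. When I ran the query I got an execution error, return code 1. I am on the Hue platform.

Edit

Following @yammanuarun's suggestion, I tried the following code: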

SELECT *
FROM
  (SELECT *
   FROM Table1
   WHERE condition1
     AND condition2) t1
INNER JOIN Table2 t2 ON t1.column1 = t2.column1
Then I got the following error:

Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Application application_1580875150091_97539 failed 2 times due to AM Container for appattempt_1580875150091_97539_000002 exited with exitCode: 255 Failing this attempt.Diagnostics: [2020-02-07 14:35:53.944]Exception from container-launch.
Container id: container_e1237_1580875150091_97539_02_000001
Exit code: 255
Exception message: Launch container failed
Shell output: main : command provided 1
main : run as user is hive
main : requested yarn user is hive
Getting exit code file...
Creating script paths...
Writing pid file...
Writing to tmp file /disk-11/hadoop/yarn/local/nmPrivate/application_1580875150091_97539/container_e1237_1580875150091_97539_02_000001/container_e1237_1580875150091_97539_02_000001.pid.tmp
Writing to cgroup task files...
Creating local dirs...
Launching container...
Getting exit code file...
Creating script paths...
[2020-02-07 14:35:53.967]Container exited with a non-zero exit code 255. Error file: prelaunch.err. Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "IPC Server idle connection scanner for port 26888"
Halting due to Out Of Memory Error...
Halting due to Out Of Memory Error...
Halting due to Out Of Memory Error...
Halting due to Out Of Memory Error...
Halting due to Out Of Memory Error...
Halting due to Out Of Memory Error...
Halting due to Out Of Memory Error...
Halting due to Out Of Memory Error...
[2020-02-07 14:35:53.967]Container exited with a non-zero exit code 255. Error file: prelaunch.err. Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "IPC Server idle connection scanner for port 26888"
Halting due to Out Of Memory Error...
Halting due to Out Of Memory Error...
Halting due to Out Of Memory Error...
Halting due to Out Of Memory Error...
Halting due to Out Of Memory Error...
Halting due to Out Of Memory Error...
Halting due to Out Of Memory Error...
Halting due to Out Of Memory Error...
For more detailed output, check the application tracking page: http://dcwipphm12002.edc.nam.gm.com:8088/cluster/app/application_1580875150091_97539 Then click on links to logs of each attempt. . Failing the application.

It looks like a memory error. Is there any way to optimize my query?
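
The log shows the Tez ApplicationMaster (AM) container dying with java.lang.OutOfMemoryError, so one thing that may be worth trying is giving the AM and the task containers more memory before running the query. The sketch below uses standard Hive-on-Tez properties, but the values are assumptions chosen only for illustration and have to fit within the cluster's YARN limits. The AM setting may only take effect for a newly started Tez session, and on a shared Hue deployment an administrator may need to change it.

```sql
-- Illustrative values only (assumptions); tune them to the cluster's YARN limits.
SET tez.am.resource.memory.mb=4096;  -- memory (MB) requested for the Tez ApplicationMaster container
SET hive.tez.container.size=4096;    -- memory (MB) requested for each Tez task container
SET hive.tez.java.opts=-Xmx3276m;    -- JVM heap for task containers, roughly 80% of the container size
```

If the failure persists with larger containers, the cause is more likely on the cluster side than in the query text, and the container logs behind the tracking URL are the place to look.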

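1. If you have access, check the actual error message in the job tracker and post it here. You should have the application number from when the job ran. The job's return code alone won't help much. You can also find the job tracker URL in the log, printed just above the return code error. 2. Apart from that, would you be fine with using a join?
select count(*) from Table1 t1 where condition1 and condition2 inner join Table2 t2 on t1.column1=t2.column1;
Since you only need the records that match in both tables under the given conditions, I used an inner join.

@Yammanuarun Thanks for the tip. Can you tell me how to see the actual error message? I have the application ID. Also, the inner join code doesn't work. It looks like I can't join after the where condition.

The query I gave in the comment above was wrong. The conditions should come after the join. With proper aliases, both of these queries work fine for me:
select count(*) from Table1 t1 inner join Table2 t2 on t1.column1=t2.column1 where condition1 and condition2;
select count(*) from Table1 t1 where condition1 and condition2 and t1.column1 in (select t2.column1 from Table2 t2);

The log of the Hive query job is printed just above the query history/results, that is, where the results are shown as text or as a chart. In that log you will find job information such as the job (for example, job 124353131534), the Kill Command for terminating the Hadoop job, and the Tracking URL for following the job. Copy the tracking URL and paste it into a browser to see the map reduce logs, then look at the logs of the failed mapper/reducer.

@Yammanuarun Thanks. Table1 is very large and is constantly receiving data, so I don't think we can join first and apply the where condition afterwards. I tried another inner join (see the edited part of the question), but I still get an error.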
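
The requirement discussed above (keep the Table1 rows whose column1 also appears in Table2) can also be written as a semi join, which Hive supports through LEFT SEMI JOIN. The sketch below is only an illustration: Table1, Table2, column1, condition1 and condition2 are the placeholders from the question, and condition1 and condition2 are assumed to reference Table1 columns only. Unlike an inner join, a semi join returns each matching Table1 row once even if Table2.column1 contains duplicates, and it returns only Table1's columns.

```sql
-- Sketch only: all table, column and condition names are placeholders from the question.
-- Table2 is small, so letting Hive convert the join to a map join may help (often already the default).
SET hive.auto.convert.join=true;

SELECT t1.*
FROM Table1 t1
LEFT SEMI JOIN Table2 t2
  ON t1.column1 = t2.column1
WHERE condition1   -- assumed to reference Table1 columns only
  AND condition2;
```

Writing the filters after the join does not necessarily mean that Hive joins the whole of Table1 first: with predicate pushdown enabled, filters on Table1 are normally applied while Table1 is scanned. Running EXPLAIN on the query is a way to check where the filter actually lands in the plan.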