Running a query takes a long time. Is there any way to simplify it? (sql, hive, hue)

I'm running the query below, and it takes far too long to run. Is there any way to make it faster?

SELECT
  a.data_date as day
, sum(a.column1) + sum(a.column2) as total
, sum(a.column1) as part1
, sum(a.column2) as part2
, sum(b.column1) as alien
FROM table1 a
INNER JOIN table1 b
ON a.data_date = b.data_date AND a.column3 = b.column3
WHERE a.data_date = '20131001'
and a.column3 = 12345
and a.column4 is not NULL
and b.column4 is NULL
GROUP BY
  a.data_date

As far as I can tell, you don't need the JOIN at all.

You can get the same result with a single reference to the table.

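Since both sides are the same table, I believe you can drop the join. It would be best to post some sample data and the result you expect, so that we can help you better. Cheers =)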
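
One way to read the single-reference suggestion is conditional aggregation over one scan of table1. The following is only a sketch, and it assumes the figures wanted are plain per-day sums. The original self-join repeats every a row once per matching b row (and vice versa), so its sums are multiplied by the row count of the other side; this version deliberately does not reproduce that multiplication. It also returns a row when only one of the two groups has data, where the inner join would return none.

```sql
-- Sketch only: one scan of table1 instead of a self-join.
-- Assumes part1/part2 should cover rows where column4 IS NOT NULL
-- and alien should cover rows where column4 IS NULL.
SELECT
  data_date AS day
, sum(CASE WHEN column4 IS NOT NULL THEN column1 END)
  + sum(CASE WHEN column4 IS NOT NULL THEN column2 END) AS total
, sum(CASE WHEN column4 IS NOT NULL THEN column1 END) AS part1
, sum(CASE WHEN column4 IS NOT NULL THEN column2 END) AS part2
, sum(CASE WHEN column4 IS NULL THEN column1 END) AS alien
FROM table1
WHERE data_date = '20131001'
  AND column3 = 12345
GROUP BY
  data_date
```

A CASE without ELSE yields NULL, and sum() skips NULLs, so each aggregate only sees the rows of its own group. Comparing the output with the original query on a small sample would show whether the join's multiplication was intended.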


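The right optimization also depends on the size of the tables.

The smaller table should come first, and you should try to get that table into the distributed cache (a map-side join).

To make it faster, apply the WHERE conditions before the join rather than after it, so that the join has fewer rows to process.

You can try something like the following: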

-- let Hive turn the join into a map-side join when one input is small enough
set hive.auto.convert.join=true;
select
a.data_date as day
, sum(a.column1) + sum(a.column2) as total
, sum(a.column1) as part1
, sum(a.column2) as part2
, sum(b.column1) as alien
from table1 b
inner join (
-- filter before joining; the alias a only exists outside this subquery,
-- so the columns are referenced without a prefix here
select * from table1
where data_date = '20131001'
and column3 = 12345
and column4 is not NULL
) a
on (a.data_date = b.data_date AND a.column3 = b.column3)
where b.column4 is NULL
GROUP BY
a.data_date

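Creating an index would undoubtedly help. Have you looked at the execution plan? In SQL Server, press Ctrl+L to view it. It tells you where most of the query's resources are spent and where an index could improve it. Keep in mind that it only tells you how to improve this one query, not the whole database.

It looks pretty clean... as long as the self-join produces one row per table1 record (is a.data_date + a.column3 unique?).

The thing is, by using a.column4 IS NOT NULL and b.column4 IS NULL, I get different column1 data from each side, and that is what I'm after.

Something like SELECT a.data_date AS day, sum(a.column1) + sum(a.column2) AS total, sum(a.column1) AS part1, sum(a.column2) AS part2 FROM table1 a WHERE a.data_date = '20131001' AND a.column3 = 12345 AND a.column4 IS NOT NULL GROUP BY a.data_date will give you most of the result; you could then add a subquery for , sum(b.column1) AS alien.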
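
The Ctrl+L shortcut above is specific to SQL Server. Since the question is tagged Hive, a rough counterpart there is to prefix the statement with EXPLAIN, which prints the planned stages instead of running the query. A minimal sketch using the query from the question:

```sql
-- Shows the execution plan (stages and operators) without executing the query.
EXPLAIN
SELECT
  a.data_date AS day
, sum(a.column1) + sum(a.column2) AS total
, sum(a.column1) AS part1
, sum(a.column2) AS part2
, sum(b.column1) AS alien
FROM table1 a
INNER JOIN table1 b
ON a.data_date = b.data_date AND a.column3 = b.column3
WHERE a.data_date = '20131001'
  AND a.column3 = 12345
  AND a.column4 IS NOT NULL
  AND b.column4 IS NULL
GROUP BY
  a.data_date
```

In the output, the join operator shows whether Hive planned a map-side join or a shuffle join, which is a quick way to check whether the hive.auto.convert.join setting took effect.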