Improving the run time of a SQL query

I have the following table structure:

AdPerformance
   id
   ad_id
   impressions

Targeting
  value


AdActions
   app_starts

Ad
  id
  name
  parent_id

AdTargeting
  id
  targeting_
  ad_id

Targeting
  id
  name
  value

AdProduct
  id
  ad_id
  name
I need to aggregate the data, restricted by product name, so I wrote the following query:

 SELECT ad_performance.ad_id, targeting.value AS targeting_value, 
     sum(impressions) AS impressions, 
     sum(app_starts) AS app_starts
 FROM ad_performance
     LEFT JOIN ad on ad.id = ad_performance.ad_id
     LEFT JOIN ad_actions ON ad_performance.id = ad_actions.ad_performance_id
     RIGHT JOIN (
        SELECT ad_id, value from targeting, ad_targeting 
        WHERE targeting.id = ad_targeting.id and targeting.name = 'gender' 
     ) targeting ON targeting.ad_id = ad.parent_id
WHERE ad_performance.ad_id IN 
       (SELECT ad_id FROM ad_product WHERE product = 'iphone')
GROUP BY ad_performance.ad_id, targeting_value
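However, according to ANALYZE, the above query takes about 5 seconds to run against roughly 1M records.

Is there any way to improve it?

I do have indexes on the foreign keys.

Update

Please see the output of ANALYZE: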

                                   QUERY PLAN
--------------------------------------------------------------------------------
 HashAggregate  (cost=5787.28..5789.87 rows=259 width=254) (actual time=3283.763..3286.015 rows=5998 loops=1)
   Group Key: adobject_performance.ad_id, targeting.value
   Buffers: shared hit=3400223
   ->  Nested Loop Left Join  (cost=241.63..5603.63 rows=8162 width=254) (actual time=10.438..2774.664 rows=839720 loops=1)
         Buffers: shared hit=3400223
         ->  Nested Loop  (cost=241.21..1590.52 rows=8162 width=250) (actual time=10.412..703.818 rows=839720 loops=1)
               Join Filter: (adobject.id = adobject_performance.ad_id)
               Buffers: shared hit=36755
               ->  Hash Join  (cost=240.78..323.35 rows=9 width=226) (actual time=10.380..20.332 rows=5998 loops=1)
                     Hash Cond: (ad_product.ad_id = ad.id)
                     Buffers: shared hit=190
                     ->  HashAggregate  (cost=128.98..188.96 rows=5998 width=4) (actual time=3.788..6.821 rows=5998 loops=1)
                           Group Key: ad_product.ad_id
                           Buffers: shared hit=39
                           ->  Seq Scan on ad_product  (cost=0.00..113.99 rows=5998 width=4) (actual time=0.011..1.726 rows=5998 loops=1)
                                 Filter: ((product)::text = 'ft2_iPhone'::text)
                                 Rows Removed by Filter: 1
                                 Buffers: shared hit=39
                     ->  Hash  (cost=111.69..111.69 rows=9 width=222) (actual time=6.578..6.578 rows=5998 loops=1)
                           Buckets: 1024  Batches: 1  Memory Usage: 241kB
                           Buffers: shared hit=151
                           ->  Hash Join  (cost=30.26..111.69 rows=9 width=222) (actual time=0.154..4.660 rows=5998 loops=1)
                                 Hash Cond: (adobject.parent_id = adobject_targeting.ad_id)
                                 Buffers: shared hit=151
                                 ->  Seq Scan on adobject  (cost=0.00..77.97 rows=897 width=8) (actual time=0.009..1.449 rows=6001 loops=1)
                                       Buffers: shared hit=69
                                 ->  Hash  (cost=30.24..30.24 rows=2 width=222) (actual time=0.132..0.132 rows=2 loops=1)
                                       Buckets: 1024  Batches: 1  Memory Usage: 1kB
                                       Buffers: shared hit=82
                                       ->  Nested Loop  (cost=0.15..30.24 rows=2 width=222) (actual time=0.101..0.129 rows=2 loops=1)
                                             Buffers: shared hit=82
                                             ->  Seq Scan on targeting  (cost=0.00..13.88 rows=2 width=222) (actual time=0.015..0.042 rows=79 loops=1)
                                                   Filter: (name = 'age group'::targeting_name)
                                                   Rows Removed by Filter: 82
                                                   Buffers: shared hit=1
                                             ->  Index Scan using advertising_targeting_pkey on adobject_targeting  (cost=0.15..8.17 rows=1 width=8) (actual time=0.001..0.001 rows=0 loops=79)
                                                   Index Cond: (id = targeting.id)
                                                   Buffers: shared hit=81
               ->  Index Scan using "fki_advertising_peformance_advertising_entity_id -> advertising" on adobject_performance  (cost=0.42..89.78 rows=4081 width=32) (actual time=0.007..0.046 rows=140 loops=5998)
                     Index Cond: (ad_id = ad_product.ad_id)
                     Buffers: shared hit=36565
         ->  Index Scan using facebook_advertising_actions_pkey on facebook_adobject_actions  (cost=0.42..0.48 rows=1 width=12) (actual time=0.001..0.002 rows=1 loops=839720)
               Index Cond: (ad_performance.id = ad_performance_id)
               Buffers: shared hit=3363468
 Planning time: 1.525 ms
 Execution time: 3287.324 ms
(46 rows)

This is a shot in the dark, since we don't have the EXPLAIN output yet, but Postgres should handle this query better if you pull your targeting subquery out into a CTE:

WITH targeting AS 
(
        SELECT ad_id, value from targeting, ad_targeting 
        WHERE targeting.id = ad_targeting.id and targeting.name = 'gender' 
)
SELECT ad_performance.ad_id, targeting.value AS targeting_value, 
     sum(impressions) AS impressions, 
     sum(app_starts) AS app_starts
FROM ad_performance
     LEFT JOIN ad on ad.id = ad_performance.ad_id
     LEFT JOIN ad_actions ON ad_performance.id = ad_actions.ad_performance_id
     RIGHT JOIN  targeting ON targeting.ad_id = ad.parent_id
WHERE ad_performance.ad_id IN 
       (SELECT ad_id FROM ad_product WHERE product = 'iphone')
GROUP BY ad_performance.ad_id, targeting_value
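From the PostgreSQL documentation:

A useful property of WITH queries is that they are evaluated only once per execution of the parent query, even if they are referred to more than once by the parent query or sibling WITH queries. Thus, expensive calculations that are needed in multiple places can be placed within a WITH query to avoid redundant work. Another possible application is to prevent unwanted multiple evaluations of functions with side-effects.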
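
One caveat worth illustrating: the "evaluated only once" behaviour describes PostgreSQL versions before 12. From version 12 on, a side-effect-free CTE that is referenced only once may be inlined into the parent query unless it is marked MATERIALIZED. The sketch below shows how the same idea could be written for a newer server. It assumes the table and column names from the question; the CTE name gender_targeting is hypothetical and chosen only to avoid shadowing the targeting table.

```sql
-- Sketch only: requires PostgreSQL 12 or later for the MATERIALIZED keyword.
-- gender_targeting is a hypothetical CTE name; tables and columns are those
-- used in the question's query.
WITH gender_targeting AS MATERIALIZED (
    SELECT ad_targeting.ad_id, targeting.value
    FROM targeting
         JOIN ad_targeting ON targeting.id = ad_targeting.id
    WHERE targeting.name = 'gender'
)
SELECT ad_performance.ad_id,
       gender_targeting.value AS targeting_value,
       sum(impressions) AS impressions,
       sum(app_starts)  AS app_starts
FROM ad_performance
     LEFT JOIN ad ON ad.id = ad_performance.ad_id
     LEFT JOIN ad_actions ON ad_performance.id = ad_actions.ad_performance_id
     RIGHT JOIN gender_targeting ON gender_targeting.ad_id = ad.parent_id
WHERE ad_performance.ad_id IN
      (SELECT ad_id FROM ad_product WHERE product = 'iphone')
GROUP BY ad_performance.ad_id, targeting_value;
```

Whether materializing the CTE actually helps here can only be judged by comparing the resulting plans.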


I don't know whether this query will solve your problem, but give it a try:

 SELECT ad_performance.ad_id, targeting.value AS targeting_value, 
     sum(impressions) AS impressions, 
     sum(app_starts) AS app_starts
 FROM ad_performance
     LEFT JOIN ad on ad.id = ad_performance.ad_id
     LEFT JOIN ad_actions ON ad_performance.id = ad_actions.ad_performance_id
     RIGHT JOIN ad_targeting on ad_targeting.ad_id = ad.parent_id
     INNER JOIN targeting on  targeting.id = ad_targeting.id and targeting.name = 'gender'   
     INNER JOIN ad_product on ad_product.ad_id = ad_performance.ad_id
WHERE ad_product.product = 'iphone'
GROUP BY ad_performance.ad_id, targeting_value

You may also want to create indexes on all the columns you join on or use in WHERE conditions.
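
The indexing suggestion could look roughly like the sketch below. The index names are hypothetical, and the columns are simply those that appear in the join and WHERE conditions of the question's query. Some of these indexes may already exist, since the asker reports having indexes on the foreign keys.

```sql
-- Sketch only: index names are made up for illustration.
-- Note that IF NOT EXISTS checks the index name, not its definition.
CREATE INDEX IF NOT EXISTS ad_product_product_ad_id_idx
    ON ad_product (product, ad_id);          -- WHERE product = ... then ad_id
CREATE INDEX IF NOT EXISTS ad_performance_ad_id_idx
    ON ad_performance (ad_id);               -- join to ad and ad_product
CREATE INDEX IF NOT EXISTS ad_actions_ad_performance_id_idx
    ON ad_actions (ad_performance_id);       -- join to ad_performance
CREATE INDEX IF NOT EXISTS ad_parent_id_idx
    ON ad (parent_id);                       -- join to ad_targeting
CREATE INDEX IF NOT EXISTS ad_targeting_ad_id_idx
    ON ad_targeting (ad_id);                 -- join to ad.parent_id

-- Refresh planner statistics after creating the indexes.
ANALYZE ad_product, ad_performance, ad_actions, ad, ad_targeting;
```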

The execution plan no longer seems to match the query (perhaps you could update the query).

However, the problem is this part:

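->  Hash Join  (cost=30.26..111.69 rows=9 width=222)
               (actual time=0.154..4.660 rows=5998 loops=1)
      Hash Cond: (adobject.parent_id = adobject_targeting.ad_id)
      Buffers: shared hit=151
      ->  Seq Scan on adobject  (cost=0.00..77.97 rows=897 width=8)
                                (actual time=0.009..1.449 rows=6001 loops=1)
            Buffers: shared hit=69
      ->  Hash  (cost=30.24..30.24 rows=2 width=222)
                (actual time=0.132..0.132 rows=2 loops=1)
            Buckets: 1024  Batches: 1  Memory Usage: 1kB
            Buffers: shared hit=82
            ->  Nested Loop  (cost=0.15..30.24 rows=2 width=222)
                             (actual time=0.101..0.129 rows=2 loops=1)
                  Buffers: shared hit=82
                  ->  Seq Scan on targeting  (cost=0.00..13.88 rows=2 width=222)
                                             (actual time=0.015..0.042 rows=79 loops=1)
                        Filter: (name = 'age group'::targeting_name)
                        Rows Removed by Filter: 82
                        Buffers: shared hit=1
                  ->  Index Scan using adobject_targeting_pkey on adobject_targeting
                                             (cost=0.15..8.17 rows=1 width=8)
                                             (actual time=0.001..0.001 rows=0 loops=79)
                        Index Cond: (id = targeting.id)
                        Buffers: shared hit=81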
This is the join between adobject and the result of

targeting JOIN adobject_targeting
   USING (id)
WHERE targeting.name = 'age group'
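The latter subquery is correctly estimated at 2 rows, but PostgreSQL fails to notice that almost every row found in adobject will match one of those two rows, so the join result will be about 6000 rows rather than the 9 it estimates.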

This later leads the optimizer to choose a nested loop join by mistake, and that is where more than half of the query time is spent.

Unfortunately, since PostgreSQL has no cross-table statistics, it has no way of knowing better.

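One crude measure is to SET enable_nestloop = off, but that will also hurt the performance of the other (correctly chosen) nested loop join, so I don't know whether it would be a net win. If it helps, you could consider changing the parameter only for the duration of the query (with a transaction and SET LOCAL).

Maybe there is a way to rewrite the query so that a better plan is found, but that is hard to say without knowing the exact query.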
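
The transaction-scoped variant of that workaround could look like the sketch below. It uses the query as posted in the question, which may differ from the query that actually produced the plan. SET LOCAL only lasts until the end of the current transaction, so other queries in the session keep the default setting.

```sql
-- Sketch only: disable nested loop joins for this one statement.
BEGIN;

-- Takes effect only inside this transaction; reverts at COMMIT or ROLLBACK.
SET LOCAL enable_nestloop = off;

EXPLAIN (ANALYZE, BUFFERS)
SELECT ad_performance.ad_id, targeting.value AS targeting_value,
       sum(impressions) AS impressions,
       sum(app_starts) AS app_starts
FROM ad_performance
     LEFT JOIN ad ON ad.id = ad_performance.ad_id
     LEFT JOIN ad_actions ON ad_performance.id = ad_actions.ad_performance_id
     RIGHT JOIN (
        SELECT ad_id, value FROM targeting, ad_targeting
        WHERE targeting.id = ad_targeting.id AND targeting.name = 'gender'
     ) targeting ON targeting.ad_id = ad.parent_id
WHERE ad_performance.ad_id IN
      (SELECT ad_id FROM ad_product WHERE product = 'iphone')
GROUP BY ad_performance.ad_id, targeting_value;

COMMIT;
```

Comparing the execution time reported here with the original plan shows whether the setting is a net win for this query.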
