PostgreSQL: count query takes too much time

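I have a problem with my query: it takes far too long (2636124 ms!).

The query is generated by the ORM (Django). When I run it through the ORM, my application hangs, and when I run it in psql, psql hangs as well.

EXPLAIN ANALYZE output: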

Aggregate  (cost=329583550.40..329583550.41 rows=1 width=8) (actual time=2636109.932..2636109.933 rows=1 loops=1)
   ->  Seq Scan on dictionary_dictionary  (cost=0.00..329583390.76 rows=63856 width=0) (actual time=2636109.922..2636109.922 rows=0 loops=1)
           Filter: (NOT (SubPlan 1))
           Rows Removed by Filter: 127712
           SubPlan 1
             ->  Materialize  (cost=0.00..4821.74 rows=135828 width=4) (actual time=0.006..12.453 rows=63856 loops=127712)
                ->  Seq Scan on dictionary_frequencydata u1  (cost=0.00..3611.60 rows=135828 width=4) (actual time=0.299..95.915 rows=127712 loops=1)
                     Filter: (user_id = 1)
                     Rows Removed by Filter: 28054
 Planning time: 0.277 ms
 Execution time: 2636124.744 ms
 (11 rows)
My Django models:

class Dictionary(DateTimeModel):
    base_word = models.ForeignKey(BaseDictionary, related_name=_('dict_words'))
    word = models.CharField(max_length=64)
    version = models.ForeignKey(Version)

class FrequencyData(DateTimeModel):
    word = models.ForeignKey(Dictionary, related_name=_('frequency_data'))
    count = models.BigIntegerField(null=True, blank=True)
    source = models.ForeignKey(Source, related_name=_('frequency_data'), null=True, blank=True)
    user = models.ForeignKey(settings.AUTH_USER_MODEL, related_name=_('frequency_data'))
    user_ip_address = models.GenericIPAddressField(null=True, blank=True)
    date_of_checking = models.DateTimeField(null=True, blank=True)
    is_checked = models.BooleanField(default=False)
The table definitions are as follows:

\d+ dictionary_dictionary
                                                                Table "public.dictionary_dictionary"
        Column        |           Type           | Collation | Nullable |                      Default                      | Storage  | Stats target | Description
----------------------+--------------------------+-----------+----------+---------------------------------------------------+----------+--------------+-------------
 id                   | integer                  |           | not null | nextval('dictionary_dictionary_id_seq'::regclass) | plain    |              |
 date_created         | timestamp with time zone |           | not null |                                                   | plain    |              |
 date_modified        | timestamp with time zone |           | not null |                                                   | plain    |              |
 word                 | character varying(64)    |           | not null |                                                   | extended |              |
 algorithm_version_id | integer                  |           | not null |                                                   | plain    |              |
 base_word_id         | integer                  |           | not null |                                                   | plain    |              |
Indexes:
    "dictionary_dictionary_pkey" PRIMARY KEY, btree (id)
    "dictionary_[…]_algorithm_version_id_0f0af100" btree (algorithm_version_id)
    "dictionary_dictionary_base_word_id_8db15cb4" btree (base_word_id)
Foreign-key constraints:
    "dictionary_algorithm_version_id_0f0af100_fk_[…]_" FOREIGN KEY (algorithm_version_id) REFERENCES dictionary_algorithmversion(id) DEFERRABLE INITIALLY DEFERRED
    "dictionary_base_word_id_8db15cb4_fk_[…]" FOREIGN KEY (base_word_id) REFERENCES dictionary_[…]dictionary(id) DEFERRABLE INITIALLY DEFERRED
Referenced by:
    TABLE "dictionary_frequencydata" CONSTRAINT "dictionary_word_id_c231110d_fk_[…]" FOREIGN KEY (word_id) REFERENCES dictionary_dictionary(id) DEFERRABLE INITIALLY DEFERRED
=========
\d+ dictionary_frequencydata
                                                             Table "public.dictionary_frequencydata"
      Column      |           Type           | Collation | Nullable |                       Default                        | Storage | Stats target | Description
------------------+--------------------------+-----------+----------+------------------------------------------------------+---------+--------------+-------------
 id               | integer                  |           | not null | nextval('dictionary_frequencydata_id_seq'::regclass) | plain   |              |
 date_created     | timestamp with time zone |           | not null |                                                      | plain   |              |
 date_modified    | timestamp with time zone |           | not null |                                                      | plain   |              |
 count            | bigint                   |           |          |                                                      | plain   |              |
 user_ip_address  | inet                     |           |          |                                                      | main    |              |
 date_of_checking | timestamp with time zone |           |          |                                                      | plain   |              |
 is_checked       | boolean                  |           | not null |                                                      | plain   |              |
 source_id        | integer                  |           |          |                                                      | plain   |              |
 user_id          | integer                  |           | not null |                                                      | plain   |              |
 word_id          | integer                  |           | not null |                                                      | plain   |              |
Indexes:
    "dictionary_frequencydata_pkey" PRIMARY KEY, btree (id)
    "dictionary_frequencydata_source_id_[…]" btree (source_id)
    "dictionary_frequencydata_user_id_c6dfedce" btree (user_id)
    "dictionary_frequencydata_word_id_c231110d" btree (word_id)
Foreign-key constraints:
    "dictionary_source_id_38bb205a_fk_[…]_" FOREIGN KEY (source_id) REFERENCES dictionary_frequencysource(id) DEFERRABLE INITIALLY DEFERRED
    "dictionary_user_id_c6dfedce_fk_auth_user" FOREIGN KEY (user_id) REFERENCES auth_user(id) DEFERRABLE INITIALLY DEFERRED
    "dictionary_word_id_c231110d_fk_[…]" FOREIGN KEY (word_id) REFERENCES dictionary_dictionary(id) DEFERRABLE INITIALLY DEFERRED
It runs on shared hosting.
Table sizes: Dictionary has about 120k rows and FrequencyData about 160k rows.

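The seq scan on dictionary_dictionary with the Filter operation appears to be quite expensive: the plan shows the materialized SubPlan 1 being rescanned once per row (loops=127712), although the scan of dictionary_frequencydata itself finishes in under 100 ms. In this case your query should run much faster if you rewrite it as below, because both subqueries are fast. The end result is equivalent to the query generated by Django: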
SELECT tot - excl
FROM (SELECT count(*) AS tot
      FROM dictionary_dictionary) t1,
     (SELECT count(DISTINCT d.id) AS excl
      FROM dictionary_dictionary d
      JOIN dictionary_frequencydata f
        ON d.id = f.word_id
      WHERE f.user_id = 1) t2;
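
The equivalence claim can be checked on a toy dataset. The following is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL so that it runs without a server, the rows are made up, and the helper name count_words_without_data is hypothetical. It compares results only and says nothing about performance.

```python
import sqlite3

# In-memory stand-in for the two tables (assumption: SQLite instead of
# PostgreSQL, reduced to the columns the queries actually touch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dictionary_dictionary (id INTEGER PRIMARY KEY);
    CREATE TABLE dictionary_frequencydata (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL,
        word_id INTEGER NOT NULL REFERENCES dictionary_dictionary (id)
    );
""")

# Made-up data: words 1..10; user 1 has data for words 1, 2, 3
# (word 2 twice), user 2 has data for words 3 and 4.
conn.executemany("INSERT INTO dictionary_dictionary (id) VALUES (?)",
                 [(i,) for i in range(1, 11)])
conn.executemany(
    "INSERT INTO dictionary_frequencydata (user_id, word_id) VALUES (?, ?)",
    [(1, 1), (1, 2), (1, 2), (1, 3), (2, 3), (2, 4)],
)

# The NOT IN form, as generated by Django.
NOT_IN_QUERY = """
    SELECT COUNT(*) FROM dictionary_dictionary
    WHERE NOT (id IN (SELECT DISTINCT u1.word_id
                      FROM dictionary_frequencydata u1
                      WHERE u1.user_id = ?))
"""

# The rewritten form: total rows minus distinct words that have data.
REWRITTEN_QUERY = """
    SELECT tot - excl
    FROM (SELECT COUNT(*) AS tot FROM dictionary_dictionary) t1,
         (SELECT COUNT(DISTINCT d.id) AS excl
          FROM dictionary_dictionary d
          JOIN dictionary_frequencydata f ON d.id = f.word_id
          WHERE f.user_id = ?) t2
"""


def count_words_without_data(query, user_id):
    """Run one of the two count queries for the given user."""
    return conn.execute(query, (user_id,)).fetchone()[0]


not_in_count = count_words_without_data(NOT_IN_QUERY, 1)
rewritten_count = count_words_without_data(REWRITTEN_QUERY, 1)
print(not_in_count, rewritten_count)
```

Both forms agree here because word_id is declared NOT NULL; with NULLs in the subquery result, NOT IN would behave differently and the two queries would no longer be interchangeable.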

For comparison, the original query generated by Django:

SELECT COUNT(*) AS "__count"
FROM "dictionary_dictionary"
WHERE NOT ("dictionary_dictionary"."id" IN (SELECT DISTINCT U1."word_id" AS Col1
                                            FROM "dictionary_frequencydata" U1
                                            WHERE U1."user_id" = 1));
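
As a rough illustration of why the NOT IN form is so slow here, the figures from the EXPLAIN ANALYZE output in the question can be multiplied out. In EXPLAIN ANALYZE, the actual time and rows reported for a node are per-loop averages, so totals come from multiplying by loops. The sketch below only does that arithmetic; the variable names are mine and the numbers are copied from the plan.

```python
# Figures copied from the EXPLAIN ANALYZE output in the question.
subplan_loops = 127712      # "loops" on the Materialize node (one per outer row)
rows_per_loop = 63856       # average rows returned per rescan of the subplan
ms_per_loop = 12.453        # average "actual time" per rescan, in milliseconds
execution_ms = 2636124.744  # total execution time reported by the plan

# Per-loop averages times the number of loops give the totals.
rows_scanned = subplan_loops * rows_per_loop
materialize_ms = subplan_loops * ms_per_loop

print(f"rows read from the materialized subplan: {rows_scanned:,}")
print(f"time inside the Materialize node: {materialize_ms / 60000:.1f} min")
print(f"total execution time: {execution_ms / 60000:.1f} min")
```

On these figures the filter walked roughly 8.2 billion materialized rows, and about 26.5 of the 43.9 minutes were spent inside the Materialize node alone. The rest is presumably the comparison work in the filter itself.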