Python Django prefetch_related with a large dataset
I'm currently having a problem with Django's prefetch_related. As an example, let's imagine these models:
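from django.db import models


class Client(models.Model):
    name = models.CharField(max_length=255)


class Purchase(models.Model):
    # on_delete is required since Django 2.0; CASCADE was the implicit default before that.
    client = models.ForeignKey('Client', on_delete=models.CASCADE)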
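Let's imagine we have only a few clients, around 200, but they buy a lot, so we have millions of purchases.
If I had to create a web page that displays all the clients and the number of purchases for each client, I would have to write something like this: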
from django.db.models import Prefetch
from .models import Purchase, Client

purchases = Purchase.objects.all()
# prefetch_related is a manager/queryset method, so it is called on Client.objects.
clients = Client.objects.prefetch_related(Prefetch('purchase_set', queryset=purchases))
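The problem here is that I'm going to query the huge purchases table, and that query can take more than a minute or, even worse, raise a MemoryError on the server.
So I tried to select only a batch of purchases with: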
purchases = Purchase.objects.all()[:9]
But as we might expect, Django doesn't like that and raises this exception:
Traceback (most recent call last):
  File "project/venv/lib/python3.6/site-packages/django/core/handlers/base.py", line 149, in get_response
    response = self.process_exception_by_middleware(e, request)
  File "project/venv/lib/python3.6/site-packages/django/core/handlers/base.py", line 147, in get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "project/venv/lib/python3.6/site-packages/django/views/generic/base.py", line 68, in view
    return self.dispatch(request, *args, **kwargs)
  File "project/venv/lib/python3.6/site-packages/django/utils/decorators.py", line 67, in _wrapper
    return bound_func(*args, **kwargs)
  File "project/venv/lib/python3.6/site-packages/django/views/decorators/cache.py", line 57, in _wrapped_view_func
    response = view_func(request, *args, **kwargs)
  File "project/venv/lib/python3.6/site-packages/django/utils/decorators.py", line 63, in bound_func
    return func.__get__(self, type(self))(*args2, **kwargs2)
  ****************** login decorators, views, ...
  File "project/***.py", line ***, in ***
    for client in clients:
  File "project/venv/lib/python3.6/site-packages/django/db/models/query.py", line 258, in __iter__
    self._fetch_all()
  File "project/venv/lib/python3.6/site-packages/django/db/models/query.py", line 1076, in _fetch_all
    self._prefetch_related_objects()
  File "project/venv/lib/python3.6/site-packages/django/db/models/query.py", line 656, in _prefetch_related_objects
    prefetch_related_objects(self._result_cache, self._prefetch_related_lookups)
  File "project/venv/lib/python3.6/site-packages/django/db/models/query.py", line 1457, in prefetch_related_objects
    obj_list, additional_lookups = prefetch_one_level(obj_list, prefetcher, lookup, level)
  File "project/venv/lib/python3.6/site-packages/django/db/models/query.py", line 1556, in prefetch_one_level
    prefetcher.get_prefetch_queryset(instances, lookup.get_current_queryset(level)))
  File "project/venv/lib/python3.6/site-packages/django/db/models/fields/related_descriptors.py", line 539, in get_prefetch_queryset
    queryset = queryset.filter(**query)
  File "project/venv/lib/python3.6/site-packages/django/db/models/query.py", line 790, in filter
    return self._filter_or_exclude(False, *args, **kwargs)
  File "project/venv/lib/python3.6/site-packages/django/db/models/query.py", line 802, in _filter_or_exclude
    "Cannot filter a query once a slice has been taken."
AssertionError: Cannot filter a query once a slice has been taken.
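As the last frames of the traceback show, the prefetch step calls filter() on the supplied queryset, and Django refuses to filter a queryset that has already been sliced. One way to see why the order matters: applying a LIMIT before the filter and applying it after the filter can return different rows. The sketch below uses Python's sqlite3 module with a hypothetical purchase table (the table name, columns and SQL are assumptions for illustration, not what Django generates):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchase (id INTEGER PRIMARY KEY, client_id INTEGER)")
# Purchases 1-3 belong to client 1, purchases 4-6 belong to client 2.
conn.executemany(
    "INSERT INTO purchase (id, client_id) VALUES (?, ?)",
    [(1, 1), (2, 1), (3, 1), (4, 2), (5, 2), (6, 2)],
)

# Slice first, then filter: only the first 4 rows are candidates for the filter.
slice_then_filter = conn.execute(
    "SELECT id FROM (SELECT * FROM purchase ORDER BY id LIMIT 4) "
    "WHERE client_id = 2 ORDER BY id"
).fetchall()

# Filter first, then slice: up to 4 rows that belong to client 2.
filter_then_slice = conn.execute(
    "SELECT id FROM purchase WHERE client_id = 2 ORDER BY id LIMIT 4"
).fetchall()

print(slice_then_filter)  # [(4,)]
print(filter_then_slice)  # [(4,), (5,), (6,)]
```

Because the two orderings disagree, a sliced queryset cannot simply have the prefetch filter added on top of it.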
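So for now, I have no real solution. I'm studying how the __iter__ function at django/db/models/query.py:258 is built, to try to write a function with the same behavior that takes a limited set in the prefetch, so that it can be paginated and run in a more parallel way.
Is there a "good way" to do this kind of query?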
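"Let's imagine we have only a few clients, around 200, but they buy a lot, so we have millions of purchases. If I had to create a web page that displays all the clients and the number of purchases for each client"
I'm going to interpret your question as wanting exactly this feature. Have you tried: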
from django.db.models import Count
clients = Client.objects.annotate(num_purchases=Count('purchase'))
clients[0].num_purchases
If you want to sort and get the clients with the most purchases, you can also do:
clients = Client.objects.annotate(num_purchases=Count('purchase')).order_by('-num_purchases')[:5]
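For intuition, the annotated queryset lets the database do the counting, so one row per client comes back instead of millions of purchase rows. Here is a rough sketch of the kind of SQL involved, run through Python's sqlite3 module against hypothetical client and purchase tables (the table names, column names and sample data are assumptions, not Django's verbatim output):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE client (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE purchase (
        id INTEGER PRIMARY KEY,
        client_id INTEGER REFERENCES client(id)
    );
""")
conn.executemany(
    "INSERT INTO client (id, name) VALUES (?, ?)",
    [(1, "alice"), (2, "bob"), (3, "carol")],
)
# alice has three purchases, bob has one, carol has none.
conn.executemany(
    "INSERT INTO purchase (client_id) VALUES (?)",
    [(1,), (1,), (1,), (2,)],
)

# Count purchases per client in the database; the LEFT OUTER JOIN keeps
# clients without purchases, and COUNT(purchase.id) yields 0 for them.
rows = conn.execute("""
    SELECT client.id, client.name, COUNT(purchase.id) AS num_purchases
    FROM client
    LEFT OUTER JOIN purchase ON purchase.client_id = client.id
    GROUP BY client.id, client.name
    ORDER BY num_purchases DESC
    LIMIT 5
""").fetchall()

print(rows)  # [(1, 'alice', 3), (2, 'bob', 1), (3, 'carol', 0)]
```

Only the aggregated counts cross the wire, which is why this avoids the memory problem of loading every purchase into Python.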
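For more functionality, see the Django documentation.
Thank you very much, this is exactly what I was looking for. Sorry, I didn't read the manual thoroughly enough ^^"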