Python Django left outer join
I have a website where users can view a list of movies and create reviews for them.
Users should be able to see the list of all movies. In addition, if they have reviewed a movie, they should be able to see the score they gave it. If not, the movie is simply shown without a score.
They do not care at all about the scores given by other users.
Consider the following models.py:
from django.contrib.auth.models import User
from django.db import models


class Topic(models.Model):
    name = models.TextField()

    def __str__(self):
        return self.name


class Record(models.Model):
    # on_delete is mandatory from Django 2.0 on; CASCADE was the old default
    user = models.ForeignKey(User, on_delete=models.CASCADE)
    topic = models.ForeignKey(Topic, on_delete=models.CASCADE)
    value = models.TextField()

    class Meta:
        unique_together = ("user", "topic")
What I really want is this:
select * from bar_topic
left join (select topic_id as tid, value from bar_record where user_id = 1)
on tid = bar_topic.id
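To make the target concrete, here is a minimal sketch that runs this kind of query against an in-memory SQLite database using only the standard library. The table layout mirrors what Django would generate for the models above, but that layout, the integer `value` column, the subquery alias `r`, and the helper names `build_db` and `scores_for` are assumptions made for illustration.

```python
import sqlite3


def build_db():
    """Create an in-memory database shaped like bar_topic / bar_record."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE bar_topic (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE bar_record (
            id INTEGER PRIMARY KEY,
            user_id INTEGER,
            topic_id INTEGER REFERENCES bar_topic(id),
            value INTEGER,
            UNIQUE (user_id, topic_id)
        );
        INSERT INTO bar_topic (id, name) VALUES (1, 'A'), (2, 'B'), (3, 'C');
        -- user 1 scored A and C; user 2 scored all three topics
        INSERT INTO bar_record (user_id, topic_id, value) VALUES
            (1, 1, 1), (1, 3, 3),
            (2, 1, 4), (2, 2, 5), (2, 3, 6);
    """)
    return conn


def scores_for(conn, user_id):
    """Return every topic with the given user's value, or None if absent."""
    sql = """
        SELECT bar_topic.name, r.value
        FROM bar_topic
        LEFT JOIN (
            SELECT topic_id AS tid, value FROM bar_record WHERE user_id = ?
        ) AS r ON r.tid = bar_topic.id
        ORDER BY bar_topic.id
    """
    return conn.execute(sql, (user_id,)).fetchall()


conn = build_db()
johnny_rows = scores_for(conn, 1)
mary_rows = scores_for(conn, 2)
print(johnny_rows)  # [('A', 1), ('B', None), ('C', 3)]
print(mary_rows)    # [('A', 4), ('B', 5), ('C', 6)]
```

Every topic appears exactly once, and a topic the user has not scored comes back with `None`.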
Consider the following test.py for context:
from django.test import TestCase
from bar.models import *
from django.db.models import Q


class TestSuite(TestCase):

    def setUp(self):
        t1 = Topic.objects.create(name="A")
        t2 = Topic.objects.create(name="B")
        t3 = Topic.objects.create(name="C")
        # 2 for Johnny
        johnny = User.objects.create(username="Johnny")
        johnny.record_set.create(topic=t1, value=1)
        johnny.record_set.create(topic=t3, value=3)
        # 3 for Mary
        mary = User.objects.create(username="Mary")
        mary.record_set.create(topic=t1, value=4)
        mary.record_set.create(topic=t2, value=5)
        mary.record_set.create(topic=t3, value=6)

    def test_raw(self):
        print('\nraw\n---')
        with self.assertNumQueries(1):
            topics = Topic.objects.raw('''
                select * from bar_topic
                left join (select topic_id as tid, value from bar_record where user_id = 1)
                on tid = bar_topic.id
                ''')
            for topic in topics:
                print(topic, topic.value)

    def test_orm(self):
        print('\norm\n---')
        with self.assertNumQueries(1):
            topics = Topic.objects.filter(Q(record__user_id=1)).values_list('name', 'record__value')
            for topic in topics:
                print(*topic)
Both tests should print exactly the same output; however, only the raw version prints the correct table of results:
raw
---
A 1
B None
C 3
How can I get the simple behaviour of the raw query using the Django ORM?
Edit: this sort of works, but seems very poor:
from itertools import chain

topics = Topic.objects.filter(record__user_id=1).values_list('name', 'record__value')
noned = Topic.objects.exclude(record__user_id=1).values_list('name')
for topic in chain(topics, noned):
    ...
Edit: this works slightly better, but is still not good:
topics = Topic.objects.filter(record__user_id=1).annotate(value=F('record__value'))
topics |= Topic.objects.exclude(pk__in=topics)
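orm
---
A 1
B 5
C 3

First of all, there is no way (as of Django 1.9.7) to express the raw query you posted with Django's ORM exactly as you want it; however, you can get the same desired result with something like this: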
>>> from django.db.models import Case, F, IntegerField, When
>>> Topic.objects.annotate(
...     f=Case(
...         When(
...             record__user=johnny,
...             then=F('record__value')
...         ),
...         output_field=IntegerField()
...     )
... ).order_by(
...     'id', 'name', 'f'
... ).distinct(
...     'id', 'name'
... ).values_list(
...     'name', 'f'
... )
[(u'A', 1), (u'B', None), (u'C', 3)]

>>> Topic.objects.annotate(f=Case(When(record__user=mary, then=F('record__value')), output_field=IntegerField())).order_by('id', 'name', 'f').distinct('id', 'name').values_list('name', 'f')
[(u'A', 4), (u'B', 5), (u'C', 6)]

Here is the SQL generated for the first query:

>>> print(Topic.objects.annotate(f=Case(When(record__user=johnny, then=F('record__value')), output_field=IntegerField())).order_by('id', 'name', 'f').distinct('id', 'name').values_list('name', 'f').query)
SELECT DISTINCT ON ("payments_topic"."id", "payments_topic"."name") "payments_topic"."name", CASE WHEN "payments_record"."user_id" = 1 THEN "payments_record"."value" ELSE NULL END AS "f" FROM "payments_topic" LEFT OUTER JOIN "payments_record" ON ("payments_topic"."id" = "payments_record"."topic_id") ORDER BY "payments_topic"."id" ASC, "payments_topic"."name" ASC, "f" ASC
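## Some notes
- Don't hesitate to use raw queries, especially when performance matters most. Sometimes they are a must, because the same result cannot be obtained with Django's ORM; in other cases it can, but now and then clean, understandable code matters more than the performance of that one piece of code.
- This answer uses `distinct` with positional arguments, which is only available on PostgreSQL at the moment. The documentation has more information about `distinct`.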
With null=True on the ForeignKey, you can force the ORM to use a LEFT OUTER JOIN. Leaving the tables as they are,
print(Record.objects.filter(user_id=8).select_related('topic').query)
The result is
SELECT "bar_record"."id", "bar_record"."user_id", "bar_record"."topic_id", "bar_record"."value", "bar_topic"."id", "bar_topic"."name" FROM "bar_record"
INNER JOIN "bar_topic" ON ( "bar_record"."topic_id" = "bar_topic"."id" ) WHERE "bar_record"."user_id" = 8
Now set null=True and run the same ORM query as above. The result is
SELECT "bar_record"."id", "bar_record"."user_id", "bar_record"."topic_id", "bar_record"."value", "bar_topic"."id", "bar_topic"."name" FROM "bar_record"
LEFT OUTER JOIN "bar_topic" ON ( "bar_record"."topic_id" = "bar_topic"."id" ) WHERE "bar_record"."user_id" = 8
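Notice how the query has suddenly changed to a LEFT OUTER JOIN. But we are not out of the woods yet, because the order of the tables should be reversed! So unless you can restructure your models, an ORM LEFT OUTER JOIN may not be entirely possible without chaining or a UNION, both of which you have already tried.

Here is how I would do it, with two queries instead of one: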
class Topic(models.Model):
    # ...

    @property
    def user_value(self):
        try:
            return self.user_records[0].value
        except IndexError:
            # This topic does not have a review by request.user
            return None
        except AttributeError:
            # user_records only exists once the Prefetch below has run
            raise AttributeError('You forgot to prefetch the user_records')
            # or you can just: return None


# usage
topics = Topic.objects.all().prefetch_related(
    models.Prefetch(
        'record_set',
        queryset=Record.objects.filter(user=request.user),
        to_attr='user_records'
    )
)
for topic in topics:
    print(topic.user_value)
The benefit is that you get the whole Record object. So consider a situation where you want to show not only the `value` but also, say, a `timestamp`.
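The two-query idea can be illustrated without Django. The sketch below uses the standard-library sqlite3 module: one query fetches every topic, a second fetches only the given user's records, and the two are stitched together in Python. The schema, the integer `value` column and the function name `topics_with_user_value` are assumptions for illustration, and the second query is a simplified stand-in for what a prefetch would issue.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bar_topic (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE bar_record (
        id INTEGER PRIMARY KEY,
        user_id INTEGER,
        topic_id INTEGER,
        value INTEGER,
        UNIQUE (user_id, topic_id)
    );
    INSERT INTO bar_topic (id, name) VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO bar_record (user_id, topic_id, value) VALUES
        (1, 1, 1), (1, 3, 3),
        (2, 1, 4), (2, 2, 5), (2, 3, 6);
""")


def topics_with_user_value(conn, user_id):
    # Query 1: every topic, regardless of who reviewed it.
    topics = conn.execute(
        "SELECT id, name FROM bar_topic ORDER BY id"
    ).fetchall()
    # Query 2: only this user's records, keyed by topic id.
    records = dict(conn.execute(
        "SELECT topic_id, value FROM bar_record WHERE user_id = ?",
        (user_id,),
    ).fetchall())
    # Stitch the two result sets together; a missing key means "no review".
    return [(name, records.get(topic_id)) for topic_id, name in topics]


johnny_rows = topics_with_user_value(conn, 1)
print(johnny_rows)  # [('A', 1), ('B', None), ('C', 3)]
```

The number of queries stays at two no matter how many topics there are, which is the point of prefetching rather than looking up the record per topic.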
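For the record, I would like to show one more solution, using .extra. I am surprised nobody has mentioned it, because it should give the best possible performance.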
topics = Topic.objects.all().extra(
    select={
        'user_value': """SELECT value FROM myapp_record
            WHERE myapp_record.user_id = %s
            AND myapp_record.topic_id = myapp_topic.id
        """
    },
    select_params=(request.user.id,)
)
for topic in topics:
    print(topic.user_value)
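The select passed to .extra is a correlated scalar subquery: for each topic row, the inner SELECT looks up that user's value. A rough standalone sketch of the same SQL shape with sqlite3 follows; the table names `bar_topic` and `bar_record`, the integer `value` column and the helper name `user_values` are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bar_topic (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE bar_record (
        id INTEGER PRIMARY KEY,
        user_id INTEGER,
        topic_id INTEGER,
        value INTEGER,
        UNIQUE (user_id, topic_id)
    );
    INSERT INTO bar_topic (id, name) VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO bar_record (user_id, topic_id, value) VALUES
        (1, 1, 1), (1, 3, 3),
        (2, 1, 4), (2, 2, 5), (2, 3, 6);
""")

# The inner SELECT refers to t.id from the outer query, so it is evaluated
# per topic. When no record matches, the scalar subquery yields NULL.
SQL = """
    SELECT t.name,
           (SELECT r.value
              FROM bar_record AS r
             WHERE r.user_id = ?
               AND r.topic_id = t.id) AS user_value
    FROM bar_topic AS t
    ORDER BY t.id
"""


def user_values(conn, user_id):
    return conn.execute(SQL, (user_id,)).fetchall()


johnny_rows = user_values(conn, 1)
print(johnny_rows)  # [('A', 1), ('B', None), ('C', 3)]
```

The unique constraint on (user_id, topic_id) is what guarantees the subquery returns at most one row per topic.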
Both solutions can be abstracted into a custom TopicQuerySet class for reuse:
class TopicQuerySet(models.QuerySet):

    def prefetch_user_records(self, user):
        return self.prefetch_related(
            models.Prefetch(
                'record_set',
                # use the user argument, not request.user (not in scope here)
                queryset=Record.objects.filter(user=user),
                to_attr='user_records'
            )
        )

    def annotate_user_value(self, user):
        return self.extra(
            select={
                'user_value': """SELECT value FROM myapp_record
                    WHERE myapp_record.user_id = %s
                    AND myapp_record.topic_id = myapp_topic.id
                """
            },
            select_params=(user.id,)
        )


class Topic(models.Model):
    # ...
    objects = TopicQuerySet.as_manager()


# usage
topics = Topic.objects.all().annotate_user_value(request.user)
# or
topics = Topic.objects.all().prefetch_user_records(request.user)
for topic in topics:
    print(topic.user_value)
This more general solution, inspired by the Case/When answer above, also works with other databases:
>>> qs = Topic.objects.annotate(
... f=Max(Case(When(record__user=johnny, then=F('record__value'))))
... )
On the example data:
>>> print(qs.values_list('name', 'f'))
[(u'A', 1), (u'B', None), (u'C', 3)]
Checking the query:
>>> print(qs.query)   # formatted, with excessive double quotes removed
SELECT bar_topic.id, bar_topic.name,
MAX(CASE WHEN bar_record.user_id = 1 THEN bar_record.value ELSE NULL END) AS f
FROM bar_topic LEFT OUTER JOIN bar_record ON (bar_topic.id = bar_record.topic_id)
GROUP BY bar_topic.id, bar_topic.name
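To see why the aggregate is needed, it helps to run the join with and without it. The following is a minimal sqlite3 sketch under an assumed schema with an integer `value` column: the plain left join yields one row per (topic, record) pair, and MAX over the CASE expression collapses those back to one row per topic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bar_topic (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE bar_record (
        id INTEGER PRIMARY KEY,
        user_id INTEGER,
        topic_id INTEGER,
        value INTEGER,
        UNIQUE (user_id, topic_id)
    );
    INSERT INTO bar_topic (id, name) VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO bar_record (user_id, topic_id, value) VALUES
        (1, 1, 1), (1, 3, 3),
        (2, 1, 4), (2, 2, 5), (2, 3, 6);
""")

# Without aggregation: every record of every user produces a row,
# with NULL wherever the record belongs to somebody else.
UNGROUPED = """
    SELECT bar_topic.name,
           CASE WHEN bar_record.user_id = ? THEN bar_record.value ELSE NULL END
    FROM bar_topic
    LEFT OUTER JOIN bar_record ON bar_topic.id = bar_record.topic_id
"""

# With MAX + GROUP BY: NULLs are ignored, leaving one row per topic.
GROUPED = """
    SELECT bar_topic.name,
           MAX(CASE WHEN bar_record.user_id = ? THEN bar_record.value ELSE NULL END) AS f
    FROM bar_topic
    LEFT OUTER JOIN bar_record ON bar_topic.id = bar_record.topic_id
    GROUP BY bar_topic.id, bar_topic.name
    ORDER BY bar_topic.id
"""

ungrouped_rows = conn.execute(UNGROUPED, (1,)).fetchall()
grouped_rows = conn.execute(GROUPED, (1,)).fetchall()
print(len(ungrouped_rows))  # 5, one per record in the sample data
print(grouped_rows)         # [('A', 1), ('B', None), ('C', 3)]
```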
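Advantages (compared with the original solutions)
- It also works with SQLite.
- The queryset can easily be filtered or sorted, no matter how.
- No type cast via `output_field` is necessary.
- The methods `values` or `values_list(*field_names)` are useful for a simpler GROUP BY, but they are not necessary.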
The left join can be wrapped in a small helper function to make it more readable:
from django.db.models import Max, Case, When, F

def left_join(result_field, **lookups):
    return Max(Case(When(then=F(result_field), **lookups)))

>>> Topic.objects.annotate(
...     record_value=left_join('record__value', record__user=johnny),
... ).values_list('name', 'record_value')
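More fields from the records can be added through the annotate method, giving results with nice, memorable names.
I agree with the other authors that it can be optimized, but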
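EDIT: Replacing the aggregation function Max with Min gives the same result. Both Min and Max ignore NULL values and can be used on any type, e.g. strings. The aggregation is useful when the left join is not guaranteed to be unique. If the field is numeric, it can be useful to apply the average, Avg, to the left join.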
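The NULL-handling claim is easy to check directly. This small sqlite3 sketch (the table `sample` and its column `v` are made up for the demonstration) shows MIN, MAX and AVG skipping NULLs, returning NULL when there is nothing else, and diverging once more than one non-NULL value is present:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sample (grp TEXT, v INTEGER)")
conn.executemany(
    "INSERT INTO sample (grp, v) VALUES (?, ?)",
    [
        ("sparse", None), ("sparse", 3), ("sparse", None),  # one real value
        ("empty", None), ("empty", None),                   # only NULLs
        ("pair", 2), ("pair", 5), ("pair", None),           # two real values
    ],
)


def aggregates(group):
    """Return (MIN, MAX, AVG) of v for one group."""
    return conn.execute(
        "SELECT MIN(v), MAX(v), AVG(v) FROM sample WHERE grp = ?", (group,)
    ).fetchone()


sparse = aggregates("sparse")
empty = aggregates("empty")
pair = aggregates("pair")
print(sparse)  # (3, 3, 3.0): with a single match MIN and MAX agree
print(empty)   # (None, None, None): nothing but NULLs gives NULL
print(pair)    # (2, 5, 3.5): with several matches the choice matters
```

With the unique constraint on (user, topic) from the question, each topic is in the "sparse" or "empty" situation, which is why Min and Max are interchangeable there.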
What I really want is this:
select * from bar_topic
left join (select topic_id as tid, value from bar_record where user_id = 1)
on tid = bar_topic.id
…or, perhaps, this equivalent that avoids the subquery:
select * from bar_topic
left join bar_record
on bar_record.topic_id = bar_topic.id and bar_record.user_id = 1
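I want to know how to do this effectively or, if it is impossible, an explanation of why it is impossible.

It is impossible with Django's ORM unless you use raw queries, and here is why.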
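QuerySet objects (django.db.models.query.QuerySet) have a query attribute (django.db.models.sql.query.Query), which represents the actual query that will be executed. These Query objects have a __str__ method, so you can print one out to see what it is.
Let's start with a simple queryset: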
>>> from bar.models import *
>>> qs = Topic.objects.filter(record__user_id=1)
>>> print qs.query
SELECT "bar_topic"."id", "bar_topic"."name" FROM "bar_topic" INNER JOIN "bar_record" ON ("bar_topic"."id" = "bar_record"."topic_id") WHERE "bar_record"."user_id" = 1
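…which obviously isn't going to work, due to the INNER JOIN.
Taking a deeper look inside the Query object, there is an alias_map attribute which determines which table joins will be performed: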
>>> from pprint import pprint
>>> pprint(qs.query.alias_map)
{u'bar_record': JoinInfo(table_name=u'bar_record', rhs_alias=u'bar_record', join_type='INNER JOIN', lhs_alias=u'bar_topic', lhs_join_col=u'id', rhs_join_col='topic_id', nullable=True),
u'bar_topic': JoinInfo(table_name=u'bar_topic', rhs_alias=u'bar_topic', join_type=None, lhs_alias=None, lhs_join_col=None, rhs_join_col=None, nullable=False),
u'auth_user': JoinInfo(table_name=u'auth_user', rhs_alias=u'auth_user', join_type='INNER JOIN', lhs_alias=u'bar_record', lhs_join_col='user_id', rhs_join_col=u'id', nullable=False)}
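Note that Django only supports two possible join_types: INNER JOIN and LEFT OUTER JOIN.
Now, the Query object can be made to use a LEFT OUTER JOIN on the bar_record table, and the query becomes:
>>> print qs.query
SELECT "bar_topic"."id", "bar_topic"."name" FROM "bar_topic" LEFT OUTER JOIN "bar_record" ON ("bar_topic"."id" = "bar_record"."topic_id") WHERE "bar_record"."user_id" = 1
This is still no use: the WHERE clause on user_id discards every topic that has no matching record for that user. Using values_list() ends up with the same kind of query:
>>> qs = Topic.objects.filter(record__user_id=1).values_list('name', 'record__value')
>>> print qs.query
SELECT "bar_topic"."name", "bar_record"."value" FROM "bar_topic" LEFT OUTER JOIN "bar_record" ON ("bar_topic"."id" = "bar_record"."topic_id") WHERE "bar_record"."user_id" = 1
The joins generated here are all of the form
(LEFT OUTER|INNER) JOIN <lhs_alias> ON (<lhs_alias>.<lhs_join_col> = <rhs_alias>.<rhs_join_col>)
so the extra condition on user_id has no place in the ON clause, which is exactly where the desired SQL needs it.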
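The difference between filtering in WHERE and filtering in ON can be reproduced outside Django. Here is a minimal sqlite3 sketch under an assumed schema with an integer `value` column; the two statements differ only in where the user_id condition sits.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bar_topic (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE bar_record (
        id INTEGER PRIMARY KEY,
        user_id INTEGER,
        topic_id INTEGER,
        value INTEGER,
        UNIQUE (user_id, topic_id)
    );
    INSERT INTO bar_topic (id, name) VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO bar_record (user_id, topic_id, value) VALUES
        (1, 1, 1), (1, 3, 3),
        (2, 1, 4), (2, 2, 5), (2, 3, 6);
""")

# Condition in WHERE: applied after the join, so rows whose record belongs
# to another user (or is missing) are thrown away along with their topic.
FILTER_IN_WHERE = """
    SELECT t.name, r.value
    FROM bar_topic AS t
    LEFT OUTER JOIN bar_record AS r ON t.id = r.topic_id
    WHERE r.user_id = ?
    ORDER BY t.id
"""

# Condition in ON: applied while matching, so an unmatched topic survives
# with NULL in the record columns.
FILTER_IN_ON = """
    SELECT t.name, r.value
    FROM bar_topic AS t
    LEFT OUTER JOIN bar_record AS r ON t.id = r.topic_id AND r.user_id = ?
    ORDER BY t.id
"""

where_rows = conn.execute(FILTER_IN_WHERE, (1,)).fetchall()
on_rows = conn.execute(FILTER_IN_ON, (1,)).fetchall()
print(where_rows)  # [('A', 1), ('C', 3)]
print(on_rows)     # [('A', 1), ('B', None), ('C', 3)]
```

Topic B disappears from the first result because its only record belongs to user 2, so the WHERE clause rejects the joined row.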
The subquery-free raw query from the question,
select * from bar_topic
left join bar_record
on bar_record.topic_id = bar_topic.id and bar_record.user_id = 1
can be expressed with FilteredRelation (available since Django 2.0), for example inside a test method:
    # requires: from django.db.models import FilteredRelation, Q
    print('\nnew orm\n---')
    with self.assertNumQueries(1):
        topics = Topic.objects.annotate(
            filtered_record=FilteredRelation('record', condition=Q(record__user_id=1)),
        ).values_list('name', 'filtered_record__value')
        for topic in topics:
            print(*topic)
It prints:
new orm
---
A 1
B None
C 3
and the generated SQL is:
SELECT "bar_topic"."name", filtered_record."value" FROM "bar_topic" LEFT OUTER JOIN "bar_record" filtered_record ON ("bar_topic"."id" = filtered_record."topic_id" AND (filtered_record."user_id" = 1))