Python Django & Postgres: percentile (median) and group by
Tags: python, django, postgresql, statistics, subquery
I need to calculate the median period for each seller ID (see the simplified model below). The problem is that I am unable to construct the ORM query.
Model
Query
What I think I need to do is something along the following lines:
select t.*, p_25, p_75
from t join
(select district,
percentile_cont(0.25) within group (order by sales) as p_25,
percentile_cont(0.75) within group (order by sales) as p_75
from t
group by district
) td
on t.district = td.district
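To make clear what the subquery above computes, here is a minimal pure-Python sketch of a per-group continuous percentile. It assumes the usual linear-interpolation behaviour of percentile_cont; the helper function, the sample rows, and the district names are hypothetical and only serve as illustration.

```python
from collections import defaultdict
from math import floor


def percentile_cont(values, fraction):
    """Interpolated percentile over a list of numbers (hypothetical helper)."""
    ordered = sorted(values)
    if not ordered:
        return None
    # Fractional position of the requested percentile in the sorted list.
    position = fraction * (len(ordered) - 1)
    lower = floor(position)
    upper = min(lower + 1, len(ordered) - 1)
    weight = position - lower
    # Interpolate linearly between the two neighbouring values.
    return ordered[lower] + (ordered[upper] - ordered[lower]) * weight


# Made-up (district, sales) rows standing in for table t.
rows = [
    ("north", 10), ("north", 20), ("north", 30), ("north", 40),
    ("south", 5), ("south", 15),
]

# GROUP BY district
sales_by_district = defaultdict(list)
for district, sales in rows:
    sales_by_district[district].append(sales)

# One (p_25, p_75) pair per district, like the td subquery.
quartiles = {
    district: (percentile_cont(sales, 0.25), percentile_cont(sales, 0.75))
    for district, sales in sales_by_district.items()
}
```

Joining these per-district values back onto every row is what the outer select does; calling the helper with a fraction of 0.5 gives the median.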
Here is what did the trick with Python 3.7.5, Django 2.2.8 and Postgres 11.1:
from django.contrib.postgres.fields.jsonb import KeyTextTransform
from django.db.models import F, Func, IntegerField
from django.db.models.aggregates import Aggregate
from django.db.models.functions import Cast

queryset = (
    MyModel.objects.filter(period=25)
    # Read the "duration" key of the "aux" field as text, then cast it to an integer.
    .annotate(duration=Cast(KeyTextTransform("duration", "aux"), IntegerField()))
    .filter(duration__isnull=False)
    # Expand seller_ids into one row per seller ID.
    .annotate(seller_id=Func(F("seller_ids"), function="unnest"))
    .values("seller_id")  # group by
    .annotate(
        median=Aggregate(
            F("duration"),
            function="percentile_cont",
            template="%(function)s(0.5) WITHIN GROUP (ORDER BY %(expressions)s)",
        ),
    )
)
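Note the use of Aggregate rather than Func as in the question.
Also, and very importantly, .values("seller_id") has to come before the aggregating .annotate(): that order is what produces the GROUP BY.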
By the way, the generated SQL has no nested select or join.
You can create a Median subclass of the Aggregate class, as Ryan Murphy did. Median then works just like Avg:
from django.db.models import Aggregate, FloatField

class Median(Aggregate):
    function = 'PERCENTILE_CONT'
    name = 'median'
    output_field = FloatField()
    template = '%(function)s(0.5) WITHIN GROUP (ORDER BY %(expressions)s)'
Then, to find the median of a field, use
my_model_aggregate = MyModel.objects.all().aggregate(Median('period'))
and read the result as
my_model_aggregate['period__median']
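As a rough illustration of how the template attribute turns into SQL, the sketch below fills the same %-style placeholders by hand. It is a simplification, not Django's actual compilation path; the render_aggregate helper and the "myapp_mymodel" table name are hypothetical.

```python
# The same template string used by the Median aggregate.
TEMPLATE = "%(function)s(0.5) WITHIN GROUP (ORDER BY %(expressions)s)"


def render_aggregate(function, expressions):
    """Fill the template placeholders (hypothetical helper for illustration)."""
    return TEMPLATE % {
        "function": function,
        # Already-compiled column references are joined with commas.
        "expressions": ", ".join(expressions),
    }


# "myapp_mymodel" is an assumed table name for MyModel.
median_sql = render_aggregate("PERCENTILE_CONT", ['"myapp_mymodel"."period"'])
print(median_sql)
```

The 0.5 is hard-coded in the template, which is why this aggregate always yields the median; a different percentile would need a different template or an extra placeholder.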
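To clarify, are you using Django with SQL Server?
@ivissani There is a postgresql tag below the question, so no...
Sorry, what error are you getting? And what is your question? What is wrong with the query you posted? Are you trying to use the ORM or?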