How to limit N results per 'group_by' in SQLAlchemy/Postgres?
This is my SQLAlchemy query code:
medium_contact_id_subq = (g.session.query(distinct(func.unnest(FUContact.medium_contact_id_lis))).filter(FUContact._id.in_(contact_id_lis))).subquery()
q = (g.session.query(FUMessage)
     .filter(FUMessage.fu_medium_contact_id.in_(medium_contact_id_subq))
     .order_by(desc(FUMessage.timestamp_utc))
     )
I want to limit the messages to N results per medium_contact_id group.
As a workaround, this is my current ugly and unoptimized code:
medium_contact_id_lis = (g.session.query(distinct(func.unnest(FUContact.medium_contact_id_lis))).filter(FUContact._id.in_(contact_id_lis))).all()
q = None
for medium_contact_id_tup in medium_contact_id_lis:
    medium_contact_id = medium_contact_id_tup[0]
    if q is None:
        q = (g.session.query(FUMessage)
             .filter(FUMessage.fu_medium_contact_id == medium_contact_id)
             .limit(MESSAGE_LIMIT)
             )
    else:
        subq = (g.session.query(FUMessage)
                .filter(FUMessage.fu_medium_contact_id == medium_contact_id)
                .limit(MESSAGE_LIMIT)
                )
        q = q.union(subq)
q = q.order_by(desc(FUMessage.timestamp_utc))
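One way to get the top N rows per group is to use a window function such as rank() or row_number() in a subselect with the required partitioning and ordering, and then filter on that number in the enclosing select. For N = 1 you could use the DISTINCT ON ... ORDER BY combination in PostgreSQL.
Applying this to SQLAlchemy is straightforward, using the over() method of a function element to produce the window expression: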
from sqlalchemy import func
from sqlalchemy.orm import aliased

medium_contact_id_subq = g.session.query(
        func.unnest(FUContact.medium_contact_id_lis).distinct()).\
    filter(FUContact._id.in_(contact_id_lis)).\
    subquery()

# Perform required filtering in the subquery. Choose a suitable ordering,
# or you'll get indeterminate results
# (e.g. FUMessage.timestamp_utc.desc() to keep the most recent rows).
subq = g.session.query(
        FUMessage,
        func.row_number().over(
            partition_by=FUMessage.fu_medium_contact_id,
            order_by=FUMessage.timestamp_utc).label('n')).\
    filter(FUMessage.fu_medium_contact_id.in_(medium_contact_id_subq)).\
    subquery()

fumessage_alias = aliased(FUMessage, subq)

# row_number() counts up from 1, so include rows with a row num
# less than or equal to limit
q = g.session.query(fumessage_alias).\
    filter(subq.c.n <= MESSAGE_LIMIT)
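To see the same row_number() idea in isolation, here is a minimal sketch in plain SQL run through Python's built-in sqlite3 module instead of Postgres. The table name fu_message, the sample rows, and MESSAGE_LIMIT = 2 are made up for illustration, and it assumes the bundled SQLite is 3.25 or newer (the first version with window functions). It orders by timestamp_utc descending on the assumption that the most recent messages per contact are wanted.

```python
import sqlite3

MESSAGE_LIMIT = 2  # hypothetical N, chosen only for this demo

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the FUMessage table.
conn.execute(
    "CREATE TABLE fu_message ("
    "id INTEGER PRIMARY KEY, "
    "fu_medium_contact_id INTEGER, "
    "timestamp_utc INTEGER)"
)
rows = [
    (1, 10, 100), (2, 10, 200), (3, 10, 300),
    (4, 20, 150), (5, 20, 250), (6, 20, 350), (7, 20, 450),
    (8, 30, 500),
]
conn.executemany("INSERT INTO fu_message VALUES (?, ?, ?)", rows)

# Number the rows inside each contact group (newest first) in a subselect,
# then keep only the rows whose number is within the limit.
top_n_sql = """
SELECT id, fu_medium_contact_id, timestamp_utc
FROM (
    SELECT m.*,
           row_number() OVER (
               PARTITION BY fu_medium_contact_id
               ORDER BY timestamp_utc DESC
           ) AS n
    FROM fu_message AS m
) AS ranked
WHERE n <= ?
ORDER BY fu_medium_contact_id, timestamp_utc DESC
"""
result = conn.execute(top_n_sql, (MESSAGE_LIMIT,)).fetchall()
for row in result:
    print(row)
```

Each contact group contributes at most MESSAGE_LIMIT rows, and a group with fewer rows (contact 30 here) simply returns what it has. Because the ordering sits inside the OVER clause, the rows kept per group are well defined as long as the timestamps within a group are distinct.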
Since the subqueries that make up the union are not ordered before the limit is applied, the results are indeterminate.