How do I load a limited number of items into a collection with SQLAlchemy?

python mysql orm sqlalchemy

I have two tables. Using SQLAlchemy, I map them to two classes:

class A(base):
  ...
  id = Column(BigInteger, primary_key=True, autoincrement=True)

class B(base):
  ...
  id = Column(BigInteger, primary_key=True, autoincrement=True)
  a_id = Column(BigInteger, ForeignKey(A.id))
  timestamp = Column(DateTime)

  a = relationship(A, backref="b_s")

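I can use A.b_s to get the collection of B objects whose foreign key matches A's primary key. That is easy with either lazy loading or eager loading. But now I have a problem: I don't want to load all of the B objects. I only want to load the first N objects ordered by timestamp. In other words, A.b_s should load only some of the related B objects. How can I achieve this with SQLAlchemy?

Thanks a lot.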

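What you want to achieve is not really what a relationship is for (this is not an SA limitation, but the proper way to handle relationships and to take care of referential integrity).
However, a simple query (wrapped in a method) will do the job just fine: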

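class A(Base):
    # ...
    def get_children(self, offset, count):
        # @todo: might need to handle some border cases
        # B.query assumes the model has a query property
        # (e.g. one created with scoped_session.query_property())
        qry = B.query.with_parent(self)
        # or: qry = object_session(self).query(B).with_parent(self)
        #     (object_session comes from sqlalchemy.orm)
        # order explicitly so that "first N" is well defined
        qry = qry.order_by(B.timestamp)
        return qry[offset:offset + count]

my_a = session.query(A).get(a_id)
print(my_a.get_children(0, 10))   # print first 10 children
print(my_a.get_children(10, 10))  # print second 10 children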
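
For readers who want to try the idea end to end, here is a minimal self-contained sketch. It is not the answer's exact code: it assumes SQLAlchemy 1.4 or newer and an in-memory SQLite database, uses Integer keys instead of BigInteger (so that SQLite autoincrements them), and swaps with_parent for a plain filter on the foreign key. The variable names first_page and second_page are made up for the demo.

```python
from datetime import datetime, timedelta

from sqlalchemy import Column, DateTime, ForeignKey, Integer, create_engine
from sqlalchemy.orm import Session, declarative_base, object_session, relationship

Base = declarative_base()


class A(Base):
    __tablename__ = "a"
    id = Column(Integer, primary_key=True)

    def get_children(self, offset, count):
        # Plain foreign-key filter (an assumption of this sketch, used in place
        # of with_parent), ordered so that "first N" is deterministic.
        qry = (
            object_session(self)
            .query(B)
            .filter(B.a_id == self.id)
            .order_by(B.timestamp, B.id)
        )
        # Slicing a Query applies OFFSET/LIMIT and returns a list.
        return qry[offset:offset + count]


class B(Base):
    __tablename__ = "b"
    id = Column(Integer, primary_key=True)
    a_id = Column(Integer, ForeignKey("a.id"))
    timestamp = Column(DateTime)

    a = relationship(A, backref="b_s")


engine = create_engine("sqlite://")  # in-memory database, for the demo only
Base.metadata.create_all(engine)

with Session(engine) as session:
    parent = A()
    start = datetime(2020, 1, 1)
    # Insert 25 children in reverse chronological order to show that the
    # ORDER BY, not the insertion order, decides which rows come first.
    parent.b_s = [
        B(timestamp=start + timedelta(minutes=m)) for m in reversed(range(25))
    ]
    session.add(parent)
    session.commit()

    first_page = [b.timestamp.minute for b in parent.get_children(0, 10)]
    second_page = [b.timestamp.minute for b in parent.get_children(10, 10)]

print(first_page)
print(second_page)
```

Each call issues a single SELECT with LIMIT and OFFSET, and the A.b_s relationship itself is left untouched.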

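edit-1: Doing this with only 1-2 SQL statements
Now, it is definitely possible to achieve this with just 1-2 SQL statements.
First, we need a way to get the identifiers of the top N Bs for each A. For this we will use the sqlalchemy.sql.expression.over function to compose a subquery: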

from sqlalchemy import func, over

# @note: this is the subquery using the *sqlalchemy.sql.expression.over* function to limit number of rows
# this subquery is used for both queries below
# @note: the code below sorts Bs by id, but you can change it in order_by
subq = (session.query(
            B.__table__.c.id.label("b_id"),
            over(func.row_number(), partition_by="a_id", order_by="id").label("rownum")
        ).subquery())
# this produces the following SQL (@note: the RDBMS should support the OVER...)
# >> SELECT b.id AS b_id, row_number() OVER (PARTITION BY a_id ORDER BY id) AS rownum FROM b

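
To see what this subquery returns, the SQL quoted in the comment can be run directly against a small in-memory SQLite table. This is only an illustration with made-up rows, and it assumes the bundled SQLite is version 3.25 or newer, since older versions lack window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE b (id INTEGER PRIMARY KEY, a_id INTEGER)")
# Made-up sample data: three Bs for A #1, two Bs for A #2.
conn.executemany(
    "INSERT INTO b (id, a_id) VALUES (?, ?)",
    [(1, 1), (2, 1), (3, 1), (4, 2), (5, 2)],
)

# Same shape as the subquery: number the Bs within each a_id partition.
rows = conn.execute(
    "SELECT b.id AS b_id, "
    "row_number() OVER (PARTITION BY a_id ORDER BY id) AS rownum "
    "FROM b ORDER BY b.id"
).fetchall()

# Keeping rownum <= N yields the top N Bs per A (here N = 2).
top_2 = [b_id for b_id, rownum in rows if rownum <= 2]

print(rows)
print(top_2)
conn.close()
```

The numbering restarts at 1 for every a_id, which is why a single rownum <= N condition limits the rows per parent rather than overall.
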
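Version-1: Now, the first query will load the As and the second one will load the Bs. The function returns a dictionary with the As as keys and lists of Bs as values: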

from sqlalchemy import and_

def get_A_with_Bs_in_batch(b_limit=10):
    """
    @return: dict(A, [list of top *b_limit* A.b_s])
    @note: uses 2 SQL statements, but does not screw up relationship.
    @note: if the relationship is requested via a_instance.b_s, the new SQL statement will be
        issued to load *all* related objects
    """
    qry_a = session.query(A)
    qry_b = (session.query(B)
             .join(subq, and_(subq.c.b_id == B.id, subq.c.rownum <= b_limit))
             )
    a_s = qry_a.all()
    b_s = qry_b.all()
    res = dict((a, [b for b in b_s if b.a == a]) for a in a_s)
    return res
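
A side note on the last step: the dictionary comprehension scans every B once per A. If that becomes noticeable with many rows, one possible alternative (not part of the original answer) is to index the Bs by their foreign key in a single pass. The sketch below uses hypothetical namedtuple stand-ins, ARow and BRow, in place of mapped instances so that it runs without a database.

```python
from collections import defaultdict, namedtuple

# Hypothetical stand-ins for mapped A and B instances.
ARow = namedtuple("ARow", "id")
BRow = namedtuple("BRow", "id a_id")


def group_children(a_s, b_s):
    """Return {a: [b, ...]} by indexing the Bs on their foreign key once."""
    by_parent = defaultdict(list)
    for b in b_s:
        by_parent[b.a_id].append(b)
    # .get() avoids creating empty entries for parents without children.
    return {a: by_parent.get(a.id, []) for a in a_s}


a_s = [ARow(1), ARow(2), ARow(3)]
b_s = [BRow(10, 1), BRow(11, 1), BRow(12, 2)]
grouped = group_children(a_s, b_s)

for a, children in grouped.items():
    print(a.id, [b.id for b in children])
```

With real mapped objects the same grouping would compare b.a_id with a.id instead of b.a with a.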

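Version-2:

from sqlalchemy import and_, or_
from sqlalchemy.orm import contains_eager

def get_A_with_Bs_hack_relation(b_limit=10):
    """
    @return: dict(A, [list of top *b_limit* A.b_s])
    @note: the Bs are loaded as relationship A.b_s, but with the limit.
    """
    qry = (session.query(A)
           .outerjoin(B)
           # @note: next line will trick SA to load joined Bs as if they were *all* objects
           # of relationship A.b_s. this is a @hack: and one should discard/reset a session after this
           # kind of hacky query!!!
           .options(contains_eager(A.b_s))
           .outerjoin(subq, and_(subq.c.b_id == B.id, subq.c.rownum <= b_limit))
           # @note: next line is required to make both *outerjoins* to play well together
           # in order to produce the right result
           .filter(or_(B.id == None, and_(B.id != None, subq.c.b_id != None)))
           )
    res = dict((a, a.b_s) for a in qry.all())
    return res

To sum up, Version-2 is probably the most direct answer to your question. Use it at your own risk, because here you are tricking SA, and if you modify the relationship property in any way you might experience a "Kaboom!"

Your approach is correct. Thanks. But I would be using multiple queries to get the result. Is it possible to do this work with one SQL statement?

Using this method will generate only one SQL statement to do the job for each call. You can see this if you enable SQL logging by setting echo=True on the engine... or do I not understand your comment?

First, I will get a collection of As. If I don't consider B, I can do that with one simple SQL statement. Now I will load a collection of Bs for each entry of A, so I will issue a new SQL statement for each A entry. Is it possible to load all the data with only one or two SQL statements? From your explanation, it may be hard to do.

@flypen: see the extension of the answer on the 1/2 SQL options.

The answer is very good. For easier maintenance, I prefer the first approach with get_children(). Thanks!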