How to compare two consecutive row values in a resultset object using Python

Tags: python, postgresql, sqlalchemy

I have a table with issue logs:

and rt_status:

For the date range from_datetime='2018-09-06T16:34' to to_datetime='2018-09-14T12:27', I want to select all issues that have exceeded the duration set by duration_in_min for each status defined in the rt_status table. From the issue logs I should get the records with IDs 29, 27 and 26. For the records with IDs 29 and 26, the time between their last up_date and to_datetime should be considered.

I want to use func.lag and over to do this, but cannot get the correct records. I am using PostgreSQL 9.6 and Python 2.7. How can I get func.lag or func.lead working using only SQLAlchemy Core?

What I tried:

    s = select([
            rt_issues.c.id.label('rtissue_id'),
            rt_issues,
            rt_status.c.duration_in_min,
            rt_status.c.id.label('stage_id'),
            issue_status_logs.c.id.label('issue_log_id'),
            issue_status_logs.c.up_date.label('iss_log_update'),
            (issue_status_logs.c.up_date - func.lag(
                    issue_status_logs.c.up_date).over(
                    issue_status_logs.c.issue_id
                    )).label('mdiff'),
            ]).\
    where(and_(*conditions)).\
    select_from(rt_issues.
    outerjoin(issue_status_logs,
              rt_issues.c.id == issue_status_logs.c.issue_id).
    outerjoin(rt_status,
              issue_status_logs.c.to_status == rt_status.c.id)).\
    order_by(asc(issue_status_logs.c.up_date),
                  issue_status_logs.c.issue_id).\
    group_by(
             issue_status_logs.c.issue_id,
             rt_issues.c.id,
             issue_status_logs.c.id
             )
    rs = g.conn.execute(s)
    mcnt =  rs.rowcount
    print mcnt, 'rowcount'
    if rs.rowcount > 0:
        for r in rs:
            print dict(r)

This produces results containing wrong records, i.e. the issue log with id 28. Can anyone help correct the mistake?
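For reference, the comparison I am after, done purely on the Python side over a result set already ordered by issue_id and up_date, would look roughly like this (the rows here are made-up dicts standing in for the resultset, not my real data):

```python
from datetime import datetime

# Made-up rows, ordered by issue_id then up_date; column names
# match issue_status_logs above.
rows = [
    {"issue_id": 26, "up_date": datetime(2018, 9, 7, 10, 0)},
    {"issue_id": 26, "up_date": datetime(2018, 9, 7, 12, 30)},
    {"issue_id": 27, "up_date": datetime(2018, 9, 8, 9, 0)},
]

def consecutive_diffs(rows):
    """Pair each row with the previous row of the same issue, like lag()."""
    diffs = []
    prev = None
    for row in rows:
        if prev is not None and prev["issue_id"] == row["issue_id"]:
            diffs.append((row["issue_id"], row["up_date"] - prev["up_date"]))
        prev = row
    return diffs

print(consecutive_diffs(rows))
```

This only works if the ordering is guaranteed by the query, which is exactly what I am trying to push into the database with lag/lead.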

Although you solved the problem yourself, here is an approach that does not use window functions, i.e. lag or lead. To compare the differences between the up_date timestamps of consecutive issue logs, you can self join. In SQL the query could look like this:

    SELECT ilx.id
    FROM issue_status_logs ilx
    JOIN rt_status rsx ON rsx.id = ilx.to_status
    LEFT JOIN issue_status_logs ily
           ON ily.from_status = ilx.to_status
          AND ily.issue_id = ilx.issue_id
    WHERE ilx.up_date >= '2018-09-06T16:34'
      AND ilx.up_date <= coalesce(ily.up_date, '2018-09-14T12:27') -
                         interval '1 minute' * rsx.duration_in_min

My solution with the modified SQLAlchemy expression language:

s = select([
        rt_issues.c.id.label('rtissue_id'),
        rt_issues.c.title,
        rt_status.c.duration_in_min,
        rt_status.c.is_last_status,
        rt_status.c.id.label('stage_id'),
        issue_status_logs.c.id.label('issue_log_id'),
        issue_status_logs.c.up_date.label('iss_log_update'),
        (issue_status_logs.c.up_date - func.lag(
                issue_status_logs.c.up_date).over(
                issue_status_logs.c.issue_id)).
        label('mdiff'),
        (func.lead(
                issue_status_logs.c.issue_id).over(
                issue_status_logs.c.issue_id
                )).label('next_id'),
        (func.lead(
                issue_status_logs.c.up_date).over(
                issue_status_logs.c.issue_id,
                issue_status_logs.c.up_date,
                )).label('prev_up_date'),
        issue_status_logs.c.user_id,
        (users.c.first_name + ' ' + users.c.last_name).
        label('updated_by_user'),
        ]).\
    where(and_(*conditions)).\
    select_from(rt_issues.
    outerjoin(issue_status_logs,
              rt_issues.c.id == issue_status_logs.c.issue_id).
    outerjoin(users, issue_status_logs.c.user_id == users.c.id).
    outerjoin(rt_status,
              issue_status_logs.c.to_status == rt_status.c.id)).\
    order_by(issue_status_logs.c.issue_id,
             asc(issue_status_logs.c.up_date)).\
    group_by(
             issue_status_logs.c.issue_id,
             rt_issues.c.id,
             issue_status_logs.c.id,
             rt_status.c.id,
             users.c.id
             )
rs = g.conn.execute(s)
if rs.rowcount > 0:
    for r in rs:
        # IMPT: For issue with no last status
        if not r[rt_status.c.is_last_status]:
            if not r['mdiff'] and (not r['next_id']):
                n = (mto_dt - r['iss_log_update'].replace(tzinfo=None))
            elif ((not r['mdiff']) and
                  (r['next_id'] == r['rtissue_id'])):
                n = (r['prev_up_date'] - r['iss_log_update'])
            else:
                n = (r['mdiff'])
            n =  (n.total_seconds()/60)
            if n > r[rt_status.c.duration_in_min]:
                mx = dict(r)
                q_user_wise_pendency_list.append(mx)

    for t in q_user_wise_pendency_list:
        if not t in temp_list:
            temp_list.append(t)
    q_user_wise_pendency_list = temp_list

Comments:

- I have reduced the code, please check if you can help.
- The status in row 24 seems to go backwards. Is that intentional or a mistake?
- @IljaEverilä For rows 24 and 25 I was probably only looking at the issues with ids 20 and 19 and ignored 24 and 25. Sorry, my mistake. I have finally corrected it and will post the solution soon.
- Also my mistake: I did not notice that records 24 and 25 were filtered out by your from_datetime.
- Thank you for taking the trouble to provide the above solution. I have posted my version of the solution below, but do not know how to undelete it.
- Your previous answer was deleted by a moderator because it was not an actual answer, and it can only be undeleted by a moderator. If you want to post your own solution, just create another answer, if possible.
- This solution has some nondeterministic behaviour, I think: for example func.lag(issue_status_logs.c.up_date).over(issue_status_logs.c.issue_id) does not define an ordering, so as far as I can tell it is unspecified which row the lagged value comes from. It may seem to work now, but you cannot trust it to give the same answer even between consecutive queries.
- @IljaEverilä I think you are right. Would changing the above to func.lag(issue_status_logs.c.up_date).over(issue_status_logs.c.issue_id, issue_status_logs.c.up_date) solve the problem?
- That would produce deterministic results, at least as far as I can tell.
- @IljaEverilä I will let you know when I have used it on larger data. Thank you for pointing out the flaw.
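A quick way to convince yourself of the ordering fix discussed in the comments is to emulate lag(up_date) OVER (PARTITION BY issue_id ORDER BY up_date) in plain Python: once the rows are explicitly sorted, the lagged value is well defined no matter what order the database happened to return them in. A minimal sketch with made-up rows:

```python
from datetime import datetime
from itertools import groupby
from operator import itemgetter

# Made-up rows, deliberately out of order, as a query without
# ORDER BY is free to return them.
rows = [
    {"issue_id": 26, "up_date": datetime(2018, 9, 7, 12, 30)},
    {"issue_id": 27, "up_date": datetime(2018, 9, 8, 9, 0)},
    {"issue_id": 26, "up_date": datetime(2018, 9, 7, 10, 0)},
]

def mdiff(rows):
    """Emulate up_date - lag(up_date) OVER (PARTITION BY issue_id
    ORDER BY up_date): sort first, then diff within each issue."""
    result = []
    ordered = sorted(rows, key=itemgetter("issue_id", "up_date"))
    for _, group in groupby(ordered, key=itemgetter("issue_id")):
        prev = None
        for row in group:
            diff = row["up_date"] - prev if prev is not None else None
            result.append((row["issue_id"], row["up_date"], diff))
            prev = row["up_date"]
    return result

for item in mdiff(rows):
    print(item)
```

Without the explicit sort (i.e. without the ORDER BY in the window definition), the diff for issue 26 could come out positive, negative, or pair up differently from one run to the next.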
In SQLAlchemy the same self join becomes:

from sqlalchemy import Interval, and_, cast, func, select

from_datetime = '2018-09-06T16:34'
to_datetime = '2018-09-14T12:27'

ilx = issue_status_logs.alias()
ily = issue_status_logs.alias()
rsx = rt_status

query = select([ilx.c.id]).\
    select_from(
        ilx.
        join(rsx, rsx.c.id == ilx.c.to_status).
        outerjoin(ily, and_(ily.c.from_status == ilx.c.to_status,
                            ily.c.issue_id == ilx.c.issue_id))).\
    where(and_(ilx.c.up_date >= from_datetime,
               ilx.c.up_date <= (func.coalesce(ily.c.up_date, to_datetime) -
                                 cast('1 minute', Interval) *
                                 rsx.c.duration_in_min)))
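The interval arithmetic in that WHERE clause, coalesce(ily.up_date, to_datetime) - interval '1 minute' * rsx.duration_in_min, expresses "the log stayed in its status longer than allowed". The same predicate in plain Python, with illustrative values (the function name and the defaults are mine, not part of the query):

```python
from datetime import datetime, timedelta

def exceeded(up_date, next_up_date, duration_in_min,
             to_datetime=datetime(2018, 9, 14, 12, 27)):
    """True if the time spent in this status exceeds its allowed duration.

    next_up_date is the up_date of the follow-up log (ily), or None when
    the issue never left the status; then to_datetime caps the interval,
    exactly like coalesce(ily.up_date, :to_datetime) in the query above.
    """
    end = next_up_date if next_up_date is not None else to_datetime
    return up_date <= end - timedelta(minutes=duration_in_min)

# A log updated 2018-09-10 09:00 with a 60-minute allowance, never moved on:
print(exceeded(datetime(2018, 9, 10, 9, 0), None, 60))  # -> True
```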