How to compare two consecutive row values in a ResultSet object using Python
Tags: python, postgresql, sqlalchemy

I have a table with issue logs: and rt_status: For the date range from_datetime='2018-09-06T16:34' to to_datetime='2018-09-14T12:27' I want to select all issues that stayed in a status longer than the duration (duration_in_min) defined for that status in the rt_status table. I should get the records with ids 29, 27 and 26 from the issue logs. For the records with ids 29 and 26, the time between their last up_date and to_datetime should be taken into account.

I want to use func.lag and over to do this, but cannot get the correct records. I am using PostgreSQL 9.6 and Python 2.7. Using only SQLAlchemy Core, how can I get func.lag or func.lead to work correctly? What I tried is below.
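For reference, here is a minimal sketch of the table layout implied by the question and the queries below; the column types (and any column not named in the queries) are assumptions:

```python
from sqlalchemy import (MetaData, Table, Column, Integer, String,
                        Boolean, DateTime, ForeignKey)

metadata = MetaData()

# Each status defines how long an issue may stay in it (duration_in_min).
rt_status = Table(
    'rt_status', metadata,
    Column('id', Integer, primary_key=True),
    Column('duration_in_min', Integer),
    Column('is_last_status', Boolean),
)

rt_issues = Table(
    'rt_issues', metadata,
    Column('id', Integer, primary_key=True),
    Column('title', String),
)

# One row per status transition of an issue; up_date is the transition time.
issue_status_logs = Table(
    'issue_status_logs', metadata,
    Column('id', Integer, primary_key=True),
    Column('issue_id', Integer, ForeignKey('rt_issues.id')),
    Column('from_status', Integer, ForeignKey('rt_status.id')),
    Column('to_status', Integer, ForeignKey('rt_status.id')),
    Column('up_date', DateTime),
    Column('user_id', Integer),
)
```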
s = select([
    rt_issues.c.id.label('rtissue_id'),
    rt_issues,
    rt_status.c.duration_in_min,
    rt_status.c.id.label('stage_id'),
    issue_status_logs.c.id.label('issue_log_id'),
    issue_status_logs.c.up_date.label('iss_log_update'),
    (issue_status_logs.c.up_date - func.lag(
        issue_status_logs.c.up_date).over(
            issue_status_logs.c.issue_id)).label('mdiff'),
]).\
    where(and_(*conditions)).\
    select_from(
        rt_issues.
        outerjoin(issue_status_logs,
                  rt_issues.c.id == issue_status_logs.c.issue_id).
        outerjoin(rt_status,
                  issue_status_logs.c.to_status == rt_status.c.id)).\
    order_by(asc(issue_status_logs.c.up_date),
             issue_status_logs.c.issue_id).\
    group_by(issue_status_logs.c.issue_id,
             rt_issues.c.id,
             issue_status_logs.c.id)

rs = g.conn.execute(s)
mcnt = rs.rowcount
print mcnt, 'rowcount'
if rs.rowcount > 0:
    for r in rs:
        print dict(r)
This produces a result set that contains a wrong record, namely the issue log with id 28. Can anyone help correct the mistake?

Although you solved the problem yourself, here is an approach that does not use the window functions lag or lead. To compare the differences between the latest timestamps of consecutive issue logs, you can self join. In SQL the query could look like this:

SELECT ilx.id
FROM issue_status_logs ilx
JOIN rt_status rsx ON rsx.id = ilx.to_status
LEFT JOIN issue_status_logs ily
       ON ily.from_status = ilx.to_status
      AND ily.issue_id = ilx.issue_id
WHERE ilx.up_date >= '2018-09-06T16:34'
  AND ilx.up_date <= coalesce(ily.up_date, '2018-09-14T12:27')
                     - interval '1 minute' * rsx.duration_in_min

My solution with the modified SQLAlchemy expression language:
s = select([
    rt_issues.c.id.label('rtissue_id'),
    rt_issues.c.title,
    rt_status.c.duration_in_min,
    rt_status.c.is_last_status,
    rt_status.c.id.label('stage_id'),
    issue_status_logs.c.id.label('issue_log_id'),
    issue_status_logs.c.up_date.label('iss_log_update'),
    # Order the window by up_date as well, so that lag() is
    # deterministic (see the comments below).
    (issue_status_logs.c.up_date - func.lag(
        issue_status_logs.c.up_date).over(
            issue_status_logs.c.issue_id,
            issue_status_logs.c.up_date)).label('mdiff'),
    (func.lead(
        issue_status_logs.c.issue_id).over(
            issue_status_logs.c.issue_id)).label('next_id'),
    (func.lead(
        issue_status_logs.c.up_date).over(
            issue_status_logs.c.issue_id,
            issue_status_logs.c.up_date)).label('prev_up_date'),
    issue_status_logs.c.user_id,
    (users.c.first_name + ' ' + users.c.last_name).
        label('updated_by_user'),
]).\
    where(and_(*conditions)).\
    select_from(
        rt_issues.
        outerjoin(issue_status_logs,
                  rt_issues.c.id == issue_status_logs.c.issue_id).
        outerjoin(users, issue_status_logs.c.user_id == users.c.id).
        outerjoin(rt_status,
                  issue_status_logs.c.to_status == rt_status.c.id)).\
    order_by(issue_status_logs.c.issue_id,
             asc(issue_status_logs.c.up_date)).\
    group_by(issue_status_logs.c.issue_id,
             rt_issues.c.id,
             issue_status_logs.c.id,
             rt_status.c.id,
             users.c.id)

rs = g.conn.execute(s)
if rs.rowcount > 0:
    for r in rs:
        # IMPT: for an issue with no last status
        if not r[rt_status.c.is_last_status]:
            if not r['mdiff'] and not r['next_id']:
                # mto_dt: to_datetime as a datetime (defined elsewhere)
                n = mto_dt - r['iss_log_update'].replace(tzinfo=None)
            elif not r['mdiff'] and r['next_id'] == r['rtissue_id']:
                n = r['prev_up_date'] - r['iss_log_update']
            else:
                n = r['mdiff']
            n = n.total_seconds() / 60
            if n > r[rt_status.c.duration_in_min]:
                mx = dict(r)
                q_user_wise_pendency_list.append(mx)

for t in q_user_wise_pendency_list:
    if t not in temp_list:
        temp_list.append(t)
q_user_wise_pendency_list = temp_list
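A note on the final dedup loop: each row is converted with dict(), and dicts are unhashable, so they cannot go into a set; membership testing against a list uses dict equality instead. A minimal reproduction of the idiom, with hypothetical rows:

```python
# Hypothetical duplicate result rows (real rows would have more keys).
rows = [{'id': 29}, {'id': 26}, {'id': 29}]

temp_list = []
for t in rows:
    if t not in temp_list:   # dict equality works; hashing would not
        temp_list.append(t)

print(temp_list)  # [{'id': 29}, {'id': 26}]
```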
Comments:

OP: @IljaEverilä I have reduced the code, please check if you can help.
IljaEverilä: The status on row 24 seems to go backwards. Is that intentional or a mistake?
OP: @IljaEverilä I may have only been looking at the issues with ids 20 and 19 and ignored 24 and 25. Sorry, my mistake. I have finally corrected the problem and will post the solution soon.
IljaEverilä: My mistake too: I did not notice that records 24 and 25 were filtered out by your from_datetime.
OP: Thank you for taking the trouble to provide the solution above. I have posted my version of the solution below, but do not know how to undelete it.
IljaEverilä: Your previous answer was deleted by a moderator and can only be undeleted by a moderator. If you want to post your own solution, just create another answer, if possible. Also, this solution has some nondeterministic behaviour, I think: for example func.lag(issue_status_logs.c.up_date).over(issue_status_logs.c.issue_id) does not define an ordering, so as far as I know it is unspecified which row the lagged value comes from. It seems to work now, but you cannot trust it to give the same answer even in consecutive queries.
OP: @IljaEverilä, I think you are right. Would changing the above to func.lag(issue_status_logs.c.up_date).over(issue_status_logs.c.issue_id, issue_status_logs.c.up_date) solve the problem?
IljaEverilä: That would produce deterministic results, at least as far as I know.
OP: @IljaEverilä I will let you know when I run it with larger data. Thanks for pointing out the shortcoming.
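The ordering fix discussed in the comments can be sketched without a database. Using SQLAlchemy's keyword spellings of the arguments (partition_by, order_by, equivalent to passing them positionally to over()):

```python
from sqlalchemy import column, func

up_date = column('up_date')
issue_id = column('issue_id')

# partition_by alone leaves the row order inside each partition, and
# hence the lagged value, unspecified; order_by pins it down.
ordered_lag = func.lag(up_date).over(partition_by=issue_id,
                                     order_by=up_date)

# Compiling with the default dialect shows both window clauses.
print(str(ordered_lag))
```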
The self-join query from the answer above, expressed in SQLAlchemy Core:

from_datetime = '2018-09-06T16:34'
to_datetime = '2018-09-14T12:27'

ilx = issue_status_logs.alias()
ily = issue_status_logs.alias()
rsx = rt_status

query = select([ilx.c.id]).\
    select_from(
        ilx.
        join(rsx, rsx.c.id == ilx.c.to_status).
        outerjoin(ily, and_(ily.c.from_status == ilx.c.to_status,
                            ily.c.issue_id == ilx.c.issue_id))).\
    where(and_(ilx.c.up_date >= from_datetime,
               ilx.c.up_date <= (func.coalesce(ily.c.up_date, to_datetime) -
                                 cast('1 minute', Interval) *
                                 rsx.c.duration_in_min)))
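The predicate compares up_date against a deadline computed as COALESCE(next_log.up_date, to_datetime) minus duration_in_min minutes. The same arithmetic in plain Python, as a sanity check (the duration value is a hypothetical example):

```python
from datetime import datetime, timedelta

to_datetime = datetime(2018, 9, 14, 12, 27)  # range end from the question
duration_in_min = 30                         # hypothetical status duration

next_up_date = None  # no later log row, so COALESCE falls back to to_datetime
deadline = (next_up_date or to_datetime) - timedelta(minutes=duration_in_min)

# The issue exceeded the allowed duration if it entered the status
# on or before the deadline:
up_date = datetime(2018, 9, 6, 16, 40)
exceeded = up_date <= deadline
print(exceeded)  # True
```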