PostgreSQL: selecting the maximum value from a date range using a window function
I am trying to select the maximum value from within the past two years. For example, if I have a table like the following:
|person_id | pass_or_fail | timestamp|
|----------|--------------|-----------|
|1234 | 1 | 1990-01-01|
|1234 | 0 | 1995-01-01|
|1234 | NULL | 1995-12-12|
|6789 | 0 | 1990-01-01|
|6789 | 0 | 1991-01-01|
|6789 | 1 | 1995-01-01|
|6789 | 1 | 1996-01-01|
|6789 | 0 | 1997-01-01|
|6789 | NULL | 1997-03-03|
I would like to get the following from my query:
person_id |highest_grade_from_past_two_years | pass_or_fail | timestamp
1234 |1 | 1 | 1990-01-01
1234 |0 | 0 | 1995-01-01
1234 |0 | NULL | 1995-12-12
6789 |0 | 0 | 1990-01-01
6789 |0 | 0 | 1991-01-01
6789 |1 | 1 | 1995-01-01
6789 |1 | 1 | 1996-01-01
6789 |1 | 0 | 1997-01-01
6789 |1 | NULL | 1997-03-03
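How can I write a window function that gives this result?
I don't know how to write this query using a window function. Here is a version without one: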
SELECT t1.person_id,
max(t2.pass_or_fail) AS highest_grade_from_past_two_years,
t1.pass_or_fail,
t1.timestamp
FROM t AS t1
JOIN t AS t2 ON (
t1.person_id = t2.person_id
AND t2.timestamp >= t1.timestamp - '2 year'::interval  -- no older than two years
AND t2.timestamp <= t1.timestamp
)
GROUP BY 1, 3, 4
By joining each row with the rows from the previous calendar year (actual year - 1), if any exist:
select p1.person_id, p1.pass_or_fail, p1.timestamp,
case when coalesce(p1.pass_or_fail,0) > coalesce(p2.pass_or_fail,0)
then coalesce(p1.pass_or_fail,0) else coalesce(p2.pass_or_fail,0) end as highest_grade_from_past_two_years
from person p1
left join person p2
on p1.person_id = p2.person_id
and extract(year from p2.timestamp) = extract(year from p1.timestamp) - 1
order by p1.person_id, p1.timestamp;
I don't see an obvious way to do this using window functions. A correlated subquery or a lateral join would work:
select t.*,
(select max(t2.pass_or_fail)
from t t2
where t2.person_id = t.person_id and
t2.timestamp <= t.timestamp and
t2.timestamp >= t.timestamp - interval '2 year'
) as highest_grade_from_past_two_years
from t;
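The answer mentions a lateral join as an alternative but only shows the correlated subquery. A minimal sketch of the lateral form, assuming the same table t and column names as in the question (the alias m is made up for illustration):

```sql
-- Assumes the table t(person_id, pass_or_fail, "timestamp") from the question.
-- The lateral subquery is evaluated once per outer row; because it is an
-- aggregate without GROUP BY it always returns exactly one row, so a
-- CROSS JOIN does not drop any rows of t.
SELECT t.*,
       m.highest_grade_from_past_two_years
FROM t
CROSS JOIN LATERAL (
    SELECT max(t2.pass_or_fail) AS highest_grade_from_past_two_years
    FROM t AS t2
    WHERE t2.person_id = t.person_id
      AND t2."timestamp" <= t."timestamp"
      AND t2."timestamp" >= t."timestamp" - INTERVAL '2 years'
) AS m
ORDER BY t.person_id, t."timestamp";
```

It should return the same rows as the correlated subquery; the lateral form mainly pays off when several aggregates over the same two-year range are needed at once.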
I think the closest you can get with a window function is a RANGE frame specification. However, Postgres does not support RANGE with an offset PRECEDING (this was the case before PostgreSQL 11, which added it).
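On PostgreSQL 11 or later, a RANGE frame with an interval offset can be written directly, so the window-function form the question asks for becomes possible. The sketch below is an assumption-laden illustration, not part of the original answer: it rebuilds the sample data from the question in a CTE named t so it can be run on its own, and it assumes the timestamp column is of type date.

```sql
-- Requires PostgreSQL 11+ (RANGE with an offset PRECEDING).
-- The CTE only reproduces the sample rows from the question.
WITH t (person_id, pass_or_fail, "timestamp") AS (
    VALUES
        (1234, 1,    DATE '1990-01-01'),
        (1234, 0,    DATE '1995-01-01'),
        (1234, NULL, DATE '1995-12-12'),
        (6789, 0,    DATE '1990-01-01'),
        (6789, 0,    DATE '1991-01-01'),
        (6789, 1,    DATE '1995-01-01'),
        (6789, 1,    DATE '1996-01-01'),
        (6789, 0,    DATE '1997-01-01'),
        (6789, NULL, DATE '1997-03-03')
)
SELECT person_id,
       -- Frame: all rows of the same person whose date lies between
       -- (current date - 2 years) and the current date, both inclusive.
       -- max() ignores NULLs.
       max(pass_or_fail) OVER (
           PARTITION BY person_id
           ORDER BY "timestamp"
           RANGE BETWEEN INTERVAL '2 years' PRECEDING AND CURRENT ROW
       ) AS highest_grade_from_past_two_years,
       pass_or_fail,
       "timestamp"
FROM t
ORDER BY person_id, "timestamp";
```

On the sample data this should reproduce the desired result table from the question. On versions before 11 the frame clause is rejected, and the correlated subquery or lateral join remains the way to go.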
The second table above is what I want the query result to look like. I am only working with one table.
Both tables are the same one; I join it to itself to avoid using a subquery in the select statement, because that runs separately for each row.
Putting a subquery in the select is not great: it runs separately for each row.
@RomanTkachuk. I don't think you understand SQL as well as I do. A correlated subquery is fine here. It will run faster if there is an index on t(person_id, timestamp, pass_or_fail).
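The index suggested in the last comment could be created roughly as follows. The index name is made up for illustration; the column order follows the comment, with the equality column (person_id) first and the range column (timestamp) second.

```sql
-- Hypothetical index name; the columns are the ones named in the comment.
-- With pass_or_fail included, the per-row lookup may be answerable
-- from the index alone (an index-only scan), depending on the planner.
CREATE INDEX IF NOT EXISTS t_person_id_timestamp_pass_or_fail_idx
    ON t (person_id, "timestamp", pass_or_fail);
```

Whether the planner actually uses it depends on table size and statistics; running EXPLAIN on the query is the way to check.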