SQL: get the first non-null value based on status

I'm not sure the title is accurate, but here is my problem. I have a table like this:

+----+--------+--------------+---------+------------+
| id |  city  |   province   | status  |    date    |
+----+--------+--------------+---------+------------+
|  1 | cainta | rizal        | failed  | 01/01/2020 |
|  1 | null   | null         | success | 02/01/2020 |
|  1 | cainta | rizal        | failed  | 03/01/2020 |
|  2 | pasig  | metro manila | failed  | 04/01/2020 |
|  2 | pasig  | metro manila | failed  | 05/01/2020 |
|  2 | null   | null         | success | 06/01/2020 |
|  3 | obando | bulacan      | failed  | 07/01/2020 |
|  3 | null   | null         | failed  | 08/01/2020 |
|  3 | obando | bulacan      | success | 09/01/2020 |
+----+--------+--------------+---------+------------+

Now I need to get all transactions whose status is "success". If I simply filter on that, the output looks like this:

|   id | city   | province   | status   | date       |
|------|--------|------------|----------|------------|
|    1 | null   | null       | success  | 02/01/2020 |
|    2 | null   | null       | success  | 06/01/2020 |
|    3 | obando | bulacan    | success  | 09/01/2020 |

What I need is:

|   id | city   | province     | status   | date       |
|------|--------|--------------|----------|------------|
|    1 | cainta | rizal        | success  | 02/01/2020 |
|    2 | pasig  | metro manila | success  | 06/01/2020 |
|    3 | obando | bulacan      | success  | 09/01/2020 |

I hope someone can explain how to handle this case.

You can use analytic (window) functions here:

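SELECT *
  FROM (SELECT T.ID, T.CITY, T.PROVINCE,
               -- No ORDER BY in this window: with one, the default frame stops at the
               -- current row, so the RN = 1 row would not see later success dates.
               MAX(CASE WHEN STATUS = 'success' THEN DATE END)
                    OVER (PARTITION BY ID) AS DATE,
               -- RN = 1 marks the earliest row of each id.
               ROW_NUMBER() OVER (PARTITION BY ID ORDER BY DATE) AS RN,
               -- CNT is the number of success rows for the id.
               SUM(CASE WHEN STATUS = 'success' THEN 1 ELSE 0 END)
                    OVER (PARTITION BY ID) AS CNT
          FROM YOUR_TABLE T) S  -- older PostgreSQL versions require an alias here
 WHERE RN = 1 AND CNT > 0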

Now that the sample data has changed, you can use GROUP BY as follows:

SELECT ID, MAX(CITY) AS CITY, MAX(PROVINCE) AS PROVINCE, 
       MAX(CASE WHEN STATUS = 'success' THEN DATE END) AS DATE
  FROM YOUR_TABLE 
GROUP BY ID
HAVING SUM(CASE WHEN STATUS = 'success' THEN 1 END) > 0

Perhaps window functions can help:

SELECT id, city, province, status, date
FROM (SELECT id,
             max(city) OVER w AS city,
             max(province) OVER w AS province,
             status,
             date
      FROM atable
      WINDOW w AS (PARTITION BY id)) AS q
WHERE status = 'success';

Use lag().

Output:

| id  | city   | province     | status  | date       |
| --- | ------ | ------------ | ------- | ---------- |
| 1   | cainta | rizal        | success | 02/01/2020 |
| 2   | pasig  | metro manila | success | 06/01/2020 |
| 3   | obando | bulacan      | success | 09/01/2020 |
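
One way a lag()-based query could produce this result is sketched below. It is an illustration, not necessarily the answerer's exact query. The table name atable is a placeholder borrowed from the previous answer, the sample rows are inlined with VALUES so the snippet runs on its own, and the dates are kept as text exactly as shown in the question (they happen to sort correctly for this sample). The sketch assumes that the row immediately before each success row, ordered by date within the id, carries the city and province.

```sql
-- Sample rows from the question; "atable" is a placeholder name.
WITH atable (id, city, province, status, date) AS (
    VALUES
        (1, 'cainta', 'rizal',        'failed',  '01/01/2020'),
        (1, NULL,     NULL,           'success', '02/01/2020'),
        (1, 'cainta', 'rizal',        'failed',  '03/01/2020'),
        (2, 'pasig',  'metro manila', 'failed',  '04/01/2020'),
        (2, 'pasig',  'metro manila', 'failed',  '05/01/2020'),
        (2, NULL,     NULL,           'success', '06/01/2020'),
        (3, 'obando', 'bulacan',      'failed',  '07/01/2020'),
        (3, NULL,     NULL,           'failed',  '08/01/2020'),
        (3, 'obando', 'bulacan',      'success', '09/01/2020')
)
SELECT id, city, province, status, date
FROM (
    SELECT id,
           -- Keep the row's own value; fall back to the previous row's value
           -- (same id, ordered by date) when it is NULL.
           COALESCE(city, lag(city) OVER w)         AS city,
           COALESCE(province, lag(province) OVER w) AS province,
           status,
           date
    FROM atable
    -- lag() is computed over all rows of the id, before the outer filter.
    WINDOW w AS (PARTITION BY id ORDER BY date)
) AS q
WHERE status = 'success'
ORDER BY id;
```

If the preceding row could also be null, lag() alone is not enough, and the max(...) OVER (PARTITION BY id) form from the answer above is the more forgiving choice.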

If you only need one row per id, you can use aggregation:

select id, max(city) as city, max(province) as province,
       max(date) filter (where status = 'success') as date
from t
group by id
having count(*) filter (where status = 'success') > 0;

Note that if an id can have more than one success date, you can use array_agg().
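
A minimal sketch of the array_agg() variant, assuming PostgreSQL and the table name t from the query above. The inlined rows are hypothetical: the third row of id 1 has been changed to success purely so that one id has two success dates. Dates are again kept as text, as in the question.

```sql
-- Hypothetical sample: id 1 has two success rows.
WITH t (id, city, province, status, date) AS (
    VALUES
        (1, 'cainta', 'rizal', 'failed',  '01/01/2020'),
        (1, NULL,     NULL,    'success', '02/01/2020'),
        (1, 'cainta', 'rizal', 'success', '03/01/2020')
)
SELECT id,
       max(city)     AS city,
       max(province) AS province,
       -- Collect every success date of the id into one array, in date order.
       array_agg(date ORDER BY date)
           FILTER (WHERE status = 'success') AS success_dates
FROM t
GROUP BY id
-- Keep only ids that have at least one success row.
HAVING count(*) FILTER (WHERE status = 'success') > 0;
```

Each id still comes back as a single row; the success dates are gathered into an array column instead of being reduced to the latest one.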


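I added a date filter to the HAVING clause, but the query takes far too long, about 10 rows per second. Can you suggest a way to speed up the query? Thanks.

It returns only one record per ID, with the latest success date.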