SQL: select only the top row from each group
I have a problem with the following table. I need to select only the users whose row with the greatest USCO_DFR also has USCO_AHO = '1'. So, from the example below:
+----------+-------------------------+----------+
| USER_KEY | USCO_DFR | USCO_AHO |
+----------+-------------------------+----------+
| 1 | 2018-06-01 00:00:00.000 | NULL |
| 1 | 2018-03-05 00:00:00.000 | 1 |
| 1 | 2018-02-10 00:00:00.000 | NULL |
| 2 | 2018-07-10 00:00:00.000 | 1 |
| 2 | 2018-04-05 00:00:00.000 | NULL |
| 2 | 2018-01-15 00:00:00.000 | NULL |
| 3 | 2018-09-10 00:00:00.000 | 1 |
| 3 | 2018-05-05 00:00:00.000 | NULL |
| 3 | 2018-04-15 00:00:00.000 | NULL |
+----------+-------------------------+----------+
Only USER_KEY = 2 and 3 should be selected.
Expected output:
+----------+-------------------------+----------+
| USER_KEY | USCO_DFR | USCO_AHO |
+----------+-------------------------+----------+
| 2 | 2018-07-10 00:00:00.000 | 1 |
| 3 | 2018-09-10 00:00:00.000 | 1 |
+----------+-------------------------+----------+
This query sorts the result:
SELECT * FROM @BAUSCO ORDER BY USER_KEY, USCO_DFR DESC
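But I don't know how to select those user keys from the result. Basically, I have to take only the top row from each set, and that row must satisfy the condition USCO_AHO = '1'.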
DECLARE @BAUSCO TABLE
(
USER_KEY INT,
USCO_DFR DATETIME,
USCO_AHO CHAR(1)
)
INSERT @BAUSCO(USER_KEY, USCO_DFR, USCO_AHO)
VALUES (1, '2018-02-10', NULL),
(1, '2018-03-05', '1'),
(1, '2018-06-01', NULL),
(2, '2018-01-15', NULL),
(2, '2018-04-05', NULL),
(2, '2018-07-10', '1'),
(3, '2018-04-15', NULL),
(3, '2018-05-05', NULL),
(3, '2018-09-10', '1')
We can use ROW_NUMBER here to target the record with the max USCO_DFR for each user:
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY USER_KEY ORDER BY USCO_DFR DESC) rn
FROM @BAUSCO
)
SELECT USER_KEY, USCO_DFR, USCO_AHO
FROM cte
WHERE USCO_AHO = '1' AND rn = 1;
This selects every user whose record with the max USCO_DFR value is also the record whose USCO_AHO value is 1.
Try this:
SELECT A.* FROM @BAUSCO A INNER JOIN
(
SELECT USER_KEY, Max(USCO_DFR) MaxUSCO_DFR
FROM @BAUSCO
GROUP BY USER_KEY
) B
ON A.USER_KEY = B.USER_KEY AND A.USCO_DFR = B.MaxUSCO_DFR
WHERE A.USCO_AHO = '1'
You can use ROW_NUMBER together with a CTE:
;with cte as
(
select ROW_NUMBER() over (partition by USER_KEY order by USCO_DFR desc) AS ROWNUM,*
from
@BAUSCO
)
select USER_KEY, USCO_DFR, USCO_AHO from cte where ROWNUM = 1 and USCO_AHO = '1'
The following query should do what you want:
DECLARE @BAUSCO TABLE
(
USER_KEY INT,
USCO_DFR DATETIME,
USCO_AHO CHAR(1)
)
INSERT @BAUSCO(USER_KEY, USCO_DFR, USCO_AHO)
VALUES (1, '2018-02-10', NULL),
(1, '2018-03-05', '1'),
(1, '2018-06-01', NULL),
(2, '2018-01-15', '1'),
(2, '2018-04-05', NULL),
(2, '2018-07-10', '1'),
(3, '2018-04-15', '1'),
(3, '2018-05-05', NULL),
(3, '2018-09-10', '1')
SELECT USER_KEY, USCO_DFR, USCO_AHO FROM (
SELECT USER_KEY
,USCO_DFR
,USCO_AHO
,ROW_NUMBER() OVER (PARTITION BY USER_KEY ORDER BY (SELECT 1)) AS RNO
FROM @BAUSCO ) A
WHERE A.USCO_AHO = A.RNO AND A.USCO_AHO = 1
The result is:
USER_KEY USCO_DFR USCO_AHO
2 2018-01-15 00:00:00.000 1
3 2018-04-15 00:00:00.000 1
In plain SQL, using GROUP BY with a subquery, you can get the expected result:
SELECT Q.USER_KEY, Q.USCO_DFR, B.USCO_AHO
FROM (
SELECT USER_KEY, MAX(USCO_DFR) AS USCO_DFR
FROM @BAUSCO
GROUP BY USER_KEY
) Q
JOIN @BAUSCO B ON B.USER_KEY = Q.USER_KEY AND B.USCO_DFR = Q.USCO_DFR
WHERE B.USCO_AHO = '1'
It may be overkill to break out the analytic functions, but they are just so handy:
SELECT * FROM
(SELECT
*,
MAX(USCO_DFR) OVER (PARTITION BY USER_KEY) AS MAX_DFR
FROM
@BAUSCO
) T
WHERE
T.USCO_AHO = '1'
AND T.USCO_DFR = T.MAX_DFR
Result:
| USER_KEY | USCO_DFR | USCO_AHO | MAX_DFR |
|----------|----------------------|----------|----------------------|
| 2 | 2018-07-10T00:00:00Z | 1 | 2018-07-10T00:00:00Z |
| 3 | 2018-09-10T00:00:00Z | 1 | 2018-09-10T00:00:00Z |
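The result above carries the helper column MAX_DFR, which the expected output in the question does not have. A minimal sketch of the same windowed-MAX idea with an explicit column list follows; it assumes the @BAUSCO table variable and sample rows from the question, repeated here only so the batch runs on its own.

```sql
-- Table variable and rows copied from the question.
DECLARE @BAUSCO TABLE
(
    USER_KEY INT,
    USCO_DFR DATETIME,
    USCO_AHO CHAR(1)
);

INSERT @BAUSCO (USER_KEY, USCO_DFR, USCO_AHO)
VALUES (1, '2018-02-10', NULL),
       (1, '2018-03-05', '1'),
       (1, '2018-06-01', NULL),
       (2, '2018-01-15', NULL),
       (2, '2018-04-05', NULL),
       (2, '2018-07-10', '1'),
       (3, '2018-04-15', NULL),
       (3, '2018-05-05', NULL),
       (3, '2018-09-10', '1');

-- MAX(...) OVER computes the per-user maximum without collapsing rows;
-- the outer query keeps only the three original columns.
SELECT T.USER_KEY, T.USCO_DFR, T.USCO_AHO
FROM (
    SELECT *,
           MAX(USCO_DFR) OVER (PARTITION BY USER_KEY) AS MAX_DFR
    FROM @BAUSCO
) T
WHERE T.USCO_AHO = '1'
  AND T.USCO_DFR = T.MAX_DFR;
```

With the question's data this should return the rows for users 2 and 3 only, since user 1's latest row has USCO_AHO = NULL.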
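Only users 2 and 3 should be selected, because their USCO_AHO = '1' row has the greatest USCO_DFR. Your query returns three results. Please see my expected output, which I have added.
Your edited query returns an empty table. Have you tried it?
I think your question and your concept are great. Just one thing: ORDER BY USCO_DFR DESC -> so it is ordered by USCO_DFR, with the greatest result on top.
I really like analytic functions: they let you state clearly what you want to achieve and leave it to the SQL engine to decide how. In this case I think using row_number and order obscures the intent slightly; you can just return the max value directly. It does raise the question of how to handle rows with exactly the same USCO_DFR value.
@IanMcGowan Your answer has the same problem. If two or more records are tied for the max value, you will get multiple results.
Your query returns no results. Empty table.
@FrenkyB I have included the full code and the result.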
DECLARE @BAUSCO TABLE
(
USER_KEY INT,
USCO_DFR DATETIME,
USCO_AHO CHAR(1)
)
INSERT @BAUSCO(USER_KEY, USCO_DFR, USCO_AHO)
VALUES (1, '2018-02-10', NULL),
(1, '2018-03-05', '1'),
(1, '2018-06-01', NULL),
(2, '2018-01-15', NULL),
(2, '2018-04-05', NULL),
(2, '2018-07-10', '1'),
(3, '2018-04-15', NULL),
(3, '2018-05-05', NULL),
(3, '2018-09-10', '1')
select * from @BAUSCO a
where USCO_DFR=(select MAX(USCO_DFR) from @BAUSCO b where a.USER_KEY=b.USER_KEY )
and USCO_AHO = '1'
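The comments raise the question of rows that tie on the greatest USCO_DFR. With ROW_NUMBER, which of the tied rows gets number 1 is not defined, so a user may or may not be returned. One possible way to make the behaviour explicit is RANK, which gives every tied row rank 1. The sketch below is only an illustration: the tied row for user 2 is hypothetical data that does not appear in the question, and treating a user as a match when any of its latest rows has USCO_AHO = '1' is an assumed rule, not something the asker specified.

```sql
-- Hypothetical data: user 2 has two rows tied on the greatest USCO_DFR.
DECLARE @BAUSCO TABLE
(
    USER_KEY INT,
    USCO_DFR DATETIME,
    USCO_AHO CHAR(1)
);

INSERT @BAUSCO (USER_KEY, USCO_DFR, USCO_AHO)
VALUES (1, '2018-03-05', '1'),
       (1, '2018-06-01', NULL),
       (2, '2018-07-10', '1'),
       (2, '2018-07-10', NULL),   -- ties with the row above
       (3, '2018-09-10', '1');

-- RANK assigns 1 to every row that shares the latest USCO_DFR,
-- so the filter no longer depends on an arbitrary tie order.
WITH cte AS (
    SELECT *,
           RANK() OVER (PARTITION BY USER_KEY ORDER BY USCO_DFR DESC) AS rnk
    FROM @BAUSCO
)
SELECT USER_KEY, USCO_DFR, USCO_AHO
FROM cte
WHERE rnk = 1
  AND USCO_AHO = '1';
```

If exactly one row per user is required instead, keeping ROW_NUMBER and adding a tiebreaker column to its ORDER BY is another option; which rule is right depends on what a tie means in the data.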