SQL: select the latest 3 records for each ID in a table
Tags: sql, sqlite, group-by, sql-order-by, greatest-n-per-group

I have a table with a composite primary key (ID, Date), as shown below:
+------+------------+-------+
| ID | Date | Value |
+------+------------+-------+
| 1 | 1433419200 | 15 |
| 1 | 1433332800 | 23 |
| 1 | 1433246400 | 41 |
| 1 | 1433160000 | 55 |
| 1 | 1432900800 | 24 |
| 2 | 1433419200 | 52 |
| 2 | 1433332800 | 23 |
| 2 | 1433246400 | 39 |
| 2 | 1433160000 | 22 |
| 3 | 1433419200 | 11 |
| 3 | 1433246400 | 58 |
| ... | ... | ... |
+------+------------+-------+
What is a fast way to do this?

First, here is the correct query for the inequality method:
SELECT p1.ID, p1.Date, p1.Value
FROM MyTable p1 LEFT JOIN
     MyTable AS p2
     ON p1.ID = p2.ID AND p2.Date >= p1.Date
-- the join counts the rows of the same ID that are at least as new as p1
GROUP BY p1.ID, p1.Date, p1.Value
HAVING COUNT(*) <= 3  -- keep a row only if it is among the 3 newest for its ID
ORDER BY p1.ID, p1.Date DESC;
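To try the self-join counting idea on the sample rows, here is a minimal sketch using Python's standard `sqlite3` module. The column types and the in-memory database are assumptions made for illustration; the question only shows the data. A row is kept when at most 3 rows of the same ID are at least as new as it.

```python
import sqlite3

# Sample rows copied from the table in the question.
ROWS = [
    (1, 1433419200, 15), (1, 1433332800, 23), (1, 1433246400, 41),
    (1, 1433160000, 55), (1, 1432900800, 24),
    (2, 1433419200, 52), (2, 1433332800, 23), (2, 1433246400, 39),
    (2, 1433160000, 22),
    (3, 1433419200, 11), (3, 1433246400, 58),
]

conn = sqlite3.connect(":memory:")
# Column types are an assumption; only the composite key is given.
conn.execute(
    "CREATE TABLE MyTable (ID INTEGER, Date INTEGER, Value INTEGER, "
    "PRIMARY KEY (ID, Date))"
)
conn.executemany("INSERT INTO MyTable VALUES (?, ?, ?)", ROWS)

# For each row p1, count the rows of the same ID dated at or after p1.
# A count of 3 or less means p1 is one of the 3 newest rows for its ID.
join_rows = conn.execute("""
    SELECT p1.ID, p1.Date, p1.Value
    FROM MyTable AS p1
    LEFT JOIN MyTable AS p2
           ON p1.ID = p2.ID AND p2.Date >= p1.Date
    GROUP BY p1.ID, p1.Date, p1.Value
    HAVING COUNT(*) <= 3
    ORDER BY p1.ID, p1.Date DESC
""").fetchall()

for row in join_rows:
    print(row)
```

Because every row is compared with every other row of the same ID, the work grows roughly quadratically with the number of rows per ID, which is consistent with the slow timings reported in this thread.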
You can look up the three most recent dates for each ID:
SELECT ID, Date, Value
FROM MyTable
WHERE Date IN (SELECT Date
               FROM MyTable AS T2
               WHERE T2.ID = MyTable.ID
               ORDER BY Date DESC
               LIMIT 3)
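The correlated `IN` subquery can be checked against the sample rows with a short `sqlite3` sketch. The column types are assumed, and the outer `ORDER BY` is added here only to make the output order deterministic; it is not part of the answer's query.

```python
import sqlite3

# Sample rows copied from the table in the question.
ROWS = [
    (1, 1433419200, 15), (1, 1433332800, 23), (1, 1433246400, 41),
    (1, 1433160000, 55), (1, 1432900800, 24),
    (2, 1433419200, 52), (2, 1433332800, 23), (2, 1433246400, 39),
    (2, 1433160000, 22),
    (3, 1433419200, 11), (3, 1433246400, 58),
]

conn = sqlite3.connect(":memory:")
# Column types are an assumption; only the composite key is given.
conn.execute(
    "CREATE TABLE MyTable (ID INTEGER, Date INTEGER, Value INTEGER, "
    "PRIMARY KEY (ID, Date))"
)
conn.executemany("INSERT INTO MyTable VALUES (?, ?, ?)", ROWS)

# For each outer row, the subquery lists the 3 newest dates of that row's ID;
# the row is kept if its own date is in that list.
in_rows = conn.execute("""
    SELECT ID, Date, Value
    FROM MyTable
    WHERE Date IN (SELECT Date
                   FROM MyTable AS T2
                   WHERE T2.ID = MyTable.ID
                   ORDER BY Date DESC
                   LIMIT 3)
    ORDER BY ID, Date DESC  -- added for a stable output order
""").fetchall()

for row in in_rows:
    print(row)
```

ID 3 has only two rows in the sample, so both are returned; `LIMIT 3` is an upper bound, not a requirement.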
Alternatively, look up the third most recent date for each ID and use it as a cutoff:
SELECT ID, Date, Value
FROM MyTable
WHERE Date >= IFNULL((SELECT Date
                      FROM MyTable AS T2
                      WHERE T2.ID = MyTable.ID
                      ORDER BY Date DESC
                      LIMIT 1 OFFSET 2),
                     0)
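The role of `IFNULL` is easiest to see by printing the per-ID cutoff first. The sketch below (assumed column types, in-memory SQLite) shows that an ID with fewer than three rows has no third-newest date, so the subquery yields NULL and the fallback `0` keeps all of that ID's rows.

```python
import sqlite3

# Sample rows copied from the table in the question.
ROWS = [
    (1, 1433419200, 15), (1, 1433332800, 23), (1, 1433246400, 41),
    (1, 1433160000, 55), (1, 1432900800, 24),
    (2, 1433419200, 52), (2, 1433332800, 23), (2, 1433246400, 39),
    (2, 1433160000, 22),
    (3, 1433419200, 11), (3, 1433246400, 58),
]

conn = sqlite3.connect(":memory:")
# Column types are an assumption; only the composite key is given.
conn.execute(
    "CREATE TABLE MyTable (ID INTEGER, Date INTEGER, Value INTEGER, "
    "PRIMARY KEY (ID, Date))"
)
conn.executemany("INSERT INTO MyTable VALUES (?, ?, ?)", ROWS)

# Third-newest date per ID; NULL (None in Python) when an ID has under 3 rows.
cutoffs = conn.execute("""
    SELECT DISTINCT ID,
           (SELECT Date
            FROM MyTable AS T2
            WHERE T2.ID = T1.ID
            ORDER BY Date DESC
            LIMIT 1 OFFSET 2)
    FROM MyTable AS T1
    ORDER BY ID
""").fetchall()
print(cutoffs)

# The answer's query: keep rows at or after the cutoff, or all rows if NULL.
offset_rows = conn.execute("""
    SELECT ID, Date, Value
    FROM MyTable
    WHERE Date >= IFNULL((SELECT Date
                          FROM MyTable AS T2
                          WHERE T2.ID = MyTable.ID
                          ORDER BY Date DESC
                          LIMIT 1 OFFSET 2),
                         0)
    ORDER BY ID, Date DESC  -- added for a stable output order
""").fetchall()

for row in offset_rows:
    print(row)
```

The fallback of `0` works here because the dates are positive Unix timestamps; a table with a different date representation would need a different floor value.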
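Both queries should get good performance from the index on the primary key.

Actually, the "optional" fix is necessary. I just tried it, and without the two extra GROUP BY columns the query still returns no results. On the index point: since ID and Date form the composite primary key, there is already an index on them by default. Even so, the query takes almost 18 seconds on a table of about 600K rows.

This is a different approach, and a very good one. Performance improved 10x!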
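One way to check the index claim in SQLite is `EXPLAIN QUERY PLAN`. The sketch below assumes an ordinary (rowid) table declared with `PRIMARY KEY (ID, Date)`, for which SQLite creates an automatic index named `sqlite_autoindex_MyTable_1`. The exact wording of the plan text differs between SQLite versions, so the code only looks for the index name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are an assumption; only the composite key is given.
conn.execute(
    "CREATE TABLE MyTable (ID INTEGER, Date INTEGER, Value INTEGER, "
    "PRIMARY KEY (ID, Date))"
)

QUERY = """
    SELECT ID, Date, Value
    FROM MyTable
    WHERE Date IN (SELECT Date
                   FROM MyTable AS T2
                   WHERE T2.ID = MyTable.ID
                   ORDER BY Date DESC
                   LIMIT 3)
"""

# Each plan row ends with a human-readable description of one step.
plan = conn.execute("EXPLAIN QUERY PLAN " + QUERY).fetchall()
details = [row[-1] for row in plan]
for detail in details:
    print(detail)

# True when some step reads through the primary key's automatic index.
uses_pk_index = any("sqlite_autoindex_MyTable_1" in d for d in details)
print("uses primary key index:", uses_pk_index)
```

If the subquery step shows a search on the primary key index, each lookup touches only the few newest entries of one ID instead of scanning the table.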
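-- SQL Server (T-SQL) syntax: CROSS APPLY and TOP are not available in SQLite.
SELECT x.ID, x.Date, x.Value
FROM (SELECT DISTINCT ID FROM XXXTable) c
CROSS APPLY (
    SELECT TOP 3 A.ID, A.Date, A.Value, A.[Count]
    FROM (
        -- number each ID's rows from oldest (1) to newest
        SELECT t.ID, t.Date, t.Value,
               ROW_NUMBER() OVER (PARTITION BY t.ID ORDER BY t.Date) AS [Count]
        FROM XXXTable t
        WHERE t.ID = c.ID
    ) A
    ORDER BY A.[Count] DESC  -- highest row numbers are the newest rows
) x;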
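The `CROSS APPLY` / `TOP` query is SQL Server syntax, while the question is tagged sqlite. As a sketch of the same row-numbering idea in SQLite, the code below uses the `ROW_NUMBER()` window function, which assumes SQLite 3.25 or later. The column types and the alias `rn` are assumptions made for illustration.

```python
import sqlite3

# Sample rows copied from the table in the question.
ROWS = [
    (1, 1433419200, 15), (1, 1433332800, 23), (1, 1433246400, 41),
    (1, 1433160000, 55), (1, 1432900800, 24),
    (2, 1433419200, 52), (2, 1433332800, 23), (2, 1433246400, 39),
    (2, 1433160000, 22),
    (3, 1433419200, 11), (3, 1433246400, 58),
]

conn = sqlite3.connect(":memory:")
# Column types are an assumption; only the composite key is given.
conn.execute(
    "CREATE TABLE MyTable (ID INTEGER, Date INTEGER, Value INTEGER, "
    "PRIMARY KEY (ID, Date))"
)
conn.executemany("INSERT INTO MyTable VALUES (?, ?, ?)", ROWS)

# Number each ID's rows from newest (1) to oldest, then keep numbers 1 to 3.
# Window functions need SQLite 3.25+; older builds raise OperationalError.
window_rows = conn.execute("""
    SELECT ID, Date, Value
    FROM (SELECT ID, Date, Value,
                 ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Date DESC) AS rn
          FROM MyTable)
    WHERE rn <= 3
    ORDER BY ID, Date DESC
""").fetchall()

for row in window_rows:
    print(row)
```

Whether this is faster than the correlated subqueries on a 600K-row table would need to be measured; no timing is claimed here.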