SQL: SELECT id, name, other_columns FROM messages WHERE id IN ( SELECT MAX(id) FROM messages GROUP BY name );
Tags: sql, mysql, group-by, greatest-n-per-group
I don't know how its performance compares with some of the other solutions, but it ran very well on my table of more than 3 million rows (4 seconds to execute, 1,200+ results).
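To make the behaviour of the IN (SELECT MAX(id) ... GROUP BY name) query concrete, here is a minimal sketch that runs it against an in-memory SQLite database from Python. SQLite stands in for MySQL here, and the six sample rows are an assumption made for illustration, not data from the post.

```python
import sqlite3

# Assumed sample rows (id, name, other_columns); illustrative only.
ROWS = [
    (1, "A", "A_data_1"),
    (2, "A", "A_data_2"),
    (3, "A", "A_data_3"),
    (4, "B", "B_data_1"),
    (5, "B", "B_data_2"),
    (6, "C", "C_data_1"),
]


def last_row_per_name(rows):
    """Return the row with the highest id for each name, ordered by id."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE messages (id INTEGER PRIMARY KEY, name TEXT, other_columns TEXT)"
    )
    conn.executemany("INSERT INTO messages VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT id, name, other_columns
        FROM messages
        WHERE id IN (SELECT MAX(id) FROM messages GROUP BY name)
        ORDER BY id
        """
    ).fetchall()
    conn.close()
    return result


latest = last_row_per_name(ROWS)
print(latest)
```

The inner query produces one id per name, and the outer query fetches the full rows for those ids, so every selected column comes from the same row.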
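This should work on both MySQL and SQL Server. I haven't tested it on a large database, but I think it may be faster than joining tables:
SELECT *, Max(Id) FROM messages GROUP BY Name
Note: SQL Server, and MySQL with ONLY_FULL_GROUP_BY enabled (the default since 5.7.5), reject this query because the columns behind * are neither aggregated nor listed in GROUP BY. Where MySQL does accept it, the non-aggregated columns are not guaranteed to come from the row that holds Max(Id).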
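Solution by subquery
Solution by join condition
select m1.* from messages m1
left outer join messages m2
on ( m1.id<m2.id and m1.name=m2.name )
where m2.id is null
Here is another way to get the last related record, using GROUP_CONCAT with ORDER BY and SUBSTRING_INDEX to pick one record from the list: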
SELECT
`Id`,
`Name`,
SUBSTRING_INDEX(
GROUP_CONCAT(
`Other_Columns`
ORDER BY `Id` DESC
SEPARATOR '||'
),
'||',
1
) Other_Columns
FROM
messages
GROUP BY `Name`
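The query above collects all the Other_Columns values that belong to the same Name group and, because of ORDER BY Id DESC, joins them in descending Id order within each group using the supplied separator (I used ||). Applying SUBSTRING_INDEX to that list then picks the first item.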
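The string handling in that query can be sketched in plain Python. This is an emulation of what GROUP_CONCAT(... ORDER BY Id DESC SEPARATOR '||') followed by SUBSTRING_INDEX(..., '||', 1) computes, not a call into MySQL; the sample rows and the function name are assumptions for illustration.

```python
from itertools import groupby
from operator import itemgetter

SEPARATOR = "||"

# Assumed sample rows (Id, Name, Other_Columns); illustrative only.
ROWS = [
    (1, "A", "A_data_1"),
    (2, "A", "A_data_2"),
    (3, "A", "A_data_3"),
    (4, "B", "B_data_1"),
    (5, "B", "B_data_2"),
    (6, "C", "C_data_1"),
]


def last_value_per_name(rows):
    """Emulate SUBSTRING_INDEX(GROUP_CONCAT(col ORDER BY Id DESC SEPARATOR '||'), '||', 1)."""
    result = {}
    by_name = sorted(rows, key=itemgetter(1))
    for name, group in groupby(by_name, key=itemgetter(1)):
        # GROUP_CONCAT ... ORDER BY Id DESC: newest value first.
        newest_first = sorted(group, key=itemgetter(0), reverse=True)
        concatenated = SEPARATOR.join(row[2] for row in newest_first)
        # SUBSTRING_INDEX(..., '||', 1): everything before the first separator.
        result[name] = concatenated.split(SEPARATOR, 1)[0]
    return result


result = last_value_per_name(ROWS)
print(result)
```

Two limits follow from the technique itself. A value that contains the separator is cut at its first occurrence, so the separator has to be a string that cannot appear in the data. In MySQL the concatenated string is also truncated at group_concat_max_len (1024 by default), which matters for large groups or wide columns.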
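Hi @Vijay Dev, if your table messages contains Id, an auto-increment primary key, then to fetch the latest record based on the primary key your query should read as follows: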
SELECT m1.* FROM messages m1 INNER JOIN (SELECT max(Id) as lastmsgId FROM messages GROUP BY Name) m2 ON m1.Id=m2.lastmsgId
First solution
SELECT d1.ID,Name,City FROM Demo_User d1
INNER JOIN
(SELECT MAX(ID) AS ID FROM Demo_User GROUP By NAME) AS P ON (d1.ID=P.ID);
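Second solution (this depends on MySQL picking the first row of the ordered derived table for each group, which the server does not guarantee)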
SELECT * FROM (SELECT * FROM Demo_User ORDER BY ID DESC) AS T GROUP BY NAME ;
If you want the last row for each Name, you can assign a row number to each group of rows by Name, ordered by Id descending.
Query
SELECT t1.Id,
t1.Name,
t1.Other_Columns
FROM
(
SELECT Id,
Name,
Other_Columns,
(
CASE Name WHEN @curA
THEN @curRow := @curRow + 1
ELSE @curRow := 1 AND @curA := Name END
) + 1 AS rn
FROM messages t,
(SELECT @curRow := 0, @curA := '') r
ORDER BY Name,Id DESC
)t1
WHERE t1.rn = 1
ORDER BY t1.Id;
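The same per-group numbering can be written with the ROW_NUMBER() window function where it is available (MySQL 8.0 and later). The sketch below runs it on an in-memory SQLite database from Python; it assumes the bundled SQLite is 3.25 or newer, and the sample rows are made up for illustration.

```python
import sqlite3

# Assumed sample rows (Id, Name, Other_Columns); illustrative only.
ROWS = [
    (1, "A", "A_data_1"),
    (2, "A", "A_data_2"),
    (3, "A", "A_data_3"),
    (4, "B", "B_data_1"),
    (5, "B", "B_data_2"),
    (6, "C", "C_data_1"),
]

NUMBERED_SQL = """
    SELECT Id, Name, Other_Columns,
           ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Id DESC) AS rn
    FROM messages
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (Id INTEGER PRIMARY KEY, Name TEXT, Other_Columns TEXT)")
conn.executemany("INSERT INTO messages VALUES (?, ?, ?)", ROWS)

# Step 1: number the rows inside each Name group, newest Id first.
numbered = conn.execute(
    "SELECT Id, Name, rn FROM (" + NUMBERED_SQL + ") AS n ORDER BY Name, rn"
).fetchall()

# Step 2: keep only the row numbered 1 in each group.
latest = conn.execute(
    "SELECT Id, Name, Other_Columns FROM (" + NUMBERED_SQL + ") AS n WHERE rn = 1 ORDER BY Id"
).fetchall()
conn.close()

print(numbered)
print(latest)
```

Unlike the user-variable version, the window function does not depend on the order in which the server evaluates expressions within a row.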
How about this:
SELECT DISTINCT ON (name) *
FROM messages
ORDER BY name, id DESC;
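I had a similar problem (on PostgreSQL), on a table with 1M records. This solution takes 1.7 s, versus 44 s for the LEFT JOIN one.
In my case I also had to filter the counterpart of your name field for NULL values, which improved performance by a further 0.2 s.
Here is my solution: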
DROP TABLE IF EXISTS UniqueIDs;
CREATE Temporary table UniqueIDs (id Int(11));
INSERT INTO UniqueIDs
(SELECT T1.ID FROM Table T1 LEFT JOIN Table T2 ON
(T1.Field1 = T2.Field1 AND T1.Field2 = T2.Field2 #Comparison Fields
AND T1.ID < T2.ID)
WHERE T2.ID IS NULL);
DELETE FROM Table WHERE id NOT IN (SELECT ID FROM UniqueIDs);
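The steps above (collect the id of the last row for each combination of comparison fields, then delete everything else) can be tried on a small table. This is a sketch on in-memory SQLite from Python; the table name records, the column names, and the sample rows are placeholders chosen for illustration.

```python
import sqlite3

# Placeholder rows (id, field1, field2); illustrative only.
ROWS = [
    (1, "a", "x"),
    (2, "a", "x"),
    (3, "a", "y"),
    (4, "b", "x"),
    (5, "a", "x"),
]


def delete_all_but_last(rows):
    """Keep only the highest id for each (field1, field2) pair."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE records (id INTEGER PRIMARY KEY, field1 TEXT, field2 TEXT);
        CREATE TEMP TABLE unique_ids (id INTEGER PRIMARY KEY);
        """
    )
    conn.executemany("INSERT INTO records VALUES (?, ?, ?)", rows)
    # A row is the last of its group when no row with the same fields has a larger id.
    conn.execute(
        """
        INSERT INTO unique_ids
        SELECT t1.id
        FROM records t1
        LEFT JOIN records t2
          ON t1.field1 = t2.field1
         AND t1.field2 = t2.field2
         AND t1.id < t2.id
        WHERE t2.id IS NULL
        """
    )
    conn.execute("DELETE FROM records WHERE id NOT IN (SELECT id FROM unique_ids)")
    remaining = conn.execute(
        "SELECT id, field1, field2 FROM records ORDER BY id"
    ).fetchall()
    conn.close()
    return remaining


remaining = delete_all_but_last(ROWS)
print(remaining)
```

Rows whose comparison fields are NULL never satisfy the equality in the join, so each of them counts as the last row of its own group and is kept.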
SELECT
DISTINCT NAME,
MAX(MESSAGES) OVER(PARTITION BY NAME) MESSAGES
FROM MESSAGE;
A fairly fast approach is the following:
SELECT *
FROM messages a
WHERE Id = (SELECT MAX(Id) FROM messages WHERE a.Name = Name)
Result
Id  Name  Other_Columns
3   A     A_data_3
5   B     B_data_2
6   C     C_data_1
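Clearly there are many different ways of getting the same result; your question seems to be what an efficient way of getting the last result in each group in MySQL is. If you are working with huge amounts of data, and assuming you are using InnoDB with even the latest versions of MySQL (such as 5.7.21 and 8.0.4-rc), then there might not be an efficient way of doing this.
We sometimes need to do this with tables that have more than 60 million rows.
For these examples I will use data with only about 1.5 million rows, where the queries need to find results for all groups in the data. In our actual cases we would usually need to return data for about 2,000 groups (which hypothetically would not require examining very much of the data).
I will use the following tables: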
CREATE TABLE temperature(
id INT UNSIGNED NOT NULL AUTO_INCREMENT,
groupID INT UNSIGNED NOT NULL,
recordedTimestamp TIMESTAMP NOT NULL,
recordedValue INT NOT NULL,
INDEX groupIndex(groupID, recordedTimestamp),
PRIMARY KEY (id)
);
CREATE TEMPORARY TABLE selected_group(id INT UNSIGNED NOT NULL, PRIMARY KEY(id));
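The temperature table is populated with about 1.5 million random records and 100 different groups.
The selected_group table is populated with those 100 groups (in our real cases this would normally be less than 20% of all groups).
Because this data is random, multiple rows can have the same recordedTimestamp. What we want is a list of all the selected groups in order of groupID, with the last recordedTimestamp for each group, and, if the same group has more than one matching row with that timestamp, the last matching id of those rows.
If MySQL hypothetically had a last() function that returned values from the last row according to a special ORDER BY clause, then we could simply do: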
SELECT
last(t1.id) AS id,
t1.groupID,
last(t1.recordedTimestamp) AS recordedTimestamp,
last(t1.recordedValue) AS recordedValue
FROM selected_group g
INNER JOIN temperature t1 ON t1.groupID = g.id
ORDER BY t1.recordedTimestamp, t1.id
GROUP BY t1.groupID;
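In this case it would only need to examine a few hundred rows, because it does not use any of the normal GROUP BY functions. It would execute in effectively 0 seconds and so be highly efficient.
Note that normally in MySQL the ORDER BY clause follows the GROUP BY clause, but this ORDER BY clause is used to determine the order for the last() function; if it came after the GROUP BY it would be ordering the groups. If no GROUP BY clause were present, the last values would be the same in all of the returned rows.
However, MySQL does not have this function, so let's look at different ideas based on what it does have, and show that none of them is efficient.
Example 1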
SELECT t1.id, t1.groupID, t1.recordedTimestamp, t1.recordedValue
FROM selected_group g
INNER JOIN temperature t1 ON t1.id = (
SELECT t2.id
FROM temperature t2
WHERE t2.groupID = g.id
ORDER BY t2.recordedTimestamp DESC, t2.id DESC
LIMIT 1
);
This examined 3,009,254 rows and took about 0.859 seconds on 5.7.21, and slightly longer on 8.0.4-rc.
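To make the tie-breaking rule concrete (latest recordedTimestamp first, then the highest id among rows that share it), here is a sketch that runs the Example 1 query on a handful of rows in an in-memory SQLite database from Python. The readings and the selected groups are hypothetical, timestamps are stored as ISO text so that string comparison orders them correctly, and the final ORDER BY is added only to make the output order deterministic.

```python
import sqlite3

# Hypothetical readings: (id, groupID, recordedTimestamp, recordedValue).
READINGS = [
    (1, 10, "2018-01-01 00:00:00", 5),
    (2, 10, "2018-01-02 00:00:00", 7),
    (3, 10, "2018-01-02 00:00:00", 9),  # same timestamp as id 2; the higher id wins
    (4, 20, "2018-01-03 00:00:00", 1),
    (5, 20, "2018-01-01 00:00:00", 2),  # higher id but older timestamp; loses
    (6, 30, "2018-01-01 00:00:00", 3),  # group 30 is not selected
]
SELECTED_GROUPS = [10, 20]


def latest_per_selected_group(readings, selected):
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE temperature (
            id INTEGER PRIMARY KEY,
            groupID INTEGER NOT NULL,
            recordedTimestamp TEXT NOT NULL,
            recordedValue INTEGER NOT NULL
        );
        CREATE INDEX groupIndex ON temperature (groupID, recordedTimestamp);
        CREATE TABLE selected_group (id INTEGER PRIMARY KEY);
        """
    )
    conn.executemany("INSERT INTO temperature VALUES (?, ?, ?, ?)", readings)
    conn.executemany("INSERT INTO selected_group VALUES (?)", [(g,) for g in selected])
    rows = conn.execute(
        """
        SELECT t1.id, t1.groupID, t1.recordedTimestamp, t1.recordedValue
        FROM selected_group g
        INNER JOIN temperature t1 ON t1.id = (
            SELECT t2.id
            FROM temperature t2
            WHERE t2.groupID = g.id
            ORDER BY t2.recordedTimestamp DESC, t2.id DESC
            LIMIT 1
        )
        ORDER BY t1.groupID
        """
    ).fetchall()
    conn.close()
    return rows


latest = latest_per_selected_group(READINGS, SELECTED_GROUPS)
print(latest)
```

Group 10 returns id 3 rather than id 2 because both share the latest timestamp, and group 20 returns id 4 even though id 5 is larger, because the timestamp is compared first.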
Example 2
SELECT t1.id, t1.groupID, t1.recordedTimestamp, t1.recordedValue
FROM temperature t1
INNER JOIN (
SELECT max(t2.id) AS id
FROM temperature t2
INNER JOIN (
SELECT t3.groupID, max(t3.recordedTimestamp) AS recordedTimestamp
FROM selected_group g
INNER JOIN temperature t3 ON t3.groupID = g.id
GROUP BY t3.groupID
) t4 ON t4.groupID = t2.groupID AND t4.recordedTimestamp = t2.recordedTimestamp
GROUP BY t2.groupID
) t5 ON t5.id = t1.id;
This examined 1,505,331 rows and took about 1.25 seconds on 5.7.21, and slightly longer on 8.0.4-rc.
Example 3
SELECT t1.id, t1.groupID, t1.recordedTimestamp, t1.recordedValue
FROM temperature t1
WHERE t1.id IN (
SELECT max(t2.id) AS id
FROM temperature t2
INNER JOIN (
SELECT t3.groupID, max(t3.recordedTimestamp) AS recordedTimestamp
FROM selected_group g
INNER JOIN temperature t3 ON t3.groupID = g.id
GROUP BY t3.groupID
) t4 ON t4.groupID = t2.groupID AND t4.recordedTimestamp = t2.recordedTimestamp
GROUP BY t2.groupID
)
ORDER BY t1.groupID;
This examined 3,009,685 rows and took about 1.95 seconds on 5.7.21, and slightly longer on 8.0.4-rc.
Example 4
SELECT t1.id, t1.groupID, t1.recordedTimestamp, t1.recordedValue
FROM selected_group g
INNER JOIN temperature t1 ON t1.id = (
SELECT max(t2.id)
FROM temperature t2
WHERE t2.groupID = g.id AND t2.recordedTimestamp = (
SELECT max(t3.recordedTimestamp)
FROM temperature t3
WHERE t3.groupID = g.id
)
);
This examined 6,137,810 rows and took about 2.2 seconds on 5.7.21, and slightly longer on 8.0.4-rc.
Example 5
SELECT t1.id, t1.groupID, t1.recordedTimestamp, t1.recordedValue
FROM (
SELECT
t2.id,
t2.groupID,
t2.recordedTimestamp,
t2.recordedValue,
row_number() OVER (
PARTITION BY t2.groupID ORDER BY t2.recordedTimestamp DESC, t2.id DESC
) AS rowNumber
FROM selected_group g
INNER JOIN temperature t2 ON t2.groupID = g.id
) t1 WHERE t1.rowNumber = 1;
This examined 6,017,808 rows and took about 4.2 seconds on 8.0.4-rc.
Example 6
SELECT t1.id, t1.groupID, t1.recordedTimestamp, t1.recordedValue
FROM (
SELECT
last_value(t2.id) OVER w AS id,
t2.groupID,
last_value(t2.recordedTimestamp) OVER w AS recordedTimestamp,
last_value(t2.recordedValue) OVER w AS recordedValue
FROM selected_group g
INNER JOIN temperature t2 ON t2.groupID = g.id
WINDOW w AS (
PARTITION BY t2.groupID
ORDER BY t2.recordedTimestamp, t2.id
RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
)
) t1
GROUP BY t1.groupID;
This examined 6,017,908 rows and took about 17.5 seconds on 8.0.4-rc.
Example 7
SELECT t1.id, t1.groupID, t1.recordedTimestamp, t1.recordedValue
FROM selected_group g
INNER JOIN temperature t1 ON t1.groupID = g.id
LEFT JOIN temperature t2
ON t2.groupID = g.id
AND (
t2.recordedTimestamp > t1.recordedTimestamp
OR (t2.recordedTimestamp = t1.recordedTimestamp AND t2.id > t1.id)
)
WHERE t2.id IS NULL
ORDER BY t1.groupID;
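This one took forever, so I had to kill it.
If performance is really your concern, you can introduce a new column on the table called IsLastInGroup, of type BIT.
Set it to true on the rows that are last in their group, and maintain it on every row insert/update/delete. Writes will be slower, but you will benefit on reads. It depends on your use case, and I recommend it only if your workload is read-focused.
So your query will look like: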
SELECT * FROM Messages WHERE IsLastInGroup = 1
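One way the flag could be maintained on insert is sketched below, using an in-memory SQLite database from Python. The table layout and the helper names (open_db, insert_message, last_messages) are hypothetical, and only the insert path is shown; updates and deletes that affect the last row of a group would need the same kind of handling.

```python
import sqlite3


def open_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE Messages (
            Id INTEGER PRIMARY KEY AUTOINCREMENT,
            Name TEXT NOT NULL,
            Other_Columns TEXT,
            IsLastInGroup INTEGER NOT NULL DEFAULT 0
        )
        """
    )
    return conn


def insert_message(conn, name, other):
    """Insert a row and move the IsLastInGroup flag to it in one transaction."""
    with conn:  # commits on success, rolls back if either statement fails
        conn.execute(
            "UPDATE Messages SET IsLastInGroup = 0 WHERE Name = ? AND IsLastInGroup = 1",
            (name,),
        )
        conn.execute(
            "INSERT INTO Messages (Name, Other_Columns, IsLastInGroup) VALUES (?, ?, 1)",
            (name, other),
        )


def last_messages(conn):
    return conn.execute(
        "SELECT Id, Name, Other_Columns FROM Messages WHERE IsLastInGroup = 1 ORDER BY Id"
    ).fetchall()


conn = open_db()
for name, other in [
    ("A", "A_data_1"), ("A", "A_data_2"), ("A", "A_data_3"),
    ("B", "B_data_1"), ("B", "B_data_2"), ("C", "C_data_1"),
]:
    insert_message(conn, name, other)

latest = last_messages(conn)
print(latest)
```

Clearing the old flag and setting the new one inside a single transaction keeps at most one flagged row per group visible to readers.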
You can group with a count and also get the last item of each group, like this:
SELECT
user,
COUNT(user) AS count,
MAX(id) as last
FROM request
GROUP BY user
We will look at how to use MySQL to get the last record in a group of records.
id  category_id  post_title
1                Title 1
2   1            Title 2
3   1            Title 3
4   2            Title 4
5   2
select m1.* from messages m1
left outer join messages m2
on ( m1.id<m2.id and m1.name=m2.name )
where m2.id is null
SELECT
column1,
column2
FROM
table_name
WHERE id IN
(SELECT
MAX(id)
FROM
table_name
GROUP BY column1)
ORDER BY column1 ;
select * from messages group by name desc
SELECT * FROM table_name WHERE primary_key IN (SELECT MAX(primary_key) FROM table_name GROUP BY column_name )
SELECT
*
FROM
message
WHERE
`Id` IN (
SELECT
MAX(`Id`)
FROM
message
GROUP BY
`Name`
)
ORDER BY
`Id` DESC
select p.* from properties p
join (
select program_id, max(m2_price) as max_price -- program_id must be selected so the join can use it
from properties
group by program_id
) p2 on (p.program_id = p2.program_id and p.m2_price = p2.max_price)
WITH Temp_table AS
(
Select id, name, othercolumns, ROW_NUMBER() over (PARTITION BY name ORDER BY ID
desc) as rn from messages -- alias is rn because RANK is a reserved word in MySQL 8.0
)
Select id, name, othercolumns from Temp_table where rn=1
select * from `data` where `id` in (select max(`id`) from `data` group by `name_id`)
select *, max(id) from messages group by name
-- LIMIT inside GROUP_CONCAT is MariaDB syntax (10.3.3 and later); MySQL does not accept it
SELECT GROUP_CONCAT(id ORDER BY id DESC LIMIT 1) AS id,
name,
GROUP_CONCAT(Other_columns ORDER BY id DESC LIMIT 1) AS Other_columns
FROM t
GROUP BY name;