MySQL: find duplicate records, but exclude the first one from the list
My table looks like this:
MariaDB [groupdb]> select * from album;
+----+---------+---------+
| id | artist | user_id |
+----+---------+---------+
| 1 | ArtistX | 45677 |
| 2 | ArtistY | 378798 |
| 3 | ArtistX | 45677 |
| 4 | ArtistZ | 123456 |
| 5 | ArtistY | 888888 |
| 6 | ArtistX | 2312 |
| 7 | ArtistY | 378798 |
| 8 | ArtistY | 888888 |
| 9 | ArtistY | 888888 |
+----+---------+---------+
9 rows in set (0.000 sec)
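To reproduce the examples locally, the sample data can be loaded with something like the following. This is only a sketch: the column types are assumptions, since the post shows the data but not the table definition.

```sql
-- Column types are assumed; the question only shows the rows.
CREATE TABLE album (
    id      INT PRIMARY KEY,
    artist  VARCHAR(50) NOT NULL,
    user_id INT NOT NULL
);

-- The nine rows from the question.
INSERT INTO album (id, artist, user_id) VALUES
    (1, 'ArtistX', 45677),
    (2, 'ArtistY', 378798),
    (3, 'ArtistX', 45677),
    (4, 'ArtistZ', 123456),
    (5, 'ArtistY', 888888),
    (6, 'ArtistX', 2312),
    (7, 'ArtistY', 378798),
    (8, 'ArtistY', 888888),
    (9, 'ArtistY', 888888);
```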
I tried to find the duplicate records with the following query:
MariaDB [groupdb]> select * from album where artist IN (select artist from album group by artist having count(artist)>1) and user_id IN (select user_id from album group by user_id having count(user_id)>1);
+----+---------+---------+
| id | artist | user_id |
+----+---------+---------+
| 1 | ArtistX | 45677 |
| 2 | ArtistY | 378798 |
| 3 | ArtistX | 45677 |
| 5 | ArtistY | 888888 |
| 7 | ArtistY | 378798 |
| 8 | ArtistY | 888888 |
| 9 | ArtistY | 888888 |
+----+---------+---------+
7 rows in set (0.001 sec)
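This all works fine. However, I want my result set to be the list of duplicates excluding the first occurrence of each, i.e. something like the one below.
Expected output
As you can see above, this is the list of duplicates, excluding the first one.
Note: for a record to count as a duplicate, both artist and user_id must be the same.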
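To make the note concrete: the query in the question checks artist and user_id with two independent IN subqueries, so a row whose artist is repeated in one place and whose user_id is repeated in another would also match, even if that exact pair occurs only once. A sketch that groups on the pair instead, assuming the album table from the question:

```sql
-- Count how often each (artist, user_id) pair occurs
-- and keep only the pairs that occur more than once.
SELECT artist, user_id, COUNT(*) AS copies
FROM album
GROUP BY artist, user_id
HAVING COUNT(*) > 1;
```

With the sample data this should list three pairs: ArtistX/45677 (2 copies), ArtistY/378798 (2 copies) and ArtistY/888888 (3 copies).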
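My challenge is to come up with a query that produces the result set described above.
This is easy to handle in recent versions of MariaDB, which support ROW_NUMBER(). Here is what the intermediate CTE used in that query looks like: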
+----+---------+---------+----+
| id | artist | user_id | rn |
+----+---------+---------+----+
| 1 | ArtistX | 45677 | 1 |
| 2 | ArtistY | 378798 | 1 |
| 3 | ArtistX | 45677 | 2 |
| 4 | ArtistZ | 123456 | 1 |
| 5 | ArtistY | 888888 | 1 |
| 6 | ArtistX | 2312 | 1 |
| 7 | ArtistY | 378798 | 2 |
| 8 | ArtistY | 888888 | 2 |
| 9 | ArtistY | 888888 | 3 |
+----+---------+---------+----+
Note that artist/user_id pairs with no duplicates only ever get row number 1, so they never appear in the output.
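The query that produced this intermediate table is not included in this copy of the answer. A sketch that is consistent with the rn column shown, assuming MariaDB 10.2 or later (which has both CTEs and window functions) and the album table from the question, could look like this:

```sql
-- Number the rows inside each (artist, user_id) group by ascending id,
-- then keep everything except the first row of each group.
WITH cte AS (
    SELECT id, artist, user_id,
           ROW_NUMBER() OVER (PARTITION BY artist, user_id ORDER BY id) AS rn
    FROM album
)
SELECT id, artist, user_id
FROM cte
WHERE rn > 1
ORDER BY id;
```

Going by the rn values in the table, this should return the rows with id 3, 7, 8 and 9.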
You can use row_number() to number the rows within each artist/user_id pair and keep every row after the first:
select a.*
from (select a.*,
             row_number() over (partition by artist, user_id order by id) as seqnum
      from album a
     ) a
where seqnum > 1;
In older versions, you can use:
select a.*
from album a
where a.id > (select min(a2.id)
              from album a2
              where a2.artist = a.artist and a2.user_id = a.user_id
             );
MariaDB 10.3 and later support EXCEPT, so you can simply do:
select id, artist, user_id
from t
except
select min(id), artist, user_id
from t
group by artist, user_id;
If that is not an option, you can ...
You want to select all rows for which a sibling with a smaller ID exists. I think the simplest way to express this is:
select *
from album a
where exists
(
  select *
  from album a2
  where a2.artist = a.artist
    and a2.user_id = a.user_id
    and a2.id < a.id
)
order by id;
Use GROUP BY with HAVING and MAX.
Thanks for the answer. It works well. Is it possible to achieve the same thing without relying on id in the query? I tried changing the ORDER BY to something else, such as ORDER BY artist, but then the ids are off.
In the sample data, the id column is what determines which duplicate is first and which is last. You can use any other column that expresses the same ordering in place of id.
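One way to read the last comment: ORDER BY artist cannot work inside the window, because artist is constant within each artist/user_id partition and so gives the database nothing to order by. Which row counts as "first" is then arbitrary. A sketch with a different ordering column follows. Note that created_at is a hypothetical column that does not exist in the table from the question; it stands in for whatever column records insertion order.

```sql
-- Hypothetical: assumes album has a created_at DATETIME column
-- (not part of the original table).
-- id is kept only as a tie-breaker so that rows with the same
-- created_at are still ordered deterministically.
SELECT id, artist, user_id
FROM (
    SELECT a.*,
           ROW_NUMBER() OVER (PARTITION BY artist, user_id
                              ORDER BY created_at, id) AS rn
    FROM album a
) ranked
WHERE rn > 1;
```

If the ordering column is guaranteed unique within each pair, the id tie-breaker can be dropped.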