MySQL: consolidating duplicate data records with UPDATE/DELETE
I have a table like this:
mysql> SELECT * FROM Colors;
╔════╦══════════╦════════╦════════╦════════╦════════╦════════╦════════╗
║ ID ║ USERNAME ║ RED ║ GREEN ║ YELLOW ║ BLUE ║ ORANGE ║ PURPLE ║
╠════╬══════════╬════════╬════════╬════════╬════════╬════════╬════════╣
║ 1 ║ joe ║ 1 ║ (null) ║ 1 ║ (null) ║ (null) ║ (null) ║
║ 2 ║ joe ║ 1 ║ (null) ║ (null) ║ (null) ║ 1 ║ (null) ║
║ 3 ║ bill ║ 1 ║ 1 ║ 1 ║ (null) ║ (null) ║ 1 ║
║ 4 ║ bill ║ (null) ║ 1 ║ (null) ║ 1 ║ (null) ║ (null) ║
║ 5 ║ bill ║ (null) ║ 1 ║ (null) ║ (null) ║ (null) ║ (null) ║
║ 6 ║ bob ║ (null) ║ (null) ║ (null) ║ 1 ║ (null) ║ (null) ║
║ 7 ║ bob ║ (null) ║ (null) ║ (null) ║ (null) ║ (null) ║ 1 ║
║ 8 ║ bob ║ 1 ║ (null) ║ (null) ║ (null) ║ (null) ║ (null) ║
╚════╩══════════╩════════╩════════╩════════╩════════╩════════╩════════╝
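To try the statements in this thread, the table above can be recreated with a script along these lines. This is only a sketch: the question shows the data but not the schema, so the column types (INT, VARCHAR(50), TINYINT) and the primary key on ID are assumptions.

```sql
-- Assumed schema: only the column names and values come from the question.
CREATE TABLE Colors (
    ID       INT PRIMARY KEY,
    Username VARCHAR(50) NOT NULL,
    Red      TINYINT NULL,
    Green    TINYINT NULL,
    Yellow   TINYINT NULL,
    Blue     TINYINT NULL,
    Orange   TINYINT NULL,
    Purple   TINYINT NULL
);

-- The eight rows shown in the question.
INSERT INTO Colors (ID, Username, Red, Green, Yellow, Blue, Orange, Purple) VALUES
    (1, 'joe',  1,    NULL, 1,    NULL, NULL, NULL),
    (2, 'joe',  1,    NULL, NULL, NULL, 1,    NULL),
    (3, 'bill', 1,    1,    1,    NULL, NULL, 1),
    (4, 'bill', NULL, 1,    NULL, 1,    NULL, NULL),
    (5, 'bill', NULL, 1,    NULL, NULL, NULL, NULL),
    (6, 'bob',  NULL, NULL, NULL, 1,    NULL, NULL),
    (7, 'bob',  NULL, NULL, NULL, NULL, NULL, 1),
    (8, 'bob',  1,    NULL, NULL, NULL, NULL, NULL);
```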
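I want to run an UPDATE and a DELETE that find the duplicates, merge their values into one record per user, and remove the rest, so that we end up with this: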
mysql> SELECT * FROM Colors;
╔════╦══════════╦═════╦════════╦════════╦════════╦════════╦════════╗
║ ID ║ USERNAME ║ RED ║ GREEN ║ YELLOW ║ BLUE ║ ORANGE ║ PURPLE ║
╠════╬══════════╬═════╬════════╬════════╬════════╬════════╬════════╣
║ 1 ║ joe ║ 1 ║ (null) ║ 1 ║ (null) ║ 1 ║ (null) ║
║ 3 ║ bill ║ 1 ║ 1 ║ 1 ║ 1 ║ (null) ║ 1 ║
║ 6 ║ bob ║ 1 ║ (null) ║ (null) ║ 1 ║ (null) ║ 1 ║
╚════╩══════════╩═════╩════════╩════════╩════════╩════════╩════════╝
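I know I could easily do this with a script, but to learn and understand MySQL better I would like to know how to do it in pure SQL.
Do you really need to update the underlying table? If not, and you only need the result set shown in your example, you can simply group the table: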
SELECT MIN(ID)     AS ID,
       Username    AS Username,
       MAX(Red)    AS Red,
       MAX(Green)  AS Green,
       MAX(Yellow) AS Yellow,
       MAX(Blue)   AS Blue,
       MAX(Orange) AS Orange,
       MAX(Purple) AS Purple
FROM Colors
GROUP BY Username;
Note that a SELECT like this is only a projection. It does not update the table or delete any data:
SELECT MIN(ID)     ID,
       Username,
       MAX(Red)    max_Red,
       MAX(Green)  max_Green,
       MAX(Yellow) max_Yellow,
       MAX(Blue)   max_Blue,
       MAX(Orange) max_Orange,
       MAX(Purple) max_Purple
FROM Colors
GROUP BY Username;
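The grouping works because aggregate functions such as MAX skip NULL inputs: a color comes out as 1 if any row in the group has it set, and stays NULL only when every row in the group is NULL. A small sketch of that behavior follows. The two-row inline sample and the alias `sample` are made up for illustration and do not touch the Colors table.

```sql
-- Hypothetical two-row sample for one user, built inline.
SELECT Username,
       MAX(Red)  AS Red,   -- 1: the single non-NULL value wins
       MAX(Blue) AS Blue   -- NULL: every input in the group was NULL
FROM (
    SELECT 'joe' AS Username, 1 AS Red, CAST(NULL AS SIGNED) AS Blue
    UNION ALL
    SELECT 'joe', NULL, NULL
) AS sample
GROUP BY Username;
```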
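Update
If you really do want to delete the extra records, you need to run an UPDATE statement first, so that the row you keep for each user holds the merged values before the other rows are removed: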
UPDATE Colors a
INNER JOIN
(
    SELECT MIN(ID)     min_ID,
           Username,
           MAX(Red)    max_Red,
           MAX(Green)  max_Green,
           MAX(Yellow) max_Yellow,
           MAX(Blue)   max_Blue,
           MAX(Orange) max_Orange,
           MAX(Purple) max_Purple
    FROM Colors
    GROUP BY Username
) b ON a.ID = b.min_ID
SET a.Red    = b.max_Red,
    a.Green  = b.max_Green,
    a.Yellow = b.max_Yellow,
    a.Blue   = b.max_Blue,
    a.Orange = b.max_Orange,
    a.Purple = b.max_Purple;
After that, you can delete every row that is not the lowest-ID row for its user:
DELETE a
FROM Colors a
LEFT JOIN
(
    SELECT MIN(ID) min_ID,
           Username
    FROM Colors
    GROUP BY Username
) b ON a.ID = b.min_ID
WHERE b.min_ID IS NULL;
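Once the UPDATE and the DELETE have both run, a quick check like the one below should confirm that each user is left with a single row. The optional unique index is a suggestion rather than part of the original answer, and its name `uq_colors_username` is an assumption. MySQL will refuse to create it if any duplicate usernames remain.

```sql
-- Should return zero rows once every user has been consolidated.
SELECT Username, COUNT(*) AS row_count
FROM Colors
GROUP BY Username
HAVING COUNT(*) > 1;

-- Optional: prevent new duplicates from being inserted later.
-- The index name is hypothetical; the statement fails if duplicates still exist.
ALTER TABLE Colors ADD UNIQUE INDEX uq_colors_username (Username);
```

If the table uses a transactional engine such as InnoDB, running the UPDATE and DELETE inside one START TRANSACTION ... COMMIT block lets you roll back if the check shows something unexpected.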
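NULLs make this more complicated, because in SQL NULL = NULL does not evaluate to true but to unknown. If the columns held 0 and 1 instead, you could leave out the part before the OR in each color condition.
If you want to remove records, I think you mean DELETE, not UPDATE.
FrankPI, you are right. I have updated the question to include DELETE.
With my suggestion below you do not need the intermediate UPDATE, just a single DELETE: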
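-- Removes each row whose color columns all match an earlier row (lower ID)
-- of the same user; rows that differ in any color are left in place.
-- Written as a multi-table DELETE with a self-join, because MySQL rejects a
-- subquery that reads from the table being deleted from (error 1093).
DELETE c1
FROM Colors c1
INNER JOIN Colors c2
        ON c1.Username = c2.Username
       AND ((c1.Red    IS NULL AND c2.Red    IS NULL) OR c1.Red    = c2.Red)
       AND ((c1.Green  IS NULL AND c2.Green  IS NULL) OR c1.Green  = c2.Green)
       AND ((c1.Yellow IS NULL AND c2.Yellow IS NULL) OR c1.Yellow = c2.Yellow)
       AND ((c1.Blue   IS NULL AND c2.Blue   IS NULL) OR c1.Blue   = c2.Blue)
       AND ((c1.Orange IS NULL AND c2.Orange IS NULL) OR c1.Orange = c2.Orange)
       AND ((c1.Purple IS NULL AND c2.Purple IS NULL) OR c1.Purple = c2.Purple)
       AND c2.ID < c1.ID;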
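As a side note, MySQL has a NULL-safe equality operator, `<=>`, which treats two NULLs as equal. It is MySQL-specific rather than standard SQL. A minimal sketch of how it differs from plain `=`:

```sql
SELECT NULL = NULL   AS plain_equals,      -- NULL (unknown)
       NULL <=> NULL AS null_safe_equals,  -- 1
       1 <=> NULL    AS one_vs_null,       -- 0
       1 <=> 1       AS one_vs_one;        -- 1
```

Under that assumption, each color condition of the form `(x IS NULL AND y IS NULL) OR x = y` could be shortened to `x <=> y`, for example `c1.Red <=> c2.Red`.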