MySQL: consolidating duplicate data records via UPDATE/DELETE


I have a table like this:

mysql> SELECT * FROM Colors;
╔════╦══════════╦════════╦════════╦════════╦════════╦════════╦════════╗
║ ID ║ USERNAME ║  RED   ║ GREEN  ║ YELLOW ║  BLUE  ║ ORANGE ║ PURPLE ║
╠════╬══════════╬════════╬════════╬════════╬════════╬════════╬════════╣
║  1 ║ joe      ║ 1      ║ (null) ║ 1      ║ (null) ║ (null) ║ (null) ║
║  2 ║ joe      ║ 1      ║ (null) ║ (null) ║ (null) ║ 1      ║ (null) ║
║  3 ║ bill     ║ 1      ║ 1      ║ 1      ║ (null) ║ (null) ║ 1      ║
║  4 ║ bill     ║ (null) ║ 1      ║ (null) ║ 1      ║ (null) ║ (null) ║
║  5 ║ bill     ║ (null) ║ 1      ║ (null) ║ (null) ║ (null) ║ (null) ║
║  6 ║ bob      ║ (null) ║ (null) ║ (null) ║ 1      ║ (null) ║ (null) ║
║  7 ║ bob      ║ (null) ║ (null) ║ (null) ║ (null) ║ (null) ║ 1      ║
║  8 ║ bob      ║ 1      ║ (null) ║ (null) ║ (null) ║ (null) ║ (null) ║
╚════╩══════════╩════════╩════════╩════════╩════════╩════════╩════════╝
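I want to run an UPDATE and a DELETE that find the duplicates, merge them into one record per user, and remove the extras, so that we end up with this: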

mysql> SELECT * FROM Colors;
╔════╦══════════╦═════╦════════╦════════╦════════╦════════╦════════╗
║ ID ║ USERNAME ║ RED ║ GREEN  ║ YELLOW ║  BLUE  ║ ORANGE ║ PURPLE ║
╠════╬══════════╬═════╬════════╬════════╬════════╬════════╬════════╣
║  1 ║ joe      ║   1 ║ (null) ║ 1      ║ (null) ║ 1      ║ (null) ║
║  3 ║ bill     ║   1 ║ 1      ║ 1      ║ 1      ║ (null) ║ 1      ║
║  6 ║ bob      ║   1 ║ (null) ║ (null) ║ 1      ║ (null) ║ 1      ║
╚════╩══════════╩═════╩════════╩════════╩════════╩════════╩════════╝

I know I could do this easily with a script, but to learn and understand MySQL better I would like to do it in pure SQL.

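Do you really need to update the underlying table? If not, and you only need a result set like the one in your example, you can simply group the table: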

SELECT   MIN(ID)     AS ID,
         Username    AS Username,
         MAX(Red)    AS Red,
         MAX(Green)  AS Green,
         MAX(Yellow) AS Yellow,
         MAX(Blue)   AS Blue,
         MAX(Orange) AS Orange,
         MAX(Purple) AS Purple
FROM     Colors
GROUP BY Username
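
The grouping works because MAX() skips NULLs and only returns NULL when a group has no non-NULL value at all. A minimal sketch of that behaviour, using a hypothetical scratch table `max_demo` that is not part of the question:

```sql
-- Hypothetical scratch table, used only for this illustration.
CREATE TEMPORARY TABLE max_demo (grp VARCHAR(10), val INT);

INSERT INTO max_demo VALUES
    ('a', 1), ('a', NULL),      -- group with one flag set
    ('b', NULL), ('b', NULL);   -- group with no flag set

SELECT   grp,
         MAX(val)   AS max_val,         -- NULLs are ignored by MAX()
         COUNT(val) AS non_null_count   -- COUNT(col) also ignores NULLs
FROM     max_demo
GROUP BY grp;
-- grp 'a': max_val = 1,    non_null_count = 1
-- grp 'b': max_val = NULL, non_null_count = 0
```

So a color column ends up as 1 if any of the user's rows has it set, and stays NULL otherwise.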


This is only a projection. It does not update the table or delete any data:

SELECT  MIN(ID) ID,
        Username,
        MAX(Red) max_Red,
        MAX(Green) max_Green,
        MAX(Yellow) max_Yellow,
        MAX(Blue) max_Blue,
        MAX(Orange) max_Orange,
        MAX(Purple) max_Purple
FROM    Colors
GROUP   BY Username

Update

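If you really want to delete the duplicate records, you need to run an UPDATE statement first, so that the merged values are stored in the row you keep: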

UPDATE  Colors a
        INNER JOIN
        (
            SELECT  MIN(ID) min_ID,
                    Username,
                    MAX(Red) max_Red,
                    MAX(Green) max_Green ,
                    MAX(Yellow) max_Yellow,
                    MAX(Blue) max_Blue,
                    MAX(Orange) max_Orange,
                    MAX(Purple) max_Purple
            FROM    Colors
            GROUP   BY Username
        ) b ON a.ID = b.min_ID
SET     a.Red = b.max_Red,
        a.Green = b.max_Green,
        a.Yellow = b.max_Yellow,
        a.Blue = b.max_Blue,
        a.Orange = b.max_Orange,
        a.Purple = b.max_Purple

After that, you can delete the remaining duplicate records:

DELETE  a
FROM    Colors a
        LEFT JOIN
        (
            SELECT  MIN(ID) min_ID,
                    Username
            FROM    Colors
            GROUP   BY Username
        ) b ON a.ID = b.min_ID
WHERE   b.min_ID IS NULL
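
Since the DELETE only makes sense once the UPDATE has succeeded, it may be worth running the two together in a transaction. Here is a rough end-to-end sketch on a scratch copy of the sample data. The table name `ColorsDemo`, the column types and the InnoDB engine are my assumptions, not something stated in the question:

```sql
-- Hypothetical scratch table; name, column types and engine are assumptions.
CREATE TABLE ColorsDemo (
    ID       INT PRIMARY KEY,
    Username VARCHAR(20) NOT NULL,
    Red INT, Green INT, Yellow INT, Blue INT, Orange INT, Purple INT
) ENGINE = InnoDB;

-- The same eight rows as the sample data in the question.
INSERT INTO ColorsDemo VALUES
    (1, 'joe',  1,    NULL, 1,    NULL, NULL, NULL),
    (2, 'joe',  1,    NULL, NULL, NULL, 1,    NULL),
    (3, 'bill', 1,    1,    1,    NULL, NULL, 1),
    (4, 'bill', NULL, 1,    NULL, 1,    NULL, NULL),
    (5, 'bill', NULL, 1,    NULL, NULL, NULL, NULL),
    (6, 'bob',  NULL, NULL, NULL, 1,    NULL, NULL),
    (7, 'bob',  NULL, NULL, NULL, NULL, NULL, 1),
    (8, 'bob',  1,    NULL, NULL, NULL, NULL, NULL);

START TRANSACTION;

-- Step 1: merge each user's flags into the row with the lowest ID.
UPDATE  ColorsDemo a
        INNER JOIN
        (
            SELECT  MIN(ID)     AS min_ID,
                    MAX(Red)    AS max_Red,
                    MAX(Green)  AS max_Green,
                    MAX(Yellow) AS max_Yellow,
                    MAX(Blue)   AS max_Blue,
                    MAX(Orange) AS max_Orange,
                    MAX(Purple) AS max_Purple
            FROM    ColorsDemo
            GROUP   BY Username
        ) b ON a.ID = b.min_ID
SET     a.Red    = b.max_Red,
        a.Green  = b.max_Green,
        a.Yellow = b.max_Yellow,
        a.Blue   = b.max_Blue,
        a.Orange = b.max_Orange,
        a.Purple = b.max_Purple;

-- Step 2: delete every row that is not the lowest ID for its user.
DELETE  a
FROM    ColorsDemo a
        LEFT JOIN
        (
            SELECT  MIN(ID) AS min_ID
            FROM    ColorsDemo
            GROUP   BY Username
        ) b ON a.ID = b.min_ID
WHERE   b.min_ID IS NULL;

-- Inspect the result before making it permanent: rows 1, 3 and 6 should remain.
SELECT * FROM ColorsDemo ORDER BY ID;

COMMIT;   -- or ROLLBACK if the SELECT above does not look right
```

The three remaining rows should match the desired result shown in the question.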

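NULL makes this more complicated, because in SQL `NULL = NULL` is not true but unknown. If the columns held 0 and 1 instead, you could drop the part before the OR (and the OR itself) in each color condition.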
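
To see the NULL behaviour for yourself, a small sketch that needs no table at all. It also shows `<=>`, MySQL's NULL-safe equality operator, which in a WHERE or ON clause should filter the same way as the longer "both IS NULL, OR equal" test:

```sql
SELECT NULL = NULL   AS plain_equals,      -- NULL (unknown), not 1
       NULL <=> NULL AS null_safe_equals,  -- 1
       1 <=> NULL    AS one_vs_null,       -- 0
       1 <=> 1       AS one_vs_one,        -- 1
       NULL IS NULL  AS is_null_test;      -- 1
```

Note that `<=>` is MySQL-specific; standard SQL spells the same idea as IS NOT DISTINCT FROM.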

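If you want to remove records, I think you mean DELETE, not UPDATE.

FrankPI, you are right. I have updated the question to include DELETE.

With my suggestion below you don't need the intermediate UPDATE, just a single DELETE:
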
-- MySQL does not let a DELETE select from its own target table in a subquery
-- (error 1093), so the EXISTS test is written as a self-join here.
DELETE c1
FROM   Colors c1
       INNER JOIN Colors c2
               ON c1.Username = c2.Username
              AND ((c1.Red    IS NULL AND c2.Red    IS NULL) OR c1.Red    = c2.Red   )
              AND ((c1.Green  IS NULL AND c2.Green  IS NULL) OR c1.Green  = c2.Green )
              AND ((c1.Yellow IS NULL AND c2.Yellow IS NULL) OR c1.Yellow = c2.Yellow)
              AND ((c1.Blue   IS NULL AND c2.Blue   IS NULL) OR c1.Blue   = c2.Blue  )
              AND ((c1.Orange IS NULL AND c2.Orange IS NULL) OR c1.Orange = c2.Orange)
              AND ((c1.Purple IS NULL AND c2.Purple IS NULL) OR c1.Purple = c2.Purple)
              -- keep the row with the lowest ID in each group of identical rows
              AND c2.ID < c1.ID
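
Before running a DELETE like this, it can help to preview which rows it would hit. A sketch of the same condition as a SELECT, written with MySQL's NULL-safe `<=>` operator to keep it short:

```sql
-- Lists every row that has an earlier row (lower ID) for the same user
-- with identical values, NULLs included, in all six color columns.
SELECT c1.*
FROM   Colors c1
WHERE  EXISTS (SELECT 1
               FROM   Colors c2
               WHERE  c1.Username = c2.Username
                 AND  c1.Red    <=> c2.Red
                 AND  c1.Green  <=> c2.Green
                 AND  c1.Yellow <=> c2.Yellow
                 AND  c1.Blue   <=> c2.Blue
                 AND  c1.Orange <=> c2.Orange
                 AND  c1.Purple <=> c2.Purple
                 AND  c2.ID < c1.ID);
```

As far as I can tell, on the sample data from the question this returns no rows, because no two rows for the same user agree in every color column. In other words, this condition targets exact duplicates; rows that differ still need the MAX()-based merge to be combined.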