MySQL: finding duplicate rows on multiple fields

I'm using this query to find duplicates based on two fields:

SELECT 
    last_name, 
    first_name,
    middle_initial,
    COUNT(last_name) AS Duplicates,
    IF(rec_id = '', 1, 0) AS has_REC_ID 
FROM files
GROUP BY last_name, first_name
HAVING COUNT(last_name) > 1 AND COUNT(first_name) > 1;

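OK, so that returns a set of rows with first, last, and middle names, a column called "Duplicates" containing a lot of 2s, and a column called has_REC_ID containing a mix of 1s and 0s.

Ultimately, what I'm trying to do is find which rows have a matching first and last name. Then, for each of those pairs, I want to find the row that has an empty string ('') as its rec_id, assign it the rec_id value from the row that does have one, and then delete the row that originally had the rec_id.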
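To inspect the affected rows themselves rather than one summary row per name, the grouped result can be joined back to the table. The following is only a sketch that reuses the `files` table and column names from the question; the aliases `f` and `d` are introduced here for illustration.

```sql
-- List every row whose (last_name, first_name) pair occurs more than once.
SELECT f.last_name,
       f.first_name,
       f.middle_initial,
       f.rec_id
FROM files AS f
JOIN (
    -- one row per name pair that appears at least twice
    SELECT last_name, first_name
    FROM files
    GROUP BY last_name, first_name
    HAVING COUNT(*) > 1
) AS d
  ON d.last_name = f.last_name
 AND d.first_name = f.first_name
ORDER BY f.last_name, f.first_name, f.rec_id;
```

Grouping on both columns and testing `COUNT(*)` once is enough; the two `COUNT(...)` conditions in the original `HAVING` clause count the same group. Rows where either name is NULL will not appear, because NULL never compares equal in the join condition.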

So, for starters, I thought I'd create a new column and do something like this:

UPDATE files a 
SET a.has_dup    -- new column
    = if(a.last_name IN (
                         SELECT b.last_name
                         FROM files b
                         GROUP BY b.last_name 
                         HAVING COUNT(b.last_name) > 1
                        )
      , 1, null);

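But MySQL returns: "You can't specify target table 'a' for update in FROM clause"

I bet there's something less ridiculous than what I'm attempting here. Can anyone help me figure out what that is?

Update: I also tried: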

UPDATE files a 
SET a.has_dup = 1
WHERE a.last_name IN (
                         SELECT b.last_name
                         FROM files b
                         GROUP BY b.last_name 
                         HAVING COUNT(b.last_name) > 1
                     );

...and received the same error message.

I don't have a MySQL instance to test with, but I think this should work: (edit -> it failed)

Edit: another attempt:

UPDATE files f, (SELECT b.last_name
                   FROM files b
               GROUP BY b.last_name 
                 HAVING COUNT(b.last_name) > 1
                ) as duplicates
   SET f.has_dup = 1
 WHERE f.last_name = duplicates.last_name
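The attempt above flags duplicates by last name only, while the question asks about first and last name together. A minimal sketch of the same multi-table UPDATE extended to both columns might look like this, assuming the `has_dup` column from the question already exists; the alias `d` is illustrative.

```sql
-- Flag every row whose (last_name, first_name) pair occurs more than once.
-- The grouped subquery in the FROM position is a derived table, so it is
-- evaluated separately from the table being updated.
UPDATE files AS f
JOIN (
    SELECT last_name, first_name
    FROM files
    GROUP BY last_name, first_name
    HAVING COUNT(*) > 1
) AS d
  ON d.last_name = f.last_name
 AND d.first_name = f.first_name
SET f.has_dup = 1;
```

Rows without a duplicate are left untouched, so `has_dup` keeps whatever value it had before (NULL for a freshly added nullable column).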

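From the MySQL documentation:

Currently, you cannot update a table and select from the same table in a subquery.

I can't think of a quick workaround.

Update: Apparently there is a workaround, but whether it is efficient is another question. It comes down to adding a layer of indirection by introducing a temporary (derived) table: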

UPDATE files a 
SET a.has_dup    -- new column
    = if(a.last_name IN (
                     SELECT b.last_name
                     FROM
                          (SELECT * FROM files)      -- new table target
                     b
                     GROUP BY b.last_name 
                     HAVING COUNT(b.last_name) > 1
                    ),
      1, null);
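One caveat, stated as an assumption rather than a tested result: newer MySQL versions (5.7 and later) may merge a trivial derived table such as `(SELECT * FROM files)` back into the enclosing query, in which case the original error can reappear. A variant that should sidestep this moves the aggregation into the derived table, since a derived table containing GROUP BY has to be materialized. The alias `d` is illustrative.

```sql
-- Same intent as the statement above, but the derived table does the
-- grouping itself, so the optimizer cannot merge it into the outer query.
UPDATE files AS a
SET a.has_dup = IF(
    a.last_name IN (
        SELECT d.last_name
        FROM (
            SELECT last_name
            FROM files
            GROUP BY last_name
            HAVING COUNT(*) > 1
        ) AS d
    ),
    1, NULL);
```

This also avoids copying every column of `files` into the intermediate result; only the duplicated last names are materialized.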

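You could:

1) Create a holding table

2) Populate the holding table with the rows that have a matching first and last name and have rec_id != ''

3) Delete from the original table (files) the rows that have a matching first and last name and rec_id != ''

4) Update the rows in the original table that have a matching first and last name and whose rec_id is ''

5) Drop the holding table

Something like: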

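-- 1) holding table for the rows that already carry a rec_id
create table temp
(
first_name varchar(100) not null,
last_name varchar(100) not null,
rec_id int not null  -- adjust to the type of files.rec_id
);

-- 2) save the rows that have a rec_id and share their name with a row that lacks one
insert into temp (first_name, last_name, rec_id)
select f.first_name, f.last_name, f.rec_id
from files f
join (select first_name, last_name from files
      group by first_name, last_name
      having count(*) > 1 and sum(rec_id = '') > 0) d
  on d.first_name = f.first_name and d.last_name = f.last_name
where f.rec_id != '';

-- 3) remove those rows from the original table
delete f from files f
join temp t on t.first_name = f.first_name and t.last_name = f.last_name
where f.rec_id != '';

-- 4) hand the saved rec_id to the remaining row of each pair
update files f
join temp t on t.first_name = f.first_name and t.last_name = f.last_name
set f.rec_id = t.rec_id
where f.rec_id = '';

-- 5) drop the holding table
drop table temp;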

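Nope. Thanks, but now the message is: "You can't specify target table 'files' for update in FROM clause"

@tjb1982 Sorry, check the second one. (Should I add other solutions or edit the wrong ones?)

I didn't downvote this, but judging by Tomalak's answer, this isn't the right way to go about it according to the documentation.

@tjb1982 There seems to be a workaround.

That would explain it. A quick workaround, though, might be to select into another table named "other", right?

@tjb1982: For a start, one that doesn't involve copying all the data into a second, duplicate table. :)

Yes, something along those lines should work very well. Thanks!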
