SQL: delete duplicate rows and update references
Tags: sql, sql-server-2005, tsql
How do I delete duplicate rows in one table and update the references in another table so that they point to the remaining row? The duplication occurs only in the Name column. The Id columns are identity columns.
Example:
Assume we have two tables, Doubles and Data.
Now I have two entries in the Doubles table:
Id Name
1 Foo
2 Foo
And two entries in the Data table:
Id DoublesId
1 1
2 2
At the end, there should be only one entry in the Doubles table:
Id Name
1 Foo
And two entries in the Data table:
Id DoublesId
1 1
2 1
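In the Doubles table there can be any number of duplicate rows per name (up to 30), as well as regular "single" rows.
Note: I have taken the liberty of renaming your Id columns to DoubleID and DataID respectively. I find that easier to work with.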
DECLARE @Doubles TABLE (DoubleID INT, Name VARCHAR(50))
DECLARE @Data TABLE (DataID INT, DoubleID INT)
INSERT INTO @Doubles VALUES (1, 'Foo')
INSERT INTO @Doubles VALUES (2, 'Foo')
INSERT INTO @Doubles VALUES (3, 'Bar')
INSERT INTO @Doubles VALUES (4, 'Bar')
INSERT INTO @Data VALUES (1, 1)
INSERT INTO @Data VALUES (2, 2)
INSERT INTO @Data VALUES (3, 3)
INSERT INTO @Data VALUES (4, 4)
SELECT * FROM @Doubles
SELECT * FROM @Data
-- Repoint every reference to the lowest DoubleID that shares the same Name
UPDATE dt
SET dt.DoubleID = dbmin.MinDoubleID
FROM @Data dt
INNER JOIN @Doubles db ON db.DoubleID = dt.DoubleID
INNER JOIN (
SELECT db.Name, MinDoubleID = MIN(db.DoubleID)
FROM @Doubles db
GROUP BY db.Name
) dbmin ON dbmin.Name = db.Name
/* Kudos to quassnoi */
;WITH q AS (
-- Order by DoubleID so that rn = 1 is the lowest id, the row the UPDATE above points to
SELECT Name, ROW_NUMBER() OVER (PARTITION BY Name ORDER BY DoubleID) AS rn
FROM @Doubles
)
DELETE
FROM q
WHERE rn > 1
SELECT * FROM @Doubles
SELECT * FROM @Data
Take a look at this. I tried it and it works well.
--create table Doubles ( Id int, Name varchar(50))
--create table Data( Id int, DoublesId int)
--select * from doubles
--select * from data
Declare @NonDuplicateID int
Declare @NonDuplicateName varchar(max)
DECLARE @sqlQuery nvarchar(max)
DECLARE DeleteDuplicate CURSOR FOR
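-- One row per name; note that this keeps the highest id, whereas the question's example keeps the lowest
SELECT Max(id) AS SingleID, name FROM Doubles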
GROUP BY [NAME]
OPEN DeleteDuplicate
FETCH NEXT FROM DeleteDuplicate INTO @NonDuplicateID, @NonDuplicateName
--Fetch next record
WHILE @@FETCH_STATUS = 0
BEGIN
--select b.ID , b.DoublesID, a.[name],a.id asdasd
--from doubles a inner join data b
--on
--a.ID=b.DoublesID
--where b.DoublesID<>@NonDuplicateID
--and a.[name]=@NonDuplicateName
print '---------------------------------------------';
select
@sqlQuery =
'update b
set b.DoublesID=' + cast(@NonDuplicateID as varchar(50)) + '
from
doubles a
inner join
data b
on
a.ID=b.DoublesID
where b.DoublesID<>' + cast(@NonDuplicateID as varchar(50)) +
' and a.[name]=''' + replace(@NonDuplicateName, '''', '''''') +'''';
print @sqlQuery
exec sp_executeSQL @sqlQuery
print '---------------------------------------------';
-- now move the cursor
FETCH NEXT FROM DeleteDuplicate INTO @NonDuplicateID ,@NonDuplicateName
END
CLOSE DeleteDuplicate --Close cursor
DEALLOCATE DeleteDuplicate --Deallocate cursor
---- Delete duplicate rows from original table
DELETE
FROM doubles
WHERE ID NOT IN
(
SELECT MAX(ID)
FROM doubles
GROUP BY [NAME]
)
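Let me know if this helps.
Thanks
~Aamod
I haven't run this, but hopefully it is correct and close enough to the final solution to get you there. Let me know about any mistakes and I will update the answer.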
--updates the data table to the min ids for each name
update Data
set DoublesId = min_ids.final_id
from
Data
join
Doubles
on Doubles.id = Data.DoublesId
join
(
select
name,
min(id) as final_id
from Doubles
group by name
) min_ids
on min_ids.name = Doubles.name
--deletes redundant ids from the Doubles table
delete
from Doubles
where id not in
(
select
min(id) as final_id
from Doubles
group by name
)
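A minimal, self-contained sketch of the same two steps on SQL Server, wrapped in a transaction so that the reference update and the delete either both happen or neither does. The temp tables #Doubles and #Data and the alias names are assumptions made for illustration; they mirror the schema from the question rather than anyone's real tables.

```sql
-- Sketch only: #Doubles and #Data are hypothetical temp tables mirroring the question's schema.
SET XACT_ABORT ON;  -- roll back the whole transaction if any statement fails

CREATE TABLE #Doubles (Id INT IDENTITY(1,1) PRIMARY KEY, Name VARCHAR(50));
CREATE TABLE #Data (Id INT IDENTITY(1,1) PRIMARY KEY, DoublesId INT);

-- Sample rows from the question (single-row INSERTs also work on SQL Server 2005)
INSERT INTO #Doubles (Name) VALUES ('Foo');
INSERT INTO #Doubles (Name) VALUES ('Foo');
INSERT INTO #Data (DoublesId) VALUES (1);
INSERT INTO #Data (DoublesId) VALUES (2);

BEGIN TRANSACTION;

-- Step 1: repoint every reference to the lowest Id that shares the same name
UPDATE dt
SET dt.DoublesId = survivor.MinId
FROM #Data dt
JOIN #Doubles db ON db.Id = dt.DoublesId
JOIN (
    SELECT Name, MIN(Id) AS MinId
    FROM #Doubles
    GROUP BY Name
) survivor ON survivor.Name = db.Name
WHERE dt.DoublesId <> survivor.MinId;

-- Step 2: delete every row that is not the lowest Id for its name
DELETE FROM #Doubles
WHERE Id NOT IN (SELECT MIN(Id) FROM #Doubles GROUP BY Name);

COMMIT TRANSACTION;

-- Sanity check: lists Data rows whose DoublesId no longer exists in Doubles
SELECT dt.Id, dt.DoublesId
FROM #Data dt
LEFT JOIN #Doubles db ON db.Id = dt.DoublesId
WHERE db.Id IS NULL;

DROP TABLE #Data;
DROP TABLE #Doubles;
```

The order matters: the references are moved first, so the delete never removes a row that Data still points to.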
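If you are using MySQL, the following worked for me. I did it in two steps:
Step 1 -> update all Data rows to reference the Doubles row with the lowest id
Step 2 -> delete all duplicates, keeping the row with the lowest id
Step 1 ->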
update Data
join
Doubles
on Data.DoublesId = Doubles.id
join
(
select name, min(id) as final_id
from Doubles
group by name
) min_ids
on min_ids.name = Doubles.name
set DoublesId = min_ids.final_id;
Step 2 ->
DELETE c1 FROM Doubles c1
INNER JOIN Doubles c2
WHERE
c1.id > c2.id AND
c1.name = c2.name;
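After both steps, a quick check can confirm that the cleanup did what was intended. This sketch assumes the Doubles table from the question and uses only standard GROUP BY and HAVING, so it should run on both MySQL and SQL Server. It is expected to return no rows once the duplicates are gone.

```sql
-- Any name listed here still has more than one row in Doubles
SELECT name, COUNT(*) AS copies
FROM Doubles
GROUP BY name
HAVING COUNT(*) > 1;
```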
I don't fully understand it myself, but this sample code seems to work on SQL Server 2008.