SQL Server: delete duplicate rows and update the remaining row using the duplicate row's Id
Here is the scenario: my table contains duplicate rows with the same Id, Name, and so on.
1) I have to find the duplicate rows that match all the criteria (done).
2) Delete them only if the criteria match.
3) Use the Id of the deleted record to update the existing row in the table.
To do this I created two temp tables. #Temp1 is the table that holds all the records. #Temp2 holds the duplicate rows.
IF OBJECT_ID('tempdb..#Temp1') IS NOT NULL
DROP TABLE #Temp1
IF OBJECT_ID('tempdb..#Temp2') IS NOT NULL
DROP TABLE #Temp2
IF OBJECT_ID('tempdb..#Temp3') IS NOT NULL
DROP TABLE #Temp3
CREATE Table #Temp1 (
Id int,
Name NVARCHAR(64),
StudentNo INT NULL,
ClassCode NVARCHAR(8) NULL,
Section NVARCHAR(8) NULL,
)
INSERT INTO #Temp1 (Id, Name,StudentNo,ClassCode,Section) Values(1,'Joe',123,'A1', 'I')
INSERT INTO #Temp1 (Id, Name,StudentNo,ClassCode,Section) Values(1,'Joe',123,'A1', 'I')
INSERT INTO #Temp1 (Id, Name,StudentNo,ClassCode,Section) Values(2,'Harry',113,'X2', 'H')
INSERT INTO #Temp1 (Id, Name,StudentNo,ClassCode,Section) Values(2,'Harry',113,'X2', 'H')
INSERT INTO #Temp1 (Id, Name,StudentNo,ClassCode,Section) Values(3,'Elle',121,'J1', 'E1')
INSERT INTO #Temp1 (Id, Name,StudentNo,ClassCode,Section) Values(3,'Elle',121,'J1', 'E')
INSERT INTO #Temp1 (Id, Name,StudentNo,ClassCode,Section) Values(8,'Jane',191,'A1', 'E')
INSERT INTO #Temp1 (Id, Name,StudentNo,ClassCode,Section) Values(5,'Silva',811,'S1', 'SE')
INSERT INTO #Temp1 (Id, Name,StudentNo,ClassCode,Section) Values(6,'Juan',411,'S2', 'SE')
INSERT INTO #Temp1 (Id, Name,StudentNo,ClassCode,Section) Values(7,'Carla',431,'S2', 'SE')
;WITH CTE AS (
    SELECT
        ROW_NUMBER() OVER (PARTITION BY Id, StudentNo
                           ORDER BY Id, StudentNo) AS Duplicate_RowNumber
        , *
    FROM #Temp1
)
SELECT t1.Id, t1.Name, t1.StudentNo, t1.Section, t1.ClassCode
INTO #Temp2
FROM CTE AS c
INNER JOIN #Temp1 AS t1
    ON t1.Id = c.Id
   AND t1.StudentNo = c.StudentNo   -- match on both key columns
   AND c.Duplicate_RowNumber > 1
-- #Temp2 now has 6 rows: every duplicate is included
--select * from #Temp2
-- table variable used by the OUTPUT clause
DECLARE @inserted Table (Id int,
Name NVARCHAR(64),
StudentNo INT NULL,
ClassCode NVARCHAR(8) NULL,
Section NVARCHAR(8) NULL)
DELETE FROM #temp1
OUTPUT deleted.Id , deleted.Name ,deleted.StudentNo ,deleted.ClassCode ,deleted.Section into @inserted
WHERE EXISTS ( SELECT * FROM #Temp2 as t2
where #temp1.Id = t2.Id
and #temp1.Name = t2.Name
and #temp1.StudentNo = t2.StudentNo
and #temp1.ClassCode = t2.ClassCode
and #temp1.Section = t2.Section)
-- Check what was deleted so that I can join to it and update #Temp1
select * from @inserted
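As you can see below, the query should not delete the last two highlighted rows, because Section does not match. It should delete only the rows that match on every criterion between #Temp1 and #Temp2.
Scenario 2: delete the duplicate records in #Temp1 and, using the key, update Section and ClassCode to NULL. This is what I expect, with the NULL values highlighted.
You can run this query yourself; just copy and paste it.
Yes, for scenario 1 it will delete the rows, because the problem is in this section.
I added this table for reference.
I added this #Temp2 table so that it can be used later: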
CREATE Table #Temp2 (
Id int,
Name Varchar(64),
StudentNo INT NULL,
ClassCode Varchar(8) NULL,
Section Varchar(8) NULL,
)
IF OBJECT_ID('tempdb..#tmp4') IS NOT NULL
DROP TABLE #tmp4
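-- Same CTE as in the question, repeated so that this batch runs on its own
;WITH CTE AS (
    SELECT ROW_NUMBER() OVER (PARTITION BY Id, StudentNo ORDER BY Id, StudentNo) AS Duplicate_RowNumber, *
    FROM #Temp1
)
SELECT t1.Id, t1.Name, t1.StudentNo, t1.Section, t1.ClassCode,
       c.Duplicate_RowNumber
INTO #Duplicatedata
FROM CTE AS c INNER JOIN #Temp1 AS t1 ON t1.Id = c.Id
    AND t1.StudentNo = c.StudentNo
    AND c.Duplicate_RowNumber > 1
SELECT * FROM #Duplicatedata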
This returns both of Elle's rows, because #Temp1 contains both of them and your join condition is only on Id and StudentNo.
I added the row-number column for clarity:
Id  Name   StudentNo  Section  ClassCode  Duplicate_RowNumber
1   Joe    123        I        A1         2
1   Joe    123        I        A1         2
2   Harry  113        H        X2         2
2   Harry  113        H        X2         2
3   Elle   121        E1       J1         2
3   Elle   121        E        J1         2
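Because your partition is based on StudentNo and Id, the first row in each (Id, StudentNo) group gets row number 1 and every additional duplicate gets row number 2 or higher.
You can delete the duplicates this way: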
SELECT
    ROW_NUMBER() OVER (PARTITION BY Id, StudentNo
                       ORDER BY Id, StudentNo, Section) AS Duplicate_RowNumber
    , *
INTO #tmp4
FROM #Temp1
-- You can add Section to the ORDER BY as well, for consistency.
DELETE
FROM #tmp4
OUTPUT deleted.Id, deleted.Name, deleted.StudentNo, deleted.ClassCode,
       deleted.Section INTO #Temp2
WHERE Duplicate_RowNumber > 1
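After that, it looks like you want to keep one row in the final table and put the other one into the deleted table. For Elle, this removes one row from the final table and keeps only one, because your partition is not based on Section.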
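If Elle's two rows should instead both be kept, as scenario 1 in the question describes, one option is to partition on every column so that only exact copies count as duplicates. The following is a minimal sketch under that assumption, meant to be run against the original sample data in #Temp1. The names `@deletedRows`, `Numbered`, and `ExactDup_RowNumber` are hypothetical and do not appear in the original scripts.

```sql
-- Sketch only: treat rows as duplicates only when every column matches.
-- @deletedRows is a hypothetical table variable, not part of the original script.
DECLARE @deletedRows TABLE (
    Id        INT,
    Name      NVARCHAR(64),
    StudentNo INT NULL,
    ClassCode NVARCHAR(8) NULL,
    Section   NVARCHAR(8) NULL
);

;WITH Numbered AS (
    SELECT *,
           ROW_NUMBER() OVER (
               PARTITION BY Id, Name, StudentNo, ClassCode, Section
               ORDER BY Id
           ) AS ExactDup_RowNumber
    FROM #Temp1
)
-- Deleting through the CTE removes the underlying rows from #Temp1.
DELETE FROM Numbered
OUTPUT deleted.Id, deleted.Name, deleted.StudentNo, deleted.ClassCode, deleted.Section
INTO @deletedRows
WHERE ExactDup_RowNumber > 1;

-- Inspect what was removed.
SELECT * FROM @deletedRows;
```

With the sample data, this should remove one copy each of Joe and Harry and leave both of Elle's rows in place, because their Section values differ. PARTITION BY also groups NULL values together, which a join on `=` would not do for the nullable columns.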
To make sure that only one row is deleted from the final table, you can use this:
DELETE t
OUTPUT deleted.Id, deleted.Name, deleted.StudentNo, deleted.ClassCode,
       deleted.Section INTO @inserted
FROM (SELECT *, ROW_NUMBER() OVER (PARTITION BY tm.Name, tm.StudentNo
                                   ORDER BY Id, StudentNo, Section) AS rownum
      FROM #Temp1 tm) t
JOIN #Temp2 t2 ON t.Id = t2.Id
    AND t.Name = t2.Name
    AND t.StudentNo = t2.StudentNo
    AND t.ClassCode = t2.ClassCode
    AND t.Section = t2.Section
WHERE t.rownum > 1
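Notice that I added the row number so that it does not delete both rows from the final table. Joe and Harry match on every attribute, so without it both of their rows would be deleted.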
select * from @inserted
Output you get:
Id  Name   StudentNo  ClassCode  Section
3   Elle   121        J1         E1
2   Harry  113        X2         H
1   Joe    123        A1         I
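Before running a DELETE like the one above, the same derived table and join can be run as a plain SELECT to preview which rows would be removed. This is only a sketch: it assumes #Temp1 still holds the original sample data and #Temp2 holds the rows captured from #tmp4.

```sql
-- Sketch only: preview the rows the DELETE would remove, without changing any data.
SELECT t.Id, t.Name, t.StudentNo, t.ClassCode, t.Section, t.rownum
FROM (SELECT *,
             ROW_NUMBER() OVER (PARTITION BY tm.Name, tm.StudentNo
                                ORDER BY tm.Id, tm.StudentNo, tm.Section) AS rownum
      FROM #Temp1 tm) t
JOIN #Temp2 t2 ON t.Id = t2.Id
    AND t.Name = t2.Name
    AND t.StudentNo = t2.StudentNo
    AND t.ClassCode = t2.ClassCode
    AND t.Section = t2.Section
WHERE t.rownum > 1;  -- the first row of each group is kept, so it never appears here
```

Dropping the `WHERE t.rownum > 1` filter from this preview shows both copies of Joe and Harry, which is the case the row number guards against.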
Finally, you can update the final table this way (scenario 2):
UPDATE TMP
SET ClassCode = NULL, Section = NULL
FROM #Temp1 TMP
JOIN @inserted I ON TMP.Id = I.Id
    AND TMP.StudentNo = I.StudentNo
SELECT * FROM #Temp1
Final output:
Id  Name   StudentNo  ClassCode  Section
1   Joe    123        NULL       NULL
2   Harry  113        NULL       NULL
3   Elle   121        NULL       NULL
8   Jane   191        A1         E
5   Silva  811        S1         SE
6   Juan   411        S2         SE
7   Carla  431        S2         SE
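As a quick sanity check after the cleanup, the following sketch (not part of the original answer) lists any (Id, StudentNo) key that still appears more than once in #Temp1. The alias `RowsPerKey` is hypothetical.

```sql
-- Sketch only: list keys that still have more than one row in #Temp1.
SELECT Id, StudentNo, COUNT(*) AS RowsPerKey
FROM #Temp1
GROUP BY Id, StudentNo
HAVING COUNT(*) > 1;
```

With the delete and update steps applied to the sample data, this should return no rows, since each key is left with a single row.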
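Note that I added scripts and output only for the parts that needed to change; the rest is the same as the script you provided.

You have "INTO #Temp2" twice, right? Please update your query: delete from #tmp4 output deleted.id, deleted.Name, deleted.StudentNo, deleted.ClassCode, deleted.Section into @inserted where Duplicate_RowNumber > 1

@user3920526 Thanks for the information. I did not notice that when I copied your script. Please see the update.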