Delete duplicate rows in T-SQL

by Joe Hunter 6. October 2010 18:45

Very elegant solutions for deleting duplicate rows from a table:

Here's how I solve this problem. If the table does not have an identity column, add one, so all the rows are numbered. Here's an example script:

Create table ##Test (a int not null, b int not null, c int not null, id int not null identity) on [Primary]
GO
INSERT INTO ##Test (A,B,C) VALUES (1,1,1)
INSERT INTO ##Test (A,B,C) VALUES (1,1,1)
INSERT INTO ##Test (A,B,C) VALUES (1,1,1)

INSERT INTO ##Test (A,B,C) VALUES (1,2,3)
INSERT INTO ##Test (A,B,C) VALUES (1,2,3)
INSERT INTO ##Test (A,B,C) VALUES (1,2,3)

INSERT INTO ##Test (A,B,C) VALUES (4,5,6)
GO
Select * from ##Test
GO
Delete from ##Test where id <
(Select Max(id) from ##Test t where ##Test.a = t.a and
##Test.b = t.b and
##Test.c = t.c)
GO
Select * from ##Test
GO

This solution is extremely fast...

        SET @SQL = 'Delete from T1 '
                SET @SQL = @SQL + 'from ( '
                SET @SQL = @SQL + 'select PATIENTID, ROW_NUMBER() over ( '
                SET @SQL = @SQL + 'partition by PATIENTID '
                SET @SQL = @SQL + 'order by PATIENTID ) RowNumber '
                SET @SQL = @SQL + 'from '+@tempTbl+' '
                SET @SQL = @SQL + ') T1 '
                SET @SQL = @SQL + 'where ( t1.rownumber > 1 ) '
        PRINT @SQL
        EXEC sp_executesql @SQL

Reference: http://www.sqlservercentral.com/articles/T-SQL/70807/

sql

Powered by BlogEngine.NET 1.5.0.7
Theme by Mads Kristensen