
ELIMINATION OF DUPLICATES

Relational Database Concepts


Mr. Vidya Sagar

M.C.M.,MCP.,O.C.A.

Let us assume there is a table with the following data:

EMPNO  ENAME  SAL    COMM  MGR   SEX  JOBID  DEPTNO  GRP
1111   XXX    10000  100   NULL  0    J1     10      G1
2222   YYY    30000  2000  1111  1    J2     20      G2
3333   ZZZ    5000   500   1111  1    J3     10      G3
3333   ZZZ    5000   500   1111  1    J3     10      G3
3333   ZZZ    5000   500   1111  1    J3     10      G3
3333   ZZZ    5000   500   1111  1    J3     10      G3
3333   ZZZ    5000   500   1111  1    J3     10      G3
4444   KKK    34000  4500  2222  0    J1     30      G3
5555   LLL    4500   340   3333  1    J2     20      G2
6666   MMM    7800   3400  4444  1    J2     20      G3
6666   MMM    7800   3400  4444  1    J2     20      G3
6666   MMM    7800   3400  4444  1    J2     20      G3
Either of the following queries returns each distinct row only once:

SELECT * FROM EMPLOY
UNION
SELECT * FROM EMPLOY

SELECT
    EMPNO, ENAME, SAL, COMM,
    MGR, SEX, JOBID, DEPTNO, GRP
FROM
    EMPLOY
GROUP BY
    EMPNO, ENAME, SAL, COMM,
    MGR, SEX, JOBID, DEPTNO, GRP

EMPNO  ENAME  SAL    COMM  MGR   SEX  JOBID  DEPTNO  GRP
1111   XXX    10000  100   NULL  0    J1     10      G1
2222   YYY    30000  2000  1111  1    J2     20      G2
3333   ZZZ    5000   500   1111  1    J3     10      G3
4444   KKK    34000  4500  2222  0    J1     30      G3
5555   LLL    4500   340   3333  1    J2     20      G2
6666   MMM    7800   3400  4444  1    J2     20      G3
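The same distinct result can usually be obtained more directly with SELECT DISTINCT. The sketch below is an alternative to the queries above, not part of the original slides; it assumes the EMPLOY table and column names used throughout this document.

```sql
-- Alternative sketch: DISTINCT collapses rows that match in every selected column.
-- Like GROUP BY, DISTINCT treats two NULLs in the same column as equal.
SELECT DISTINCT
    EMPNO, ENAME, SAL, COMM,
    MGR, SEX, JOBID, DEPTNO, GRP
FROM
    EMPLOY;
```

Listing the columns explicitly, rather than using SELECT *, keeps the meaning of "duplicate" visible in the query itself.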

SELECTING ONLY DUPLICATED ROWS :

SELECT * FROM EMPLOY
GROUP BY
    EMPNO, ENAME, SAL, COMM, MGR, SEX, JOBID,
    DEPTNO, GRP
HAVING COUNT(*) > 1
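It can also help to see how many copies of each duplicated row exist before deleting anything. The following is a minimal sketch of that idea; the column alias COPIES is a name chosen here for illustration and does not appear in the original.

```sql
-- Sketch: list each duplicated row once, together with its number of copies.
SELECT
    EMPNO, ENAME, SAL, COMM,
    MGR, SEX, JOBID, DEPTNO, GRP,
    COUNT(*) AS COPIES          -- hypothetical alias
FROM
    EMPLOY
GROUP BY
    EMPNO, ENAME, SAL, COMM,
    MGR, SEX, JOBID, DEPTNO, GRP
HAVING
    COUNT(*) > 1;
-- With the sample data: EMPNO 3333 has 5 copies and EMPNO 6666 has 3.
```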

Let us remove the duplicated rows, keeping one copy of each:

1. Create a temporary table that holds one copy of each duplicated row in the main table.

SELECT * INTO #TEMP FROM EMPLOY
GROUP BY
    EMPNO, ENAME, SAL, COMM,
    MGR, SEX, JOBID, DEPTNO, GRP
HAVING
    COUNT(*) > 1

EMPNO  ENAME  SAL    COMM  MGR   SEX  JOBID  DEPTNO  GRP
3333   ZZZ    5000   500   1111  1    J3     10      G3
6666   MMM    7800   3400  4444  1    J2     20      G3

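2. Delete from the main table every row that also appears in the temporary table, that is, every duplicated row. Note that a comparison with = never matches NULL, so a duplicated row with NULL in any column would not be deleted by this statement; the duplicated rows in the sample data contain no NULLs.

DELETE FROM EMPLOY WHERE EXISTS
( SELECT 1 FROM #TEMP T
  WHERE
    T.EMPNO  = EMPLOY.EMPNO  AND
    T.ENAME  = EMPLOY.ENAME  AND
    T.SAL    = EMPLOY.SAL    AND
    T.COMM   = EMPLOY.COMM   AND
    T.MGR    = EMPLOY.MGR    AND
    T.SEX    = EMPLOY.SEX    AND
    T.JOBID  = EMPLOY.JOBID  AND
    T.DEPTNO = EMPLOY.DEPTNO AND
    T.GRP    = EMPLOY.GRP
)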

EMPNO  ENAME  SAL    COMM  MGR   SEX  JOBID  DEPTNO  GRP
1111   XXX    10000  100   NULL  0    J1     10      G1
2222   YYY    30000  2000  1111  1    J2     20      G2
4444   KKK    34000  4500  2222  0    J1     30      G3
5555   LLL    4500   340   3333  1    J2     20      G2

3. Insert the rows held in the temporary table back into the main table.

INSERT INTO EMPLOY SELECT * FROM #TEMP

Result :
You will have only distinct rows in the main table.

EMPNO  ENAME  SAL    COMM  MGR   SEX  JOBID  DEPTNO  GRP
1111   XXX    10000  100   NULL  0    J1     10      G1
2222   YYY    30000  2000  1111  1    J2     20      G2
3333   ZZZ    5000   500   1111  1    J3     10      G3
4444   KKK    34000  4500  2222  0    J1     30      G3
5555   LLL    4500   340   3333  1    J2     20      G2
6666   MMM    7800   3400  4444  1    J2     20      G3
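The three steps above can also be collapsed into a single statement. The sketch below is an alternative, not the method shown in the slides; it assumes SQL Server 2005 or later (for ROW_NUMBER and common table expressions), and the names NUMBERED and RN are chosen here for illustration.

```sql
-- Sketch: number the copies within each group of identical rows,
-- then delete every copy after the first.
;WITH NUMBERED AS (              -- hypothetical CTE name
    SELECT
        ROW_NUMBER() OVER (
            PARTITION BY EMPNO, ENAME, SAL, COMM,
                         MGR, SEX, JOBID, DEPTNO, GRP
            ORDER BY (SELECT NULL)   -- copies are identical, so order is irrelevant
        ) AS RN
    FROM EMPLOY
)
DELETE FROM NUMBERED
WHERE RN > 1;
```

Because PARTITION BY places NULLs in the same group, this form should also handle duplicated rows that contain NULL values, and it needs no temporary table.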

Let us assume there is a table with the following data, where ID numbers each row:

ID  EMPNO  ENAME  SAL    COMM  MGR   SEX  JOBID  DEPTNO  GRP
1   1111   XXX    10000  100   NULL  0    J1     10      G1
2   2222   YYY    30000  2000  1111  1    J2     20      G2
3   3333   ZZZ    5000   500   1111  1    J3     10      G3
4   3333   ZZZ    5000   500   1111  1    J3     10      G3
5   3333   ZZZ    5000   500   1111  1    J3     10      G3
6   3333   ZZZ    5000   500   1111  1    J3     10      G3
7   3333   ZZZ    5000   500   1111  1    J3     10      G3
8   4444   KKK    34000  4500  2222  0    J1     30      G3
9   5555   LLL    4500   340   3333  1    J2     20      G2
10  6666   MMM    7800   3400  4444  1    J2     20      G3
11  6666   MMM    7800   3400  4444  1    J2     20      G3
12  6666   MMM    7800   3400  4444  1    J2     20      G3
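One way such an ID column can be used is to keep the row with the lowest ID in each group of identical rows and delete the rest. This is a sketch under the assumption that EMPLOY carries a unique, non-NULL ID column as in the table above; it is not taken from the original slides.

```sql
-- Sketch: keep the lowest ID of each group of identical rows, delete the others.
-- Assumes ID is unique and never NULL.
DELETE FROM EMPLOY
WHERE ID NOT IN (
    SELECT MIN(ID)
    FROM EMPLOY
    GROUP BY
        EMPNO, ENAME, SAL, COMM,
        MGR, SEX, JOBID, DEPTNO, GRP
);
-- With the sample data, the rows with ID 1, 2, 3, 8, 9 and 10 remain.
```

Since the ID makes every row distinguishable, no temporary table or re-insert step is needed.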

The script below sets up a demo of the elimination of duplicates.

--Creation of a table
IF EXISTS (SELECT * FROM DBO.SYSOBJECTS WHERE ID = OBJECT_ID(N'[DBO].[EMPLOY]') AND
OBJECTPROPERTY(ID, N'ISUSERTABLE') = 1)
DROP TABLE [DBO].[EMPLOY]
GO

CREATE TABLE [DBO].[EMPLOY] (
    [EMPNO]  [INT] NULL ,
    [ENAME]  [VARCHAR] (20) ,
    [SAL]    [FLOAT] NULL ,
    [COMM]   [FLOAT] NULL ,
    [MGR]    [INT] NULL ,
    [SEX]    [BIT] NULL ,
    [JOBID]  [CHAR] (2) ,
    [DEPTNO] [INT] NULL ,
    [GRP]    [CHAR] (2)
) ON [PRIMARY]
GO

-- inserting some dummy data
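The original insert statements are not included here. As a stand-in, the sketch below loads the twelve sample rows shown at the start of this document; it is a reconstruction from that table, not the author's script, and the multi-row VALUES form assumes SQL Server 2008 or later.

```sql
-- Sketch only: reproduces the sample table from this document.
INSERT INTO [DBO].[EMPLOY]
    (EMPNO, ENAME, SAL, COMM, MGR, SEX, JOBID, DEPTNO, GRP)
VALUES
    (1111, 'XXX', 10000,  100, NULL, 0, 'J1', 10, 'G1'),
    (2222, 'YYY', 30000, 2000, 1111, 1, 'J2', 20, 'G2'),
    (3333, 'ZZZ',  5000,  500, 1111, 1, 'J3', 10, 'G3'),
    (3333, 'ZZZ',  5000,  500, 1111, 1, 'J3', 10, 'G3'),
    (3333, 'ZZZ',  5000,  500, 1111, 1, 'J3', 10, 'G3'),
    (3333, 'ZZZ',  5000,  500, 1111, 1, 'J3', 10, 'G3'),
    (3333, 'ZZZ',  5000,  500, 1111, 1, 'J3', 10, 'G3'),
    (4444, 'KKK', 34000, 4500, 2222, 0, 'J1', 30, 'G3'),
    (5555, 'LLL',  4500,  340, 3333, 1, 'J2', 20, 'G2'),
    (6666, 'MMM',  7800, 3400, 4444, 1, 'J2', 20, 'G3'),
    (6666, 'MMM',  7800, 3400, 4444, 1, 'J2', 20, 'G3'),
    (6666, 'MMM',  7800, 3400, 4444, 1, 'J2', 20, 'G3')
GO
```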

Writing a query is not important.

Writing an optimized query is important.

Thank You