Query for finding duplicates

Question

I am having a table with following schema:

CUSTOMERS (id INT, name VARCHAR(10), height VARCHAR(10), weight INT)

id is the primary key. I want to find out rows in which people who are having exactly same name, same height and same weight. In other words, I want to find out duplicates with-respect-to name, height and weight.

Example table:

1, sam, 160, 100
2, ron, 167, 88
3, john, 150, 90
4, sam, 160, 100
5, rick, 158, 110
6, john, 150, 90
7, sam, 166, 110

Example Output:

Now since there are people with same name, same height and same weight:

sam (id=1), sam (id=4)

and

john (id=3), john (id=6)

I want to get these ids. It is also okay if I get only one id per match (i.e. id=1 from first match and id=3 from second match).

I am trying this query but not sure if it is correct or not.

SELECT id
FROM customers
GROUP BY name, height, weight

possible duplicate of [How do i find duplicate values in a table in Oracle?](http://stackoverflow.com/questions/59232/how-do-i-find-duplicate-values-in-a-table-in-oracle) — Ollie, Jan 11 '12 at 10:08

score 2 · Answer 1 · edited Jan 11 '12 at 09:38

2

SELECT * 
FROM customers C
INNER JOIN 
(
  SELECT name, height, weight
  FROM customers 
  GROUP BY name, height, weight 
  HAVING COUNT(*) > 1
) X ON C.name = X.name AND C.height = X.height AND C.weight = X.weight

edited Jan 11 '12 at 09:38

Oleg Dok

19,850
2
42
54

answered Jan 11 '12 at 09:34

Baatar

241
2
9

This should work. I suggest using DuplicateCustomers instead of X for your inner select table name. – Atilla Ozgur Jan 11 '12 at 10:08
Youre absolutely right my friend. I just did not pay attention on it because it is just a simple example. Teşekkürler :) – Baatar Jan 11 '12 at 10:58

score 2 · Accepted Answer · answered Jan 11 '12 at 09:36

Try this (valid for sql server):

SELECT 
    t.NAME,
    'Ids = '+
    (
        SELECT cast(Id as varchar)+',' 
        FROM Customers c 
        WHERE c.NAME = t.NAME AND c.Weight = t.Weight AND c.Height = t.Height
        FOR XML PATH('')
    )
FROM
(
    SELECT Name, height, weight
    FROM Customers
    GROUP BY Name, height, weight
    HAVING COUNT(*) > 1
) t

OR

as you asked - only one Id per match

SELECT 
    t.NAME,
    c.Id
FROM
(
    SELECT Name, height, weight
    FROM Customers
    GROUP BY Name, height, weight
    HAVING COUNT(*) > 1
) t
JOIN Customers c ON t.NAME AND c.Weight = t.Weight AND c.Height = t.Height

Florin Ghita · Answer 3 · 2012-01-11T10:06:29.547

1

you are on the right way:

SELECT min(id)
FROM customers
GROUP BY name, height, weight
HAVING COUNT(*) > 1

edited Jan 11 '12 at 10:06

answered Jan 11 '12 at 09:33

Florin Ghita

16,995
5
52
72

I don't think that this will work in this way, since you have to put the id field in the group by clause. – TheBoyan Jan 11 '12 at 09:53

score 1 · Answer 4 · answered Jan 11 '12 at 09:36

1

SELECT c.*
FROM customers c
JOIN (
    SELECT name, height, weight
    FROM 
    GROUP BY name, height, weight
    HAVING count(*) > 1
) t ON c.name = t.name and c.height = t.height and c.weight = t.weight

answered Jan 11 '12 at 09:36

Michał Powaga

20,726
7
45
60

TheBoyan · Answer 5 · 2012-01-11T10:10:18.220

I don't know what you are using since you tagged several databases.

In Sql server you won't be able to select the id without putting it in the SELECT.

so if you want to select other fields besides the ones in the group clasue you can use PARTITION BY. Something like this:

SELECT id,
ROW_NUMBER() OVER(PARTITION BY c.name, c.height, c.weight ORDER BY c.name) AS DuplicateCount
FROM customers c

This will give you the ids of the duplicates that you have with the same name, height and weight.
I'm not sure that this is faster that the other solutions though, but, you can profile it and compare.

score 0 · Answer 6 · answered Jan 11 '12 at 10:04

0

If it is okay to get only one id per match as you say, you are close to solution:

SELECT 
   min( id )
  ,name, height, weight  --<-- oncly if you need/want
FROM customers
GROUP BY name, height, weight
HAVING count(*) > 1

answered Jan 11 '12 at 10:04

dani herrera

39,746
4
87
153

Query for finding duplicates

6 Answers6