SQL Remove Duplicate Records Solutions

Today i face a little problem. I wanted to change one of my field into unique however there were already some duplicate records (around 900+ records) in that field. And here are some of the ways i can remove duplicate records i though that i would share it out. However, before doing any fixing on our duplicate records. How about we find out which records were duplicated? If you want to know which records is duplicated, just fire the below command,

SELECT photo
FROM photograph
GROUP BY photo
HAVING count(photo) > 1

This sort of sql command is tested in most interview question. However, it is pretty simple by using groupby and having to show which data have more than 1 records.

Duplicate another table

The easiest way is to extract all unique records into temporary table, remove your current table and rename your temporary table.


CREATE TABLE photograph_tmp AS SELECT * FROM photograph GROUP BY photo
DELETE TABLE photograph
RENAME TABLE photograph_tmp TO photograph

The above will definitely fix your duplication problem but do remember to backup your records before doing anything silly.

Similar Duplicate

Another similar way instead we backup our records first, empty our original table and insert the non duplicated records in.

CREATE TABLE photograph_backup AS SELECT * FROM photograph GROUP BY photo
TRUNCATE photograph
INSERT INTO photograph SELECT * FROM photograph_backup  GROUP BY productid

This way, you don't have to worry about backing up or destorying your precious data.

Delete duplicate records

Another method is to delete all the duplicated data in the table without creating another table.

DELETE FROM photograph
   WHERE photo IN
   (SELECT photo
       FROM photograph
	   GROUP BY photo HAVING count(photo) > 1);

bascailly we search all our duplicated data and delete them from the real table. This is quite risky so don't be lazy and try out the safer way above.

Hope it helps 🙂