Today i face a little problem. I wanted to change one of my field into unique however there were already some duplicate records (around 900+ records) in that field. And here are some of the ways i can remove duplicate records i though that i would share it out. However, before doing any fixing on our duplicate records. How about we find out which records were duplicated? If you want to know which records is duplicated, just fire the below command,
SELECT photo FROM photograph GROUP BY photo HAVING count(photo) > 1
This sort of sql command is tested in most interview question. However, it is pretty simple by using groupby and having to show which data have more than 1 records.
Duplicate another table
The easiest way is to extract all unique records into temporary table, remove your current table and rename your temporary table.
CREATE TABLE photograph_tmp AS SELECT * FROM photograph GROUP BY photo DELETE TABLE photograph RENAME TABLE photograph_tmp TO photograph
The above will definitely fix your duplication problem but do remember to backup your records before doing anything silly.
Similar Duplicate
Another similar way instead we backup our records first, empty our original table and insert the non duplicated records in.
CREATE TABLE photograph_backup AS SELECT * FROM photograph GROUP BY photo TRUNCATE photograph INSERT INTO photograph SELECT * FROM photograph_backup GROUP BY productid
This way, you don't have to worry about backing up or destorying your precious data.
Delete duplicate records
Another method is to delete all the duplicated data in the table without creating another table.
DELETE FROM photograph WHERE photo IN (SELECT photo FROM photograph GROUP BY photo HAVING count(photo) > 1);
bascailly we search all our duplicated data and delete them from the real table. This is quite risky so don't be lazy and try out the safer way above.
Hope it helps 🙂