
On each page load, which is faster: running a SELECT DISTINCT query, or deleting the duplicate records up front?
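For concreteness, the two approaches being compared might look roughly like this; the subscribers table with its id and email columns is made up purely for illustration:

    -- Option 1: de-duplicate at read time, on every page load
    SELECT DISTINCT email FROM subscribers;

    -- Option 2: remove duplicate rows once, keeping the lowest id for each email
    -- (assumes an id primary key column exists)
    DELETE s1 FROM subscribers s1
    JOIN subscribers s2
      ON s1.email = s2.email
     AND s1.id > s2.id;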

asked by user826855
  • Try it out and you will know what the answer is for your specific context. – Mat Sep 11 '11 at 12:24
  • How do I check the query time? – user826855 Sep 11 '11 at 12:25
  • Your favorite search engine (and this site's search function) should give you lots of options for that. – Mat Sep 11 '11 at 12:30
  • Use the MySQL command-line client; you can see the query time at the bottom of the result set for each query. – nobody Sep 11 '11 at 12:30
  • As you mentioned page load, are you using PHP with MySQL for the web page? Please let us know specifically. If so, you can check the query time as described in this post: http://stackoverflow.com/questions/4182843/how-can-i-measure-mysql-time-the-time-and-or-load-of-an-sql-query-in-php – Nagaraj Tantri Sep 11 '11 at 12:31
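As the comments above note, the MySQL command-line client prints the elapsed time after every statement, which is usually enough for a rough comparison. Session profiling is another option; a sketch, again using the hypothetical subscribers table (SHOW PROFILES is deprecated in recent MySQL versions but still works for a quick check):

    -- Enable per-statement profiling for this session
    SET profiling = 1;

    -- Run each candidate query
    SELECT DISTINCT email FROM subscribers;

    -- List the statements executed in this session with their durations
    SHOW PROFILES;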

1 Answer


If having duplicate records means that the data is incorrect, then neither of these options is really what you should be doing.

What you should do is determine how bad data is getting into the system in the first place and prevent it from happening. At any given point in time the database should contain only good, meaningful data. It should maintain the integrity of the data at rest on its own, without relying on the application.

Don't add workarounds to your application to handle bad data. Add validation to your application and your database to prevent bad data. In this case it sounds like you have a database table that doesn't properly define what constitutes a "unique" record. That should be fixed, for example with a unique constraint as sketched below.
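A minimal sketch of that fix, assuming the hypothetical subscribers table where email should be unique (any existing duplicates must be removed first, or the ALTER TABLE will fail):

    -- Have the database reject duplicate rows from now on
    ALTER TABLE subscribers
      ADD CONSTRAINT uq_subscribers_email UNIQUE (email);

    -- If the application should silently skip rows that would be duplicates:
    INSERT IGNORE INTO subscribers (email) VALUES ('someone@example.com');

With the constraint in place, the page-load query no longer needs DISTINCT at all, because duplicates can never be stored in the first place.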

answered by David