I have a MySQL table in which each row is a TV episode. It looks like this:
showTitle | season | episode | episodeTitle | airdate | absoluteEpisode
----------------------------------------------------------------------------------------
The X-Files 5 12 Bad Blood 1998-02-22 109
The X-Files 5 13 Patient X 1998-03-01 110
(Where absoluteEpisode
is the episode's overall number counting from episode 1.)
It is populated using a Ruby program I wrote which fetches the data from a web service. Periodically, I'd like to run the program again to fetch new episodes. The question then becomes, how do I avoid adding duplicates of the already-existing rows? None of the columns in this table are suitable for use as a primary key or unique field.
I had two ideas. The first was to create a new column, md5
, with an MD5 hash of all of those values, and make that a unique column, to prevent two rows with identical data from being added. That seems like it would work, but be messy.
My second was to use this solution from StackOverflow. But I can't quite get that to work. My SQL query is
INSERT INTO `tv`.`episodes` (showTitle,episodeTitle,season,episode,date,absoluteEpisode)
SELECT '#{show}','#{title}','#{y['airdate']}' FROM `tv`.`episodes`
WHERE NOT EXISTS (SELECT * from `tv`.`episodes`
WHERE showTitle='#{show}' AND episodeTitle='#{title}' AND season='#{season_string}' AND episode='#{y['seasonnum']}' AND date='#{y['airdate']}' AND absoluteEpisode='#{y['epnum']}'")
The #{...}
bits are Ruby variables. This gets me the obvious error You have an error in your SQL syntax
.
Flipping through the books and documentation I can find on the subject, I'm still not sure how to properly execute this query, or if it's not a smart way of solving my problem. I'd appreciate any advice!