1

I have a table users with primary key column email. I have a piece of code where I store the user that simply invokes userDao.store(user); Since the constraint exists, I can catch the exception and show the error on the UI. This approach works fine. Another solution is to check first if the user exists and then store him in the database. This would result in two consecutive queries - select and then insert. So basically if the user exists I show the error. The issue I see here that if two users with the same email try to register at the same time and provide the same email. It may happen than both threads check the existence of the user and return nothing. Then the first thread saves the user and the second throws exception. The third approach is to use MERGE query (I use hsqldb). Basically in one query I insert the user only if he does not exist. Then I can see the result of the query. If no rows have changed then it means that the user exists and I can show the error. Either of these approaches would not violate the consistency of my data. But I am looking for the best practices on how to handle this kind of problem.

user1745356
  • 3,851
  • 5
  • 33
  • 61
  • 1
    **‘UPSERT’** FYI, this feature of "update if a row exists and otherwise insert the row if it does not exist" is colloquially called "upsert" (update-insert, mashed together). May help when seaching. – Basil Bourque Jun 10 '15 at 19:12

2 Answers2

2

Your first instinct was correct. To protect against duplicates, define a UNIQUE constraint on that column. Then catch any exception resulting from a violation of that constraint.

SQL lacks an atomic insert-if-not-exists command. You will see code using a nested SELECT statement, but such code is not atomic, so you would still need to trap for the UNIQUE constraint violations.

This Question is basically a duplicate. Search StackOverflow for more discussion and examples.

By the way, I would recommend against using email address as a primary key. If a user wants to change their email address on their account, you will have to update all related records using that value as a Foreign Key. I suggest using a Surrogate Key instead of a Natural Key almost always.

Basil Bourque
  • 218,480
  • 72
  • 657
  • 915
-1

The chance of that happening is so remote you really don't have to consider it. Especially if you use email validation before someone can use the system. If you still are worried you can minimize the chance by using a synchronize operation on the call that checks for the existence of the email. The only way this would not work is if you have a clustered environment with the code running on 2 or more load balanced servers.

markbernard
  • 1,341
  • 9
  • 16
  • 3
    Ignoring concurrency issues by hoping a conflict never occurs is a bad idea. – Basil Bourque Feb 23 '15 at 18:13
  • How many people have the same email address? It's not going to happen. It's not hoping. You may get an attacker trying to spam by using someone's email but verification emails make this a non-issue. – markbernard Feb 23 '15 at 20:08