82

If there a way to protect against concurrent modifications of the same data base entry by two or more users?

It would be acceptable to show an error message to the user performing the second commit/save operation, but data should not be silently overwritten.

I think locking the entry is not an option, as a user might use the "Back" button or simply close his browser, leaving the lock for ever.

Anton Strogonoff
  • 27,755
  • 7
  • 50
  • 60
Ber
  • 34,859
  • 15
  • 60
  • 79
  • 4
    If one object can be updated by multiple, concurrent users, you may have a bigger design issue. It might be worth thinking about user-specific resources or separating processing steps into separate tables to prevent this from being an issue. – S.Lott Nov 26 '08 at 17:03

10 Answers10

49

This is how I do optimistic locking in Django:

updated = Entry.objects.filter(Q(id=e.id) && Q(version=e.version))\
          .update(updated_field=new_value, version=e.version+1)
if not updated:
    raise ConcurrentModificationException()

The code listed above can be implemented as a method in Custom Manager.

I am making the following assumptions:

  • filter().update() will result in a single database query because filter is lazy
  • a database query is atomic

These assumptions are enough to ensure that no one else has updated the entry before. If multiple rows are updated this way you should use transactions.

WARNING Django Doc:

Be aware that the update() method is converted directly to an SQL statement. It is a bulk operation for direct updates. It doesn't run any save() methods on your models, or emit the pre_save or post_save signals

Sam
  • 1,501
  • 3
  • 19
  • 27
Andrei Savu
  • 7,999
  • 7
  • 42
  • 53
  • 12
    Nice! Shouldn't that be '&' instead of '&&', though? – Giles Thomas Oct 11 '10 at 17:05
  • 1
    Could you sidestep the issue of 'update' not running save() methods by putting the call to 'update' inside your own overridden save() method? – Jonathan Hartley Jun 22 '11 at 15:14
  • 1
    What happens when two threads concurrently call `filter`, both receive an identical list with unmodified `e`, and then both concurrently call `update`? I see no semaphore that blocks filter and update simultaneously. EDIT: oh I understand lazy filter now. But what is the validity of assuming update() is atomic? surely the DB handles concurrent access – totowtwo Jun 24 '11 at 16:13
  • 1
    @totowtwo The I in ACID guarantees ordering (http://en.wikipedia.org/wiki/ACID). If an UPDATE is executing on data relating to a concurrent (but later started) SELECT it'll block until UPDATE is done. However multiple SELECT can be executed at the same time. – Kit Sunde Apr 21 '13 at 11:09
  • 1
    Looks like this will work properly only with autocommit mode (which is default). Otherwise final COMMIT will go separated from this updating SQL statement, so concurrent code can run between them. And we have ReadCommited isolation level in Django, so it will read old version. (Why I want manual transaction here - because I want to create a row in another table along with this update.) Great idea, though. – Alex Lokk Jul 29 '13 at 22:15
  • It's also possible to use a hybrid approach combining optimistic & pessimistic concurrency, which gets the best of both worlds by doing logic in a "read phase", checking the db rows are unchanged, then obtaining a lock only for the "write phase". More details here: https://github.com/pirate/django-concurrency-talk – Nick Sweeting Oct 08 '18 at 21:27
40

This question is a bit old and my answer a bit late, but after what I understand this has been fixed in Django 1.4 using:

select_for_update(nowait=True)

see the docs

Returns a queryset that will lock rows until the end of the transaction, generating a SELECT ... FOR UPDATE SQL statement on supported databases.

Usually, if another transaction has already acquired a lock on one of the selected rows, the query will block until the lock is released. If this is not the behavior you want, call select_for_update(nowait=True). This will make the call non-blocking. If a conflicting lock is already acquired by another transaction, DatabaseError will be raised when the queryset is evaluated.

Of course this will only work if the back-end support the "select for update" feature, which for example sqlite doesn't. Unfortunately: nowait=True is not supported by MySql, there you have to use: nowait=False, which will only block until the lock is released.

giZm0
  • 1,441
  • 20
  • 35
  • 2
    This isn't a great answer - the question explicitly didn't want (pessimistic) locking, and the two higher-voted answers currently focus on optimistic concurrency control ("optimistic locking") for that reason. Select-for-update is fine in other situations though. – RichVel Mar 15 '14 at 11:10
  • @giZm0 That still makes it pessimistic locking. The first thread that obtains the lock can hold it indefinitely. – knaperek Jun 24 '14 at 10:51
  • 6
    I like this answer because is of the documentation of Django and not a beautiful invention of any third party. – anizzomc May 22 '15 at 13:43
30

Actually, transactions don't help you much here ... unless you want to have transactions running over multiple HTTP requests (which you most probably don't want).

What we usually use in those cases is "Optimistic Locking". The Django ORM doesn't support that as far as I know. But there has been some discussion about adding this feature.

So you are on your own. Basically, what you should do is add a "version" field to your model and pass it to the user as a hidden field. The normal cycle for an update is :

  1. read the data and show it to the user
  2. user modify data
  3. user post the data
  4. the app saves it back in the database.

To implement optimistic locking, when you save the data, you check if the version that you got back from the user is the same as the one in the database, and then update the database and increment the version. If they are not, it means that there has been a change since the data was loaded.

You can do that with a single SQL call with something like :

UPDATE ... WHERE version = 'version_from_user';

This call will update the database only if the version is still the same.

S.Lott
  • 359,791
  • 75
  • 487
  • 757
Guillaume
  • 17,054
  • 8
  • 51
  • 73
  • 1
    This same question also appeared on Slashdot. The Optimistic Locking you suggest was also proposed there, but explained a bit better imho: http://hardware.slashdot.org/comments.pl?sid=1381511&cid=29536367 – hopla Sep 29 '09 at 09:35
  • 5
    Also note that you do want to use transactions on top of this, to avoid this situation: http://hardware.slashdot.org/comments.pl?sid=1381511&cid=29536613 Django provides middleware to automatically wrap every action on the database in a transaction, starting from the initial request and only committing after a succesfull response: http://docs.djangoproject.com/en/dev/topics/db/transactions/ (mind you: the transaction middleware only helps to avoid the above problem with optimistic locking, it does not provide locking by itself) – hopla Sep 29 '09 at 12:12
  • I'm also looking for details on how to do this. No luck so far. – seanyboy Dec 09 '09 at 15:48
  • 1
    you can do this by using django bulk updates. check my answer. – Andrei Savu Jan 30 '10 at 00:43
14

Django 1.11 has three convenient options to handle this situation depending on your business logic requirements:

  • Something.objects.select_for_update() will block until the model become free
  • Something.objects.select_for_update(nowait=True) and catch DatabaseError if the model is currently locked for update
  • Something.objects.select_for_update(skip_locked=True) will not return the objects that are currently locked

In my application, which has both interactive and batch workflows on various models, I found these three options to solve most of my concurrent processing scenarios.

The "waiting" select_for_update is very convenient in sequential batch processes - I want them all to execute, but let them take their time. The nowait is used when an user wants to modify an object that is currently locked for update - I will just tell them it's being modified at this moment.

The skip_locked is useful for another type of update, when users can trigger a rescan of an object - and I don't care who triggers it, as long as it's triggered, so skip_locked allows me to silently skip the duplicated triggers.

kravietz
  • 8,571
  • 1
  • 30
  • 24
  • 1
    Do I need to wrap the select for update in with transaction.atomic()? If im actually using the results for an update? Wont it lock the entire table making the select_for_update a noop? – Paul Kenjora Jul 15 '19 at 23:06
3

For future reference, check out https://github.com/RobCombs/django-locking. It does locking in a way that doesn't leave everlasting locks, by a mixture of javascript unlocking when the user leaves the page, and lock timeouts (e.g. in case the user's browser crashes). The documentation is pretty complete.

Jeremy
  • 1
  • 77
  • 324
  • 346
1

You should probably use the django transaction middleware at least, even regardless of this problem.

As to your actual problem of having multiple users editing the same data... yes, use locking. OR:

Check what version a user is updating against (do this securely, so users can't simply hack the system to say they were updating the latest copy!), and only update if that version is current. Otherwise, send the user back a new page with the original version they were editing, their submitted version, and the new version(s) written by others. Ask them to merge the changes into one, completely up-to-date version. You might try to auto-merge these using a toolset like diff+patch, but you'll need to have the manual merge method working for failure cases anyway, so start with that. Also, you'll need to preserve version history, and allow admins to revert changes, in case someone unintentionally or intentionally messes up the merge. But you should probably have that anyway.

There's very likely a django app/library that does most of this for you.

Lee B
  • 2,097
  • 11
  • 16
  • This is also Optimistic Locking, like Guillaume proposed. But he seemed to get all the points :) – hopla Sep 29 '09 at 09:37
0

Another thing to look for is the word "atomic". An atomic operation means that your database change will either happen successfully, or fail obviously. A quick search shows this question asking about atomic operations in Django.

Community
  • 1
  • 1
Harley Holcombe
  • 155,163
  • 15
  • 67
  • 62
  • I don't want to perform a transaction or locking across multiple requests, as this can take any length of time (and may never finish at all) – Ber Nov 26 '08 at 10:57
  • If a transaction starts, it has to finish. You should only lock the record (or start the transaction, or whatever you decide to do) after the user clicks "submit", not when they open the record for viewing. – Harley Holcombe Nov 26 '08 at 11:01
  • Yes, but my problem is different, in that two users open the same form and then they both commit their changes. I don't think locking is the solution for this. – Ber Nov 26 '08 at 16:24
  • You're right, but the problem is there _is_ no solution for this. One user gets to win, the other gets a fail message. The later you lock the record the less problems you'll have. – Harley Holcombe Nov 26 '08 at 21:56
  • I agree. I totally accept the fail message for the other user. I am looking for a good way to detect this case (which I expect to be very rare). – Ber Nov 26 '08 at 22:58
0

The idea above

updated = Entry.objects.filter(Q(id=e.id) && Q(version=e.version))\
      .update(updated_field=new_value, version=e.version+1)
if not updated:
      raise ConcurrentModificationException()

looks great and should work fine even without serializable transactions.

The problem is how to augment the deafult .save() behavior as to not have to do manual plumbing to call the .update() method.

I looked at the Custom Manager idea.

My plan is to override the Manager _update method that is called by Model.save_base() to perform the update.

This is the current code in Django 1.3

def _update(self, values, **kwargs):
   return self.get_query_set()._update(values, **kwargs)

What needs to be done IMHO is something like:

def _update(self, values, **kwargs):
   #TODO Get version field value
   v = self.get_version_field_value(values[0])
   return self.get_query_set().filter(Q(version=v))._update(values, **kwargs)

Similar thing needs to happen on delete. However delete is a bit more difficult as Django is implementing quite some voodoo in this area through django.db.models.deletion.Collector.

It is weird that modren tool like Django lacks guidance for Optimictic Concurency Control.

I will update this post when I solve the riddle. Hopefully solution will be in a nice pythonic way that does not involve tons of coding, weird views, skipping essential pieces of Django etc.

Kiril
  • 828
  • 7
  • 18
-2

To be safe the database needs to support transactions.

If the fields is "free-form" e.g. text etc. and you need to allow several users to be able to edit the same fields (you can't have single user ownership to the data), you could store the original data in a variable. When the user committs, check if the input data has changed from the original data (if not, you don't need to bother the DB by rewriting old data), if the original data compared to the current data in the db is the same you can save, if it has changed you can show the user the difference and ask the user what to do.

If the fields is numbers e.g. account balance, number of items in a store etc., you can handle it more automatically if you calculate the difference between the original value (stored when the user started filling out the form) and the new value you can start a transaction read the current value and add the difference, then end transaction. If you can't have negative values, you should abort the transaction if the result is negative, and tell the user.

I don't know django, so I can't give you teh cod3s.. ;)

Stein G. Strindhaug
  • 5,036
  • 2
  • 26
  • 40
-6

From here:
How to prevent overwriting an object someone else has modified

I'm assuming that the timestamp will be held as a hidden field in the form you're trying to save the details of.

def save(self):
    if(self.id):
        foo = Foo.objects.get(pk=self.id)
        if(foo.timestamp > self.timestamp):
            raise Exception, "trying to save outdated Foo" 
    super(Foo, self).save()
Community
  • 1
  • 1
seanyboy
  • 5,465
  • 7
  • 40
  • 56
  • 1
    the code is broken. a race condition can still occur between the if check and save query. you need to use objects.filter(id=.. & timestamp check).update(...) and raise an exception if no row was updated. – Andrei Savu Jan 29 '10 at 23:54