#28704 closed Bug (needsinfo)
update_or_create() calls select_for_update(), which locks database row
| Reported by: | Rafal Radulski | Owned by: | nobody |
|---|---|---|---|
| Component: | Database layer (models, ORM) | Version: | 1.11 |
| Severity: | Normal | Keywords: | |
| Cc: | Triage Stage: | Unreviewed | |
| Has patch: | no | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description
An issue arises when update_or_create() is executed during a long-running transaction. It's caused by update_or_create() calling select_for_update(), which locks a row until the end of the entire transaction.
def update_or_create(self, defaults=None, **kwargs):
defaults = defaults or {}
lookup, params = self._extract_model_params(defaults, **kwargs)
self._for_write = True
with transaction.atomic(using=self.db):
try:
obj = self.select_for_update().get(**lookup)
except self.model.DoesNotExist:
obj, created = self._create_object_from_params(lookup, params)
if created:
return obj, created
for k, v in defaults.items():
setattr(obj, k, v() if callable(v) else v)
obj.save(using=self.db)
return obj, False
Let's say that we are using a PostgreSQL database with "read committed" isolation level. There are two processes. First process starts a transaction. It updates and retrieves an existing model using update_or_create(). select_for_update() is called, which locks the row until the end of the transaction. Then, second process starts. It retrieves the same row using get() method. At this point, database query in second process is blocked until the end of transaction in first process.
I believe this is a bug. update_or_create() should not call select_for_update(). Doing so can create a long-lived database lock, even when database transaction isolation level is relaxed.
I don't have a possible solution. I believe that QuerySet.update_or_create() should call QuerySet.update(). Unfortunately, QuerySet.update() does not support generic relationships. If QuerySet.update() did support generic relationships, a solution could work as follows:
def update_or_create(self, defaults=None, **kwargs):
defaults = defaults or {}
lookup, params = self._extract_model_params(defaults, **kwargs)
self._for_write = True
with transaction.atomic(using=self.db):
try:
obj = self.only('pk').get(**lookup)
except self.model.DoesNotExist:
obj, created = self._create_object_from_params(lookup, params)
if created:
return obj, created
update_params = {k: v() if callable(v) else v for k, v in defaults.items()}
self.filter(pk=obj.pk).update(**update_params)
obj = self.get(pk=obj.pk)
return obj, False
Related ticket:
https://code.djangoproject.com/ticket/26804
Change History (3)
comment:1 by , 8 years ago
comment:2 by , 8 years ago
| Resolution: | → needsinfo |
|---|---|
| Status: | new → closed |
comment:3 by , 8 years ago
I thought that a call to select_for_update was problematic, but I was wrong. Thank you for the explanation.
I'm assuming you're passing something in
defaults, right?update_or_createcurrently callsModel.saveand you are suggesting it should callQuerySet.updateinstead. Both of them sendUPDATESQL statements to the database. That will create a row-level lock. In Postgres, the lock level is eitherFOR UPDATEorFOR NO KEY UPDATE, see https://www.postgresql.org/docs/9.6/static/explicit-locking.html#LOCKING-ROWS.But, neither of these locks will block an ordinary
SELECTstatement like the.get()in the second transaction.