Opened 7 years ago

Closed 7 years ago

Last modified 7 years ago

#28704 closed Bug (needsinfo)

update_or_create() calls select_for_update(), which locks database row

Reported by: Rafal Radulski Owned by: nobody
Component: Database layer (models, ORM) Version: 1.11
Severity: Normal Keywords:
Cc: Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

An issue arises when update_or_create() is executed during a long-running transaction. It's caused by update_or_create() calling select_for_update(), which locks a row until the end of the entire transaction.

def update_or_create(self, defaults=None, **kwargs):
    defaults = defaults or {}
    lookup, params = self._extract_model_params(defaults, **kwargs)
    self._for_write = True
    with transaction.atomic(using=self.db):
        try:
            obj = self.select_for_update().get(**lookup)
        except self.model.DoesNotExist:
            obj, created = self._create_object_from_params(lookup, params)
            if created:
                return obj, created
        for k, v in defaults.items():
            setattr(obj, k, v() if callable(v) else v)
        obj.save(using=self.db)
    return obj, False

Let's say that we are using a PostgreSQL database with "read committed" isolation level. There are two processes. First process starts a transaction. It updates and retrieves an existing model using update_or_create(). select_for_update() is called, which locks the row until the end of the transaction. Then, second process starts. It retrieves the same row using get() method. At this point, database query in second process is blocked until the end of transaction in first process.

I believe this is a bug. update_or_create() should not call select_for_update(). Doing so can create a long-lived database lock, even when database transaction isolation level is relaxed.

I don't have a possible solution. I believe that QuerySet.update_or_create() should call QuerySet.update(). Unfortunately, QuerySet.update() does not support generic relationships. If QuerySet.update() did support generic relationships, a solution could work as follows:

def update_or_create(self, defaults=None, **kwargs):
    defaults = defaults or {}
    lookup, params = self._extract_model_params(defaults, **kwargs)
    self._for_write = True
    with transaction.atomic(using=self.db):
        try:
            obj = self.only('pk').get(**lookup)
        except self.model.DoesNotExist:
            obj, created = self._create_object_from_params(lookup, params)
            if created:
                return obj, created

        update_params = {k: v() if callable(v) else v for k, v in defaults.items()}
        self.filter(pk=obj.pk).update(**update_params)
        obj = self.get(pk=obj.pk)
    return obj, False

Related ticket:
https://code.djangoproject.com/ticket/26804

Change History (3)

comment:1 by Tomer Chachamu, 7 years ago

I'm assuming you're passing something in defaults, right?

update_or_create currently calls Model.save and you are suggesting it should call QuerySet.update instead. Both of them send UPDATE SQL statements to the database. That will create a row-level lock. In Postgres, the lock level is either FOR UPDATE or FOR NO KEY UPDATE, see https://www.postgresql.org/docs/9.6/static/explicit-locking.html#LOCKING-ROWS.

But, neither of these locks will block an ordinary SELECT statement like the .get() in the second transaction.

comment:2 by Tomer Chachamu, 7 years ago

Resolution: needsinfo
Status: newclosed

comment:3 by Rafal Radulski, 7 years ago

I thought that a call to select_for_update was problematic, but I was wrong. Thank you for the explanation.

Note: See TracTickets for help on using tickets.
Back to Top