﻿id	summary	reporter	owner	description	type	status	component	version	severity	resolution	keywords	cc	stage	has_patch	needs_docs	needs_tests	needs_better_patch	easy	ui_ux
26804	update_or_create() updates all columns, which could cause a race condition	Jensen Cochran	Jensen Cochran	"
The update_or_create() helper function in Django is passed a 'defaults' kwarg, which contains the fields that we expect the function to update.  When .save() is called in the function to do the update, it does not pass the update_fields parameter to .save(), and so the SQL generated to complete the update sets every column (except pk), even those not passed in 'defaults'.

This creates a potential race condition.  In the period of time between when the function calls .get() to retrieve the object and when it calls .save(), some other process may have updated a column in that row that has nothing to do with this update.  When save() is called, these changes will be overwritten.  This is not an issue for the fields passed into 'defaults' since we expected those to be updated anyway.  The main issue is that update_or_create updates fields/columns other than those passed in 'defaults', and does so in a way that could cause data loss.

Example:

{{{

class Authors(models.Model):
    id = models.AutoField(primary_key=True)
    name = models.CharField(max_length=128, unique=True)
    newest_release = models.DateTimeField()
    active = models.BooleanField(default=True)


Authors.objects.update_or_create(
    name='Bob',
    defaults={
        'newest_release': timezone.now()
    })


# Django update_or_create code from https://github.com/django/django/blob/master/django/db/models/query.py

    def update_or_create(self, defaults=None, **kwargs):
        """"""
        Looks up an object with the given kwargs, updating one with defaults
        if it exists, otherwise creates a new one.
        Returns a tuple (object, created), where created is a boolean
        specifying whether an object was created.
        """"""
        defaults = defaults or {}
        lookup, params = self._extract_model_params(defaults, **kwargs)
        self._for_write = True
        try:
            obj = self.get(**lookup)
        except self.model.DoesNotExist:
            obj, created = self._create_object_from_params(lookup, params)
            if created:
                return obj, created
        for k, v in six.iteritems(defaults):
            setattr(obj, k, v)
        obj.save(using=self.db)
        return obj, False

}}}

Assume that the author Bob already exists in the database.  
update_or_create retrieves the Author object using .get().

Meanwhile, another process runs the following SQL:
{{{ UPDATE authors SET active = FALSE WHERE id = 1234; }}}

Then, update_or_create sets the values from 'defaults' on the object, then calls .save() on the object. Generated SQL would look something like 
{{{ UPDATE authors SET name='Bob', active=TRUE, newest_release='#######'::timestamp, WHERE id = 1234"" }}}

Notice that both 'name' and 'active' are set in the SQL, even though they were not passed in defaults.  Bob has been wrongly made active again, even though the update portion of the update_or_create() call had nothing to do with the 'active' field.  Besides the data issues, it is also not efficient to have the database update every field/column when there is no need.

I have tested this behavior by putting a time.sleep() into update_or_create right after the .get() and updating the database during this sleep.  
Once the sleep finished, my changes were overwritten by .save()

My initial thought to patch this was to change
{{{ obj.save(using=self.db) }}}
in update_or_create into
{{{ obj.save(using=self.db, update_fields=defaults.keys()) }}}

However, this causes a single test to fail in generic_relations, since update_fields in .save() does not accept GenericForeignKey fields, which may be bug in and of itself.  There are other ways to fix this like locking the row, but that would be less efficient and not fix the underlying issue.

{{{
ERROR: test_update_or_create_defaults (generic_relations.tests.GenericRelationsTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File ""/usr/lib64/python2.7/unittest/case.py"", line 365, in run
    testMethod()
  File ""/home/jensen/sandbox/django/tests/generic_relations/tests.py"", line 540, in test_update_or_create_defaults
    tag, created = TaggedItem.objects.update_or_create(tag=""shiny"", defaults={'content_object': diamond})
  File ""/home/jensen/sandbox/django/django/db/models/manager.py"", line 85, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File ""/home/jensen/sandbox/django/django/db/models/query.py"", line 493, in update_or_create
    obj.save(using=self.db, update_fields=defaults.keys())
  File ""/home/jensen/sandbox/django/django/db/models/base.py"", line 782, in save
    % ', '.join(non_model_fields))
ValueError: The following fields do not exist in this model or are m2m fields: content_object
}}}


"	Bug	closed	Database layer (models, ORM)	dev	Normal	fixed	update_or_create		Accepted	1	0	0	1	0	0
