Opened 9 years ago

Closed 9 years ago

Last modified 9 years ago

#25363 closed Bug (needsinfo)

Model fields pre_save not called if force_insert=True

Reported by: David Fischer Owned by: David Fischer
Component: Database layer (models, ORM) Version: 1.8
Severity: Normal Keywords: pre_save, save
Cc: Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Hello,

We use UUIDs as primary keys.
Because a collision is unlikely to happen, we set the default value of the model's PK field to uuid.uuid4.

But then, we had some issues with Django trying to UPDATE and then INSERT (OK we can understand the logic, but its boring). So we created a small mixin that set the value of force_insert to self._state.adding.

Unfortunately we discovered that the pre_save method of the fields aren't called if insert is forced: https://github.com/django/django/blob/master/django/db/models/base.py#L786.

That's an undocumented and unexpected feature. The documentation specify that the Field's pre_save methods are called on save, and its a save.

Expected: Field.pre_save called unconditionally.
Current: Field.pre_save called at some conditions.
Workaround: Connect to model pre_save signal ...

Change History (9)

comment:1 by Tim Graham, 9 years ago

I'm not sure if there might be any problems in changing this behavior. Would you be able to investigate and offer a patch?

comment:2 by David Fischer, 9 years ago

Owner: changed from nobody to David Fischer
Status: newassigned

Yes I will.

comment:3 by David Fischer, 9 years ago

Has patch: set

I just wrote a test, and discovered the pre_save method is always called, but in two different places depending if its an INSERT or an UPDATE. INSERT: in SqlInsertCompiler and UPDATE: in Model._save_table.

https://github.com/django/django/blob/stable/1.8.x/django/db/models/sql/compiler.py#L927
https://github.com/django/django/blob/stable/1.8.x/django/db/models/base.py#L823

Please, can anyone explain me why?

Also, unfortunately the fact that pre_save is called lately during the save defeat the implementation of a clean mix-in to track updated fields and make UPDATE queries lighter. But that's another story.

comment:4 by David Fischer, 9 years ago

Has patch: unset

comment:5 by Tim Graham, 9 years ago

Those types of "why" questions are often easiest answered by trying to make a change you think will solve your problem and then seeing if any tests break. Looking through git blame can also be helpful.

comment:6 by Tim Graham, 9 years ago

Resolution: needsinfo
Status: assignedclosed

Feel free to reopen if you have a suggestion on what changes we could make. I'm not sure how to proceed with this ticket otherwise. Thanks!

comment:7 by Anssi Kääriäinen, 9 years ago

A clean mix-in to track updated fields to both make UPDATE queries lighter, and to also skip update completely when no data has changed is a big issue for me. So, if we can make that easier for users, I think we should do that.

BTW here is my approach for this issue:

class ModTrackingMixin(object):
    @classmethod
    def from_db(cls, *args, **kwargs):
        new = super().from_db(*args, **kwargs)
        new._state.db_vals = new.__dict__.copy()
        return new

    def save(self, update_fields=None):
        if hasattr(self._state, 'db_vals') and update_fields is None:
            changed = []
            orig = self._state.db_vals
            for f, fname in [(f, f.attname) for f in self._meta.get_fields()
                                       if f.concrete]:
                if (fname in orig and
                         f.to_python(getattr(self, fname)) != orig.get(fname, NO_VALUE)):
                    changed.append(fname)
            return super().save(update_fields=changed)
        else:
            return super().save(update_fields=update_fields)


class Organization(models.Model):
    code = models.CharField(max_length=20)
    name= models.CharField(max_length=100)

    class Meta:
        db_table = 'organization'


class ModTrackingOrganization(ModTrackingMixin, Organization):
    class Meta:
        proxy = True

The idea is that if you do:

o = ModTrackingOrganization.objects.first()
o.name = o.name
o.save()

Then the save will actually be silently skipped (as no data has changed in the DB).

I wonder if this is actually broken for some cases by the pre_save logic we have currently in Django?

comment:8 by David Fischer, 9 years ago

Hi akaariai,

Your mix-in is interesting but it has some issues, hopefully easy to fix.

  1. This is actually broken by the pre_save logic for a simple reason : The pre_save methods are called inside the save, a.k.a inside the super().save(...) so your mix-in and my mix-in AutoUpdateFieldsMixin will not detect those. Its a reason why I implemented CallFieldsPreSaveMixin.

I also implemented another mix-in (sometimes I feel like a DJ, doing many mixes),AutoRemovePKFromUpdateFieldsMixin that will check if the primary key is set to a new value, then the intention of the developer is probably to duplicate the model by saving it with a new primary key. So the mix-in let Django save with its own default options for save.

  1. __dict__.copy() is not sufficient, if one of the fields, is, for example, a JSONField then the content is a mutable type (dict). This container may contains nested mutable types, and you have to deepcopy to ensure your copy will reflect the unchanged data from the database and will not be modified.
  1. your overload of save() isn't generic enough to support all possible arguments.

Interesting you saved the data in _state.
That's cleaner than my mix-in.

I am thinking loudly, but is it more performant that any field set by the code should be updated regardless the value, something that will not happen often, or is it better to check for diff. Checking for diff will also double the memory usage for the data (is it any copy-on-write mix-in out here?).

We also need to think about concurrent updates when designing a save-the-changes mix-in. For that reason I implemented another mix-in UpdatePreconditionsMixin. I don't know what is the best option, but maybe its more explicit and intuitive that if the program sets the fields, that they will be saved regardless the value.

Yes, many questions and thoughts.
I am eager to heard about you guys.

comment:9 by David Fischer, 9 years ago

Interesting: https://docs.python.org/3.4/library/collections.html#collections.ChainMap.
That can be our copy-on-write "kind of" data structure!

Note: See TracTickets for help on using tickets.
Back to Top