Opened 5 years ago

Closed 5 years ago

Last modified 8 weeks ago

#27017 closed Uncategorized (invalid)

Why doesn't Django's Model.save() save only the dirty fields by default? And how can I do that if I want?

Reported by: prajnamort Owned by: nobody
Component: Database layer (models, ORM) Version: 1.8
Severity: Normal Keywords:
Cc: Dan Tao Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

I've noticed that Model.save() will update all fields by default, which can introduce a lot of race conditions.
If it updated only the dirty fields, the situation would be much better.
How can I do that?

Change History (6)

comment:1 Changed 5 years ago by Gwildor Sok

You can manually pass update_fields to the save() method. Only the fields in that list will be updated through the query. See the docs: https://docs.djangoproject.com/en/1.9/ref/models/instances/#specifying-which-fields-to-save

comment:2 Changed 5 years ago by Tim Graham

Resolution: invalid
Status: new → closed

Please see TicketClosingReasons/UseSupportChannels for places to ask usage questions.

comment:3 Changed 2 years ago by Dan Tao

Cc: Dan Tao added

Can I make a case for re-opening this?

I understand that update_fields makes it possible to update only specific fields of a model. But it places a significant burden on calling code and introduces a maintenance cost. To explain, first consider a typical function where update_fields can be useful:

def update_thing(pk, foo):
    thing = Thing.objects.get(pk=pk)
    thing.foo = foo
    thing.save()

Code like this is incredibly common but potentially problematic, especially for sites with heavy production traffic. Processes that update different fields of the same row at the same time are prone to clobbering each other's writes, because each save() writes back every field, including stale values that the process loaded earlier and never changed. This is where update_fields is currently the best fix available:

def update_thing(pk, foo):
    thing = Thing.objects.get(pk=pk)
    thing.foo = foo
    thing.save(update_fields=['foo'])
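
To make the clobbering concrete, here is a minimal simulation that does not use Django at all. The TABLE dict, the stand-in Thing class, and interleaved_updates() are hypothetical names invented for this illustration; they only mimic the "write every field" versus "write the listed fields" behaviour described above.

```python
# In-memory stand-in for a database table: {pk: row}. This is NOT Django.
TABLE = {1: {"foo": "old-foo", "bar": "old-bar"}}


class Thing:
    """Hypothetical model-like object that mimics only save() semantics."""

    def __init__(self, pk, **fields):
        self.pk = pk
        self.__dict__.update(fields)

    @classmethod
    def get(cls, pk):
        # Load a snapshot of the row, similar in spirit to objects.get(pk=pk).
        return cls(pk, **TABLE[pk])

    def save(self, update_fields=None):
        # With no update_fields, every column is written back, stale or not.
        names = update_fields if update_fields is not None else list(TABLE[self.pk])
        for name in names:
            TABLE[self.pk][name] = getattr(self, name)


def interleaved_updates(use_update_fields):
    """Simulate two processes that load the same row before either saves."""
    TABLE[1] = {"foo": "old-foo", "bar": "old-bar"}
    a = Thing.get(1)  # process A loads the row
    b = Thing.get(1)  # process B loads the same row
    a.foo = "new-foo"
    b.bar = "new-bar"
    a.save(update_fields=["foo"] if use_update_fields else None)
    b.save(update_fields=["bar"] if use_update_fields else None)
    return dict(TABLE[1])
```

In this simulation, interleaved_updates(False) loses process A's change to foo because process B writes its stale copy back, while interleaved_updates(True) keeps both changes.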

I see two ways this could be better. First, this solution requires calling code to define the same information twice (what field(s) to update). Second, it adds a maintenance tax, as any developer who sets another field in the future has to remember to also update update_fields:

def update_thing(pk, foo, bar):
    thing = Thing.objects.get(pk=pk)
    thing.foo = foo
    thing.bar = bar
    thing.save(update_fields=['foo', 'bar'])

The above example is contrived, of course; most real-world functions are bigger and more complex than this, meaning the opportunity to make mistakes is typically greater.

In my opinion Django could make most code bases inherently more resilient against latent race conditions by implementing some form of dirty field tracking and effectively providing the functionality of update_fields automatically. I would like to propose a new setting, something like SAVE_UPDATE_DIRTY_FIELDS_ONLY, to change the ORM's default behavior so that calls to Model.save() only update the fields that have been set on the model instance. Naturally for backwards compatibility this setting would be False by default.

I admit I probably haven't thought through all of the scenarios in which this might not be desirable. But my intuition is that more often than not, this change would be a very good one. Off the top of my head, some necessary exceptions to this behavior include:

  • Calling save() on a new model instance without a PK (when inserting a record for the first time we obviously want to save all fields' default values)
  • Fields that are designed to be set automatically, e.g. DateTimeField(auto_now=True)
  • Any calls to save() where update_fields has been explicitly specified should remain untouched, I would think
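
The tracking idea and the three exceptions above can be sketched roughly as follows. This is a framework-free illustration under stated assumptions, not Django's ORM and not a proposed patch: DirtyTrackingThing, FIELDS, AUTO_FIELDS, and fields_to_save() are all hypothetical names, and a field counts as dirty as soon as it is assigned, even if the new value equals the old one.

```python
class DirtyTrackingThing:
    """Hypothetical sketch of dirty-field tracking; not a Django model."""

    FIELDS = ("foo", "bar", "modified")
    AUTO_FIELDS = ("modified",)  # stands in for auto_now-style fields

    def __init__(self, pk=None, **values):
        # Bypass __setattr__ so that loading initial values marks nothing dirty.
        object.__setattr__(self, "_dirty", set())
        object.__setattr__(self, "pk", pk)
        for name in self.FIELDS:
            object.__setattr__(self, name, values.get(name))

    def __setattr__(self, name, value):
        # Record every assignment to a tracked field.
        if name in self.FIELDS:
            self._dirty.add(name)
        object.__setattr__(self, name, value)

    def fields_to_save(self, update_fields=None):
        """Return the field names a save() call would write."""
        if update_fields is not None:
            # An explicit update_fields is respected untouched.
            return sorted(update_fields)
        if self.pk is None:
            # A new instance is inserted with all of its fields.
            return sorted(self.FIELDS)
        # An existing instance writes dirty fields plus automatic ones.
        return sorted(self._dirty | set(self.AUTO_FIELDS))
```

With this sketch, an instance created with pk=1 and then assigned a new foo would report only foo and modified, whereas an instance without a pk reports every field.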

If I'm making sense here, and there is support for re-opening this, perhaps it would make sense to update the title of this ticket to sound more like a feature request since I realize it currently reads like a usage question.

comment:4 Changed 2 years ago by Tim Graham

There's discussion in #4102 about trying to save only dirty fields. It looks like there were too many complications. If you want to try to tackle this, you should make your proposal on the DevelopersMailingList.

comment:5 Changed 16 months ago by Andreas Bergström

There is the django-dirtyfields package, which will at least tell you whether a model is dirty and which fields are dirty, but it won't do the saving...
https://github.com/romgar/django-dirtyfields

comment:6 Changed 8 weeks ago by Jack Linke

Just a note for anyone coming across Andreas' comment above. django-dirtyfields does now make it possible to update only the dirty (changed) fields: https://django-dirtyfields.readthedocs.io/en/develop/#saving-dirty-fields
