Opened 9 years ago

Closed 9 years ago

Last modified 9 years ago

#24591 closed Cleanup/optimization (fixed)

Copy ModelState.fields and ModelState.managers instead of cloning.

Reported by: Marten Kenbeek Owned by: Marten Kenbeek
Component: Migrations Version: dev
Severity: Normal Keywords:
Cc: Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Fields in aModelState are guaranteed to not be bound to a model, and are deconstructed and reconstructed in ModelState.from_model(). They are also properly deconstructed and reconstructed when rendering a model. Furthermore, the fields themselves must not change during the lifetime of a ModelState object, though you may freely add, remove and replace fields (1). They should be considered immutable within model states.

I think the current implementation of ModelState.clone() is too conservative in this regard. Both the input (in ModelState.from_model()) and the output (in ModelState.render()) do a full de- and reconstruct. The only change in between is the addition and removal of fields from the list. ModelState.clone() can do a simple list(self.fields) and still maintain the correct behaviour. The same holds for ModelState.managers. This change saves about 75% in rendering time, and 66% in memory usage in my benchmarks.

If you want to safeguard against accidental changes that can affect other model states, you can use copy.deepcopy(), though at a performance cost. Other options are to somehow shield against accidental changes by generating an immutable field class, or by saving the deconstructed field and constructing the fields on demand.

As this is the most performance-critical part of migrations -- save the actual database manipulation -- I suggest we take a careful look into this. The current implementation with deconstruct() is far from the ideal solution.

(1) https://github.com/django/django/blob/master/django/db/migrations/state.py#L294

Change History (12)

comment:1 by Marten Kenbeek, 9 years ago

Summary: Copy ModelState.fields and ModelState.managers instead of cloneCopy ModelState.fields and ModelState.managers instead of cloning.

comment:2 by Markus Holtermann, 9 years ago

Triage Stage: UnreviewedAccepted

comment:3 by Marten Kenbeek, 9 years ago

Status: newassigned

The last commit in !4460 stores the deconstructed state when initializing the wrapper. I think that's the most comprehensive protection we can offer, without a noticeable performance cost. We might issue a warning instead of raising an exception in __setattr__: it is no longer an error to change the field, it is simply useless.

comment:4 by Tim Graham, 9 years ago

Triage Stage: AcceptedReady for checkin

comment:5 by Simon Charette, 9 years ago

Patch needs improvement: set
Triage Stage: Ready for checkinAccepted

comment:6 by Markus Holtermann, 9 years ago

Needs documentation: set

Checking "Needs documentation" because the operations documentation needs a warning about not modifying fields and possible breaking change for 3rd party operations in the release notes.

comment:7 by Marten Kenbeek, 9 years ago

Needs documentation: unset
Patch needs improvement: unset

A more conservative backport for 1.8: https://github.com/django/django/pull/4533

comment:8 by Markus Holtermann, 9 years ago

Triage Stage: AcceptedReady for checkin

comment:9 by Tim Graham <timograham@…>, 9 years ago

In 51690754:

[1.8.x] Refs #24591 -- Optimized cloning of ModelState objects.

Changed ModelState.clone() to create a deepcopy of self.fields.

comment:10 by Tim Graham <timograham@…>, 9 years ago

Resolution: fixed
Status: assignedclosed

In 1a1f16d6:

Fixed #24591 -- Optimized cloning of ModelState objects.

Changed ModelState.clone() to create a shallow copy of self.fields
and self.managers.

comment:11 by Matthew Jacobi, 9 years ago

I know this has been marked fixed but how is this fixed? I followed another ticket to here in which was about the migrations being slow, well this is what I get after upgrading Django from 1.6 to 1.8 when running test (To see if everything works):

Running migrations:

Rendering model states...
DONE (1410.571s)

Then it goes on and runs each migration taking 1 - 60 per migrations. How is 20+ minutes acceptable? This is the time it takes before it even starts applying the new initial migrations.

This is an old project that started in the early django days, even Andrew Godwin himself worked on this project for a bit.

comment:12 by Tim Graham, 9 years ago

It's an ongoing project to improve the speed of migrations. Django 1.9 has more improvements, please test it if you are able. Feel free to create a ticket with details and profiling output to help us out. Unfortunately, I don't think Django 1.8 is likely to get more improvements at this point.

Note: See TracTickets for help on using tickets.
Back to Top