Opened 9 years ago

Closed 9 years ago

Last modified 9 years ago

#24554 closed Cleanup/optimization (fixed)

Optimize migration's state rendering

Reported by: Ryan Hall Owned by: Ryan Hall
Component: Migrations Version: dev
Severity: Normal Keywords:
Cc: Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by Ryan Hall)

We're currently running django 1.7.2 and been having performance issues with migrate/makemigrations for the last few months now, but until recently it hasn't caused us too much trouble. Recently, migrate/makemigrations calls have reached a point where they take up to 45minutes to run. Unfortunately, this is due to the fact that we have around 700 models defined with almost 850 migrations since the initial project. Regardless, I've been looking for any ways to speed this process up in 1.7.2/1.7.7/1.8b2/master and found that we have a significant bottleneck with the way states are cloned.

Out of curiousity, I profiled latest master against a set of ~7 migrations to run and found this:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.001    0.001 2687.730 2687.730 manage.py:2(<module>)
        1    0.000    0.000 2687.669 2687.669 django/core/management/__init__.py:325(execute_from_command_line)
        1    0.000    0.000 2687.669 2687.669 django/core/management/__init__.py:265(execute)
        1    0.000    0.000 2680.113 2680.113 django/core/management/base.py:326(run_from_argv)
        1    0.002    0.002 2680.111 2680.111 django/core/management/base.py:361(execute)
        1    0.198    0.198 2678.912 2678.912 django/core/management/commands/migrate.py:50(handle)
        1   10.405   10.405 2676.084 2676.084 django/db/migrations/executor.py:65(migrate)
      803    0.024    0.000 2581.336    3.215 django/db/migrations/migration.py:72(mutate_state)
  7528892   36.422    0.000 1578.058    0.000 django/db/migrations/state.py:468(construct_fields)
     4137    0.504    0.000 1470.027    0.355 django/db/migrations/state.py:81(reload_model)
      847    0.012    0.000 1192.585    1.408 django/db/migrations/state.py:135(clone)
      847    0.939    0.001 1171.224    1.383 django/db/migrations/state.py:138(<dictcomp>)
   402948    6.238    0.000 1170.285    0.003 django/db/migrations/state.py:488(clone)

In migration.py(mutate_state), the project state gets cloned before applying the operations everytime. I realize that the project state is intended to be immutable since the intermediary states need to be used most of the time, but I'm curious if it is required. During phase 1 of migration execution, the state gets cloned on every iteration inside mutate_state at line 90:

        for migration, _ in full_plan:
            if migration in migrations_to_run:
                states[migration] = state.clone()
            state = migration.mutate_state(state)  # state is cloned inside

and also in migrations/graph.py(make_state) at line 274:

        for node in plan:
            project_state = self.nodes[node].mutate_state(project_state)

If I modify mutate_state and add an option to not preserve (no clone) the passed in state in these specific cases, the migration time is cut in half:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.001    0.001  955.337  955.337 manage.py:2(<module>)
        1    0.000    0.000  955.269  955.269 django/core/management/__init__.py:325(execute_from_command_line)
        1    0.000    0.000  955.269  955.269 django/core/management/__init__.py:265(execute)
        1    0.000    0.000  946.453  946.453 django/core/management/base.py:326(run_from_argv)
        1    0.002    0.002  946.450  946.450 django/core/management/base.py:361(execute)
        1    0.169    0.169  945.012  945.012 django/core/management/commands/migrate.py:50(handle)
        1    0.006    0.006  941.582  941.582 django/db/migrations/executor.py:65(migrate)
     4137    0.387    0.000  869.433    0.210 django/db/migrations/state.py:81(reload_model)
      803    0.020    0.000  863.434    1.075 django/db/migrations/migration.py:72(mutate_state)
   100499   28.233    0.000  405.016    0.004 django/apps/registry.py:323(clear_cache)
     1753    0.027    0.000  399.287    0.228 django/db/migrations/operations/fields.py:43(state_forwards)

My question is: does the state need to be cloned in mutate_state during phase 1? The intermediary states are thrown away besides the explicit calls to clone here:

            if migration in migrations_to_run:
                states[migration] = state.clone()

Here's an example commit with this change: https://github.com/ryanahall/django/commit/a4777e11d8950bd86b07d00ff2d291567c7ba0ff

Change History (13)

comment:1 by Ryan Hall, 9 years ago

Summary: Migrations taking close to an hour to runMigrations taking up to an hour to run

comment:2 by Ryan Hall, 9 years ago

Description: modified (diff)

comment:3 by Ryan Hall, 9 years ago

Owner: changed from nobody to Ryan Hall
Status: newassigned

comment:4 by Marten Kenbeek, 9 years ago

Triage Stage: UnreviewedAccepted
Version: 1.7master

At first glance this looks like an effective and safe way to speed up migrations.

Could you also check how 888c9b6429a44824078a49fb1ebacf2e950cd887 affects the migration time on your project? I'm quite curious how much of a difference that will make.

comment:5 by Markus Holtermann, 9 years ago

Hey Ryan, thanks for reporting the issue. Would you mind checking our https://github.com/django/django/pull/4151 . Besides the change you made above (except for the part where you touch the graph) it also incorporates some further speedups.

Last edited 9 years ago by Markus Holtermann (previous) (diff)

comment:6 by Ryan Hall, 9 years ago

@knbk thanks for pointing out that change, I'll pull it and give it a shot. I also just commented on my pr, but for the sake of consistency I'll put it here as well.

Thanks for the quick response @MarkusH, I just pulled that pr's branch and tested it out - looks good to me! Here's the output from cProfile this time around:

        1    0.001    0.001 1047.865 1047.865 manage.py:2(<module>)
        1    0.000    0.000 1047.800 1047.800 /django/core/management/__init__.py:325(execute_from_command_line)
        1    0.000    0.000 1047.800 1047.800 /django/core/management/__init__.py:265(execute)
        1    0.000    0.000 1040.621 1040.621 /django/core/management/base.py:326(run_from_argv)
        1    0.002    0.002 1040.619 1040.619 /django/core/management/base.py:361(execute)
        1    0.174    0.174 1039.459 1039.459 /django/core/management/commands/migrate.py:50(handle)
        1    0.084    0.084 1036.643 1036.643 /django/db/migrations/executor.py:65(migrate)
     3851    0.329    0.000  958.341    0.249 /django/db/migrations/state.py:87(reload_model)
      800    0.014    0.000  953.813    1.192 /django/db/migrations/migration.py:72(mutate_state)
     3153    0.178    0.000  455.806    0.145 /django/db/migrations/state.py:230(render_multiple)
   133631   31.220    0.000  438.594    0.003 /django/apps/registry.py:323(clear_cache)
    56702    2.274    0.000  425.492    0.008 /django/db/migrations/state.py:505(render)
     1745    0.021    0.000  420.844    0.241 /django/db/migrations/operations/fields.py:43(state_forwards)
    62640    0.281    0.000  407.227    0.007 /django/db/migrations/state.py:279(unregister_model)
     1166    0.031    0.000  402.408    0.345 /django/db/migrations/operations/fields.py:172(state_forwards)
  1358075    6.334    0.000  383.427    0.000 /django/db/migrations/state.py:474(construct_fields)
 30963570  233.718    0.000  347.263    0.000 /django/db/models/options.py:712(_expire_cache)
   139593  116.218    0.001  253.901    0.002 /django/db/models/fields/related.py:297(swappable_setting)
   127628    0.681    0.000  238.117    0.002 /django/db/models/fields/related.py:1927(deconstruct)
   127628    1.165    0.000  237.204    0.002 /django/db/models/fields/related.py:1635(deconstruct)

For reference, most of the time was spent while rendering the states which was expected.

comment:7 by Ryan Hall, 9 years ago

Actually @knbk, it looks like I've been profiling with that commit all along. I pulled master yesterday, so these results are fairly recent.

comment:8 by Markus Holtermann <info@…>, 9 years ago

Resolution: fixed
Status: assignedclosed

In 57dc8dd3:

Fixed #24554 -- Sped up migrations by rendering initial apps when they are first needed

Calling Migration.mutate_state() now also allows to do in_place
mutations in case an intermediate state is thrown away later.

Thanks Anssi Kääriäinen for the idea, Ryan Hall for parts of the patch,
and Claude Paroz and Tim Graham for the review

comment:9 by Markus Holtermann <info@…>, 9 years ago

In 3e7d9d05:

Refs #24554 -- Prevented rendering of unused migrations

Thanks Claude Paroz and Tim Graham for the review

comment:10 by Markus Holtermann <info@…>, 9 years ago

In 0c7e283:

[1.8.x] Fixed #24554 -- Sped up migrations by rendering initial apps when they are first needed

Calling Migration.mutate_state() now also allows to do in_place
mutations in case an intermediate state is thrown away later.

Thanks Anssi Kääriäinen for the idea, Ryan Hall for parts of the patch,
and Claude Paroz and Tim Graham for the review

Backport of 57dc8dd3fa3c3adf133c522ecadb501d94bacd52 from master

comment:11 by Markus Holtermann <info@…>, 9 years ago

In 2218f761:

[1.8.x] Refs #24554 -- Prevented rendering of unused migrations

Thanks Claude Paroz and Tim Graham for the review

Backport of 3e7d9d05ac5efff4e4732c3453c7a8ef502d0ed0 from master

comment:12 by Tim Graham, 9 years ago

Summary: Migrations taking up to an hour to runOptimize migration's state rendering

comment:13 by Tim Graham <timograham@…>, 9 years ago

In 8f987323:

Added Ryan Hall to AUTHORS. Refs #24554

Note: See TracTickets for help on using tickets.
Back to Top