Opened 3 weeks ago

Closed 2 weeks ago

Last modified 2 weeks ago

#36869 closed Cleanup/optimization (fixed)

Optimise migration graph planning

Reported by: James Fysh Owned by: nessita <124304+nessita@…>
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords: migration digraph loader
Cc: Nick Pope Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

When django needs to work with migrations - making & applying migrations, running checks - it loads available migrations from disk and builds up an in-memory digraph.

The django project that I work on has managed to accumulate a large number of django apps and associated migrations. I'm going to leave the obvious discussion around migration squashing out of this - let's just assume it is not practical to do in the near term.

Anecdotally, starting the django application, eg. when running management commands or simply launching the server is noticeably slow. Profiling has revealed that a significant portion of startup time is spent running migration checks.

To cut a long story short, MigrationGraph._generate_plan() currently runs in O(n*n), and with a ~3 line patch we can reduce this to O(n).

For small django projects I believe this will not result in any change in performance. For large projects such as ours, this can result in a very significant reduction in startup time. Given the nature of the change (simple and arguably low-risk) I think it would be a reasonable change to make.

Change History (5)

comment:1 by Lily, 3 weeks ago

Triage Stage: UnreviewedAccepted

comment:2 by Lily, 3 weeks ago

Has patch: set

comment:3 by Natalia Bidart, 3 weeks ago

Cc: Nick Pope added

I started the triage for this ticket and I truly wanted to see the numbers. I built a benchmarking script, and got this numbers:

  • code from main:
    compare   100 against   200 migrations (2.0x size): 2.92x time
    compare   200 against   400 migrations (2.0x size): 3.40x time
    compare   400 against   800 migrations (2.0x size): 3.75x time
    compare   800 against  1600 migrations (2.0x size): 3.69x time
    compare  1600 against  3200 migrations (2.0x size): 3.96x time
    
  • code from the proposed PR:
    compare   100 against   200 migrations (2.0x size): 1.87x time
    compare   200 against   400 migrations (2.0x size): 2.01x time
    compare   400 against   800 migrations (2.0x size): 2.06x time
    compare   800 against  1600 migrations (2.0x size): 2.10x time
    compare  1600 against  3200 migrations (2.0x size): 1.97x time
    
  • code proposed from Nick:
    compare   100 against   200 migrations (2.0x size): 1.75x time
    compare   200 against   400 migrations (2.0x size): 2.22x time
    compare   400 against   800 migrations (2.0x size): 2.12x time
    compare   800 against  1600 migrations (2.0x size): 1.77x time
    compare  1600 against  3200 migrations (2.0x size): 2.13x time
    

Perhaps we should consider Nick's suggestion to keep the logic a little more straightforward?

comment:4 by nessita <124304+nessita@…>, 2 weeks ago

Owner: set to nessita <124304+nessita@…>
Resolution: fixed
Status: newclosed

In 59fcd2a:

Fixed #36869 -- Optimized MigrationGraph._generate_plan membership checks.

Previously, _generate_plan() relied on list membership checks,
resulting in quadratic behavior as the plan grew. On large migration
graphs this became a significant performance bottleneck.

This change uses OrderedSet for the plan, reducing the complexity to
linear while preserving insertion order and behavior.

Co-authored-by: Nick Pope <nick@…>

comment:5 by Jacob Walls, 2 weeks ago

Triage Stage: AcceptedReady for checkin
Note: See TracTickets for help on using tickets.
Back to Top