Opened 18 months ago

Last modified 12 months ago

#26760 new Cleanup/optimization

Delete nonexistent migrations from django_migrations table

Reported by: Jarek Glowacki
Owned by: nobody
Component: Migrations
Version: master
Severity: Normal
Keywords: django_migrations squash migrations
Cc: Markus Holtermann, dev@…
Triage Stage: Accepted
Has patch: no
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description

Django adds a django_migrations table to the database, which lists all of the migrations that have been applied (and when).

With the introduction of squashmigrations, it is possible for this table to contain a lot of old migrations that no longer exist. This can be problematic if naming duplication occurs:

Example:
I have an app with:

my_app/migrations/
0001_initial.py
0002_blah.py
0003_blah.py

I squash and delete replaced migrations:

my_app/migrations/
0001_initial_squashed_0003_blah.py

I create a new migration and use poor naming:

my_app/migrations/
0001_initial_squashed_0003_blah.py
0002_blah.py

My new migration never runs because the django_migrations table thinks it has already been applied.
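The trap can be reproduced with a simplified model of the lookup. This is only a sketch: it assumes applied migrations are matched purely by (app label, migration name), and `pending_migrations` is a hypothetical helper, not Django's API.

```python
# Simplified model: each django_migrations row is an (app_label, name) pair.
applied = {
    ("my_app", "0001_initial"),
    ("my_app", "0002_blah"),
    ("my_app", "0003_blah"),
    # Assumed to have been recorded before the replaced files were deleted.
    ("my_app", "0001_initial_squashed_0003_blah"),
}

# Migration files present after squashing and adding the new 0002_blah.
on_disk = {
    ("my_app", "0001_initial_squashed_0003_blah"),
    ("my_app", "0002_blah"),
}


def pending_migrations(on_disk, applied):
    """Return on-disk migrations that have no matching recorded row."""
    return sorted(on_disk - applied)


pending = pending_migrations(on_disk, applied)
print(pending)  # [] : the new 0002_blah matches the stale row and is skipped
```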

I propose pruning the django_migrations table so that it includes only migrations that actually exist in the Django project. This could be done automatically (when the executor runs, or inside the migrate management command), or through a separate management command that has to be run manually. I prefer the automatic approach, though.

Pros:

  • Cleans up old data that just bloats the database.
  • Protects users from the trap mentioned above where a new migration is created with the same name as one that was applied in the past.

Cons:

  • A loss of historical information.

Note:
The implementation needs to be careful to avoid a possible new trap: if a user squashes migrations and then deletes the replaced migrations before running the squashed migrations on their database, Django will think the squashed migrations haven't been applied and will attempt to reapply them. This can be remedied simply by not removing from django_migrations any migrations mentioned in the replaces lists of other migrations (i.e. we'd consider replaced migrations as still existing, even if their actual files have already been removed).
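The pruning rule with this safeguard could look roughly like the following. It is a sketch over plain sets rather than Django's migration machinery: `stale_records` and its arguments are hypothetical names, with `replaces` standing in for the replaces lists of the migrations that exist on disk.

```python
def stale_records(applied, on_disk, replaces):
    """Return recorded migrations that would be safe to delete.

    applied:  set of (app_label, name) rows in django_migrations
    on_disk:  set of (app_label, name) for migration files that exist
    replaces: dict mapping an on-disk migration to its replaces list
    """
    # Treat anything named in a replaces list as still existing.
    protected = set()
    for replaced in replaces.values():
        protected.update(replaced)
    return applied - on_disk - protected


originals = {
    ("my_app", "0001_initial"),
    ("my_app", "0002_blah"),
    ("my_app", "0003_blah"),
}
squashed = ("my_app", "0001_initial_squashed_0003_blah")

# Originals deleted from disk, squashed migration not yet recorded:
# the replaces list keeps the old rows alive.
kept = stale_records(originals, {squashed}, {squashed: sorted(originals)})

# Later, the squashed migration is recorded and its replaces list is removed:
# the old rows become eligible for deletion.
removable = stale_records(originals | {squashed}, {squashed}, {})
```

Under these assumptions `kept` is empty and `removable` contains exactly the three original records.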

Change History (4)

comment:1 Changed 18 months ago by Tim Graham

Summary: django_migrations table is never cleaned. → Delete nonexistent migrations from django_migrations table
Triage Stage: Unreviewed → Accepted

In #26429 we added a timestamp to merge migration names to reduce the likelihood of collisions there. I acknowledge this could happen in other situations though.

comment:2 Changed 17 months ago by Shai Berger

Note #25255 and #24900 - people sometimes still want to use the squashed migrations (e.g. migrate back into the series that was squashed) in the presence of the merged migration. I note that the suggestion is only to remove from the database migrations whose files no longer exist. This plays well with the suggestion to keep the migrations named in "replaces", as we recommend that when the squashed migration files are removed, the "replaces" clause is also removed from the migration.

comment:3 Changed 16 months ago by Сергей

I think there is no need to delete records from db because they are almost always generated automatically and have unique names.

I see one issue with automatically deleting records: if I have Django apps that are temporarily disabled or enabled, and I deploy at that moment, the data about their migrations will be lost.
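The concern can be illustrated with the same kind of set-based model. The guard shown here, limiting deletion to apps that are currently installed, is an assumption made for illustration and is not something proposed in this ticket; `prunable_records` is a hypothetical helper.

```python
def prunable_records(applied, on_disk, installed_apps):
    """Return rows with no file on disk, limited to currently installed apps."""
    return {
        (app_label, name)
        for (app_label, name) in applied - on_disk
        if app_label in installed_apps
    }


applied = {
    ("blog", "0001_initial"),
    ("shop", "0001_initial"),
    ("shop", "0002_old"),
}
# "blog" is temporarily disabled, so its migration files are not loaded.
on_disk = {("shop", "0001_initial")}

# Naive pruning also drops the history of the disabled app.
naive = applied - on_disk

# Guarded pruning leaves rows for apps that are not installed untouched.
guarded = prunable_records(applied, on_disk, installed_apps={"shop"})
```

Here `naive` includes the `blog` record, while `guarded` contains only the `shop` record whose file no longer exists.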

comment:4 Changed 12 months ago by Ryan Kaskel

Cc: dev@… added