#29193 closed Bug (fixed)
Altering a field with a unique constraint drops and rebuilds FKs to other fields in the table
Reported by: | Jeremy Bowman | Owned by: | Jeremy Bowman |
---|---|---|---|
Component: | Migrations | Version: | 1.11 |
Severity: | Release blocker | Keywords: | |
Cc: | Triage Stage: | Accepted | |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | yes |
Easy pickings: | no | UI/UX: | no |
Description
In #28305, a bug was fixed that prevented a MySQL field with a unique constraint from being altered if there was a foreign key constraint on it. However, the fix itself seems to have a more subtle bug in it: instead of dropping and recreating just foreign keys affecting that particular field, it does so for all foreign key constraints on any field in that table. This still yields a valid result, but for large tables with many incoming FK constraints this can dramatically increase the duration of the migration.
In particular, migration 0008 from django.contrib.auth (which changes the length of the username, a field with a unique constraint) now causes all foreign keys to the user ID to be dropped and later recreated throughout the database. I'm working with a MySQL database where this increases the duration of the migration from 10-20 minutes to nearly 48 hours, during most of which writes are blocked on at least one of the tables for which the indexes need to be recreated. Thankfully, we caught it during load testing with a production-sized database before attempting a production deployment. This particular manifestation of the problem only hits installations that were populated with data using a pre-1.10 version of Django but are attempting to go straight to 1.11 or above (which I suspect is actually fairly common given that 1.8 and 1.11 are LTS versions, and 1.10 is no longer being supported).
The problem seems to be in the usage of _related_non_m2m_objects
to enumerate the foreign key constraints; this function returns all relations involving the table of the field provided, not just that specific field. I discovered the bug in 1.11, but it still seems to be present in master.
Change History (10)
comment:1 by , 7 years ago
Severity: | Normal → Release blocker |
---|---|
Triage Stage: | Unreviewed → Accepted |
comment:2 by , 6 years ago
Sure, I'll go ahead and attempt a patch sometime in the next few days. I'm helping deploy the upgrade to Django 1.11 on our main service today, so my availability will partially depend on how smoothly that goes.
comment:3 by , 6 years ago
https://github.com/massover/bug_29193
The above is a small project to reproduce.
I could not reproduce it with the built in auth User model. I used a custom User model. While using the built in auth User model, it looked _related_non_m2m_objects
did not return my related models because of related_objects
.
It looks like the code that changed the behavior is this https://github.com/django/django/commit/c3e0adcad8d8ba94b33cabd137056166ed36dae0#diff-b69190ab88f6c5737b2562d94d7bf36bR539
# Drop incoming FK constraints if the field is a primary key or unique, # which might be a to_field target, and things are going to change. drop_foreign_keys = ( ( (old_field.primary_key and new_field.primary_key) (old_field.unique and new_field.unique) ) and old_type != new_type )
Before it would only drop and recreate constraints if the field was a foreign key. Now it drops and recreates all constraints if the field is unique.
Naively removing the unique check that was added in https://code.djangoproject.com/ticket/28305 will fix this problem. All the tests pass. Looking back at the ticket I don't really understand the intention, so this seems like an awful idea.
comment:4 by , 6 years ago
I have a tentative fix for this now, but still working on a regression test case. I'm also having enough trouble getting mysqlclient working on my Mac that I'm probably going to switch over to a Linux VM for testing this (most of my regular work involving MySQL is in a Docker container using the non-Python-3-compatible MySQL-python). The change itself is simple enough:
--- a/django/db/backends/base/schema.py +++ b/django/db/backends/base/schema.py @@ -18,8 +18,10 @@ def _related_non_m2m_objects(old_field, new_field): # Filter out m2m objects from reverse relations. # Return (old_relation, new_relation) tuples. return zip( - (obj for obj in old_field.model._meta.related_objects if not obj.field.many_to_many), - (obj for obj in new_field.model._meta.related_objects if not obj.field.many_to_many) + (obj for obj in old_field.model._meta.related_objects + if not obj.field.many_to_many and old_field.name in obj.field.to_fields), + (obj for obj in new_field.model._meta.related_objects + if not obj.field.many_to_many and new_field.name in obj.field.to_fields) )
This restricts the returned relations to just the ones involving the given field. If anybody spots a problem with this approach while I'm getting the tests written, let me know.
A quick note regarding #28305 - the problem there was that MySQL would fail to modify a field with a unique index if there were still any foreign key constraints on it from another model. That was a legitimate issue, and the fix provided did solve that; it was just accidentally overzealous in temporarily dropping all foreign key constraints to any field in the table instead of just the single field being modified. This was actually a problem even with the previous code that only did the FK dropping when changing the primary key; that just didn't cause a problem very often because the vast majority of FK constraints are in fact on the primary key, so there was rarely any collateral damage in dropping any FK constraints that happened to exist on other fields in the table.
comment:5 by , 6 years ago
You'll notice some test failures in the migrations
app (using PostgreSQL or MySQL) with that patch.
comment:6 by , 6 years ago
Has patch: | set |
---|---|
Owner: | changed from | to
Status: | new → assigned |
I've submitted a PR for this; the problem with my first attempt above was that foreign keys which don't specify a target field end up with to_fields
set to [None]
instead of ['id']
. The new version correctly handles this case, splits out the relevant logic into a separate function, and adds a regression test which counts the number of SQL statements executed by a unique field alteration that previously dropped and recreated an FK on the primary key of its table.
comment:7 by , 6 years ago
Patch needs improvement: | set |
---|
The report seems credible, although I haven't confirmed it. The patch for #28305 was difficult for me. Do you have any time to work on this issue? We'll probably make an exception to the 1.11 support timeline and backport a fix to 1.11 since it's a serious performance regression.