Opened 5 years ago

Closed 4 years ago

Last modified 4 years ago

#23581 closed Cleanup/optimization (fixed)

Migration shouldn't generate "DROP DEFAULT" SQL if no database changes are required

Reported by: John-Scott Atlakson Owned by: Tim Graham
Component: Migrations Version: 1.7
Severity: Release blocker Keywords:
Cc: Andrew Godwin, Iacopo Spalletti, Shai Berger Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by John-Scott Atlakson)

Consider a simple model:

from django.db import models

CHOICES = (
    ('1', '1'),
)

class BasicModel(models.Model):
    choices = models.CharField(choices=CHOICES, max_length=100)

If CHOICES is changed e.g. CHOICES += (('2', '2'),), then makemigrations insists on creating a new migration for this change. However, the migration appears to mostly be a no-op. sqlmigrate on PostgreSQL shows:

BEGIN;
ALTER TABLE "core_basicmodel" ALTER COLUMN "choices" DROP DEFAULT;

COMMIT;

Which is slightly strange since the initial migration never set a DEFAULT, so there is nothing to DROP.

I first noticed this working with django-cms (see https://github.com/divio/django-cms/issues/3479). In some cases when I attempt to make a migration for an unrelated app, makemigrations will forcefully create a migration for the cms app because the settings.CMS_TEMPLATES choices have changed. It then makes this automatically created migration (which is output in the virtualenv e.g. env/src/django-cms/cms/migrations_django) a dependency for the migration I actually intended to create. There doesn't appear to be a way to avoid this. This means I now have a migration I actually need for one of my apps that is completely dependent on me creating a migration for a third party app, which I don't directly control, resulting in migrations that are broken for other developers.

This can also happen when the choices differ in development vs deployment, Django will complain:

  Your models have changes that are not yet reflected in a migration, and so won't be applied.
  Run 'manage.py makemigrations' to make new migrations, and then re-run 'manage.py migrate' to apply them.

I am not familiar with the migration internals, but since 'choices' isn't used (at least not on PostgreSQL) to make any actual alterations to the database shouldn't it ignore this attribute?

Change History (26)

comment:1 Changed 5 years ago by John-Scott Atlakson

Description: modified (diff)

comment:2 Changed 5 years ago by John-Scott Atlakson

Description: modified (diff)

comment:3 Changed 5 years ago by Tim Graham

Summary: Changing model field choices results in spurious migrationsMigration shouldn't generate "DROP DEFAULT" SQL if no database changes are required
Triage Stage: UnreviewedAccepted
Type: UncategorizedCleanup/optimization

It's by design that migrations are generated even if no database changes are required (see #22837 for the rationale). We should probably document this if it's not already as this is becoming a FAQ.

We should also look into if we can avoid the DROP DEFAULT SQL when it's not required.

comment:4 Changed 5 years ago by John-Scott Atlakson

Is there then a corresponding 'best practice' for how to handle this? Again the issue is that Django is forcibly creating migrations for 3rd party apps which for obvious reasons we do not control. We now have to manually edit every migration we *do* want to create for our apps to make sure it points to the last official djangoc-cms migration (since that is the 'true' representation and present in all environments) rather than whatever Django insisted on creating locally. Otherwise we have broken migrations for the rest of the team.

The choices attribute can point to user overridable settings which can differ in any and all environments, there is no single representation of the model in this case. Every django-cms installation has its own list of available templates in settings.CMS_TEMPLATES which the Page model points to, so the 'full representation' of the Page model are legion. We'll eventually wind up with these no-op migrations clashing with real migrations when a django-cms update includes a new migration:

CommandError: Conflicting migrations detected (0003_auto_20140926_2347, 0003_auto_20141002_0045 in cms).
To fix them run 'python manage.py makemigrations --merge'

This comment https://code.djangoproject.com/ticket/21498#comment:9 suggests that the auto detector *has* to pick up any and all changes as it doesn't know a priori whether it requires DDL work at a later stage. Is there some point 'later' where the migrations framework can realize it's a no-op and cancel creating a migrations file (unless of course the user specified --empty intentionally) or at least provide a way for these to be ignored?

comment:5 Changed 5 years ago by Tim Graham

Cc: Andrew Godwin added

Not sure if this issue has been raised before, I'll add Andrew to the ticket to get his input.

comment:6 Changed 5 years ago by Iacopo Spalletti

Cc: Iacopo Spalletti added

Adding myself to CC, because for django CMS this is blocker to support Django 1.7

comment:7 Changed 5 years ago by John-Scott Atlakson

The DROP DEFAULT is due to ticket #23043. In @5875b8d13333bf709b54f91a143304826d57f2fd a bit of code was added so that any time DatabaseSchemaEditor._alter_field() is called it will always add DROP DEFAULT even when there is no default to drop.

comment:8 Changed 5 years ago by John-Scott Atlakson

The rationale for always including all field args/kwargs provided in ticket:21498#comment:6 was "I'd rather we included every single field - white/blacklisting fields is dangerous. What if a field subclass redefines help_text? What if it uses it to help populate ORM objects?" and ticket:21498#comment:9 "...(we just can't guarantee it's not useful at database or field runtime)".

Is this really the case though? We *can* determine whether or not any of the fields provided by Django use choices or help_text to make database altering changes. If custom field authors choose to override these field constructor arguments to result in DDL changes, then it would understandably also be their responsibility to write a sufficient deconstruct() method. It's unclear to me why we need the migrations noise in the common case of using default Django fields.

At any rate, after digging into this deeper I realize there are two separate and unrelated issues here (excessive DROP DEFAULTs and creating migrations for field constructor arguments that do not modify the db) so let me know if these should be split into two separate tickets.

comment:9 Changed 5 years ago by Andrew Godwin

So: excessive DROP DEFAULT is a valid issue, and what this ticket should represent.

Issuing a field alter on every change is by design (as tim said above) and we can't go back on that - due to the way state building works we need to have every attribute change represented that might affect the database OR data migrations - and things like choices and especially defaults matter in database migrations.

comment:10 Changed 5 years ago by Shai Berger

Cc: Shai Berger added

John: I think Andrew is mostly correct. You can probably fix things by editing the Django-CMS migrations to set the field attributes from settings, the same way your models do. Then, they will have the correct values per installation. Changing the relevant settings may still require a migration, and probably should.

It may be possible to set things up so migrations are auto-generated in these cases with settings references, similarly to the way they take swappable models into account. If you can come up with a reasonable API that will serve this purpose, I think it will be considered favorably.

comment:11 Changed 5 years ago by John-Scott Atlakson

I'm clearly missing something then. Yes I understand that these options can affect data migrations, but neither choices nor default are used as far as I've seen in schema migrations. And of course this ticket partly exists because Django tries too hard to make sure a specified default never results in database changes.

Apologies if I'm being thick but I'm still not getting a clear idea of what users of django-cms (or any 3rd party app that lets users define choices in settings) are supposed to do in the short term. Sorry to beat this point into the ground but I don't own django-cms's migrations, I only control my apps migrations. So even if it pointed to a settings attribute, Django will still forcibly create a spurious migration in our local virtualenvs (e.g. env/src/django-cms). As a result these migrations won't be a visible part of our version controlled project. Are we expected to dig these out, fork django-cms and commit/squash/merge migrations with upstream? I mentioned above, this is doubly painful since anytime we create a migration for our app, Django will forcibly create one of these migrations for django-cms and then make our migration dependent on it. Our current workflow is to manually delete Django's spurious migration from our virtualenv and then edit our intended migration in our project so that its dependency points to the last official django-cms migration and to otherwise ignore Django's complaint about unmigrated changes. Which we have to do every single time we create a migration.

It's also not clear what authors of reusable apps are supposed to do to eliminate this pain point for their users. A *nuclear* option mentioned was to subclass fields to del kwargs['choices'] in the deconstruct method, which is obviously not a desirable way out of this.

I've been banging my head against this and far too many other bugs lately and am probably suffering from a bit of brain damage, so apologies if I'm completely missing the obvious. Frustrations aside, endless thanks to Andrew and others for their hard work on migrations support!

comment:12 Changed 5 years ago by Andrew Godwin

Because of the way the new migrations system works, every attribute needs to be present in an AlterField in order to make sure it's represented in the historical state when it comes time to do a RunPython command. The only way around that particular implementation detail would be to use SeparateDatabaseAndState to say "the database changes are <empty list>, the state changes are <alter field>", but that's pointless once we fix the bug here where spurious default alteration is sent - once that's fixed an AlterField with no schema changes won't do anything to the database anyway.

If django-cms is using settings to define choices then, as Shai suggests, django-cms can edit their migration so that the choices attribute points to that settings variable rather than being copied from it and then the autodetector won't detect any changes. You can't do anything as a user other than encourage them to fix their migrations. You could also fork their migration tree and use MIGRATION_MODULES to point it to something you control for now, but that's not a great solution.

This is just one of those pain points as people (and third party apps) adapt to the new migrations - sorry about this, but there's no quick short term fix, third parties just have to get used to crafting migrations along with models. If you set any field attribute to something outside the scope of model control (like a setting) you just have to point it to the same thing in the resultant migration that makes that field and the autodetector will be happy.

Subclassing fields and removing choices is a possible but very extreme solution; that would only make sense for a field whose choices change literally every time the application starts (e.g. they're randomised or something), which is pretty rare from what I've seen (people who are pulling choices from querysets at boot, for example, are probably doing it wrong)

comment:13 Changed 5 years ago by Aymeric Augustin

people who are pulling choices from querysets at boot, for example, are probably doing it wrong

... and hitting a AppRegistryNotReady exception in Django 1.7 :-)

comment:14 Changed 5 years ago by John-Scott Atlakson

Thanks Andrew for the reply and for clearly laying out some options.

If I understand correctly, the only way out of this particular mess for django-cms is to make BOTH the model field's choices point directly to settings.CMS_TEMPLATES *and* the initial migration (and any subsequent legitimate AlterField migrations) must also point to settings.CMS_TEMPLATES to avoid tripping the autodetector.

django-cms indirectly gets choices from settings (tl;dr they first pass the display names through ugettext_lazy and then point choices at the resulting variable). All of which used to be completely legal Django/Python/South but looks like they'll have to make some additional backwards incompatible changes to accommodate Django 1.7, which will likely delay compatibility for some time. The only other option for django-cms that comes to mind is to move choices into ModelForm logic but would have to think through the implications.

I get the mockery of choices pointing to querysets, but it's not so obvious that the django-cms community did anything wrong using settings so developers can define custom template choices (that in this case point to real, version-controlled files on disk so database driven choices aren't a good fit) and have these used in form rendering and validation (which is primarily what help_text and choices are for). Is there a core-developer-approved best practice for this sort of use case that wouldn't involve pointing model field options at settings?

At any rate, not sure how widespread of an issue this might pose to the wider ecosystem but do you think it's at least worth a clear warning in the documentation for reusable apps authors?

comment:15 Changed 5 years ago by Andrew Godwin

I agree it's a pain but there's not much of a way round this; the alternative is to then break fields inside data migrations, at which point someone else will come and tell me how much of a pain it is.

If choices is genuinely not important to ANYTHING, then they should just use a custom field subclass that omits it from the deconstruct response (deconstruct is an official API now and everything).

It's not that django-cms did anything wrong, it's just that we had to tighten up a few previously-loose things to get migrations working properly. South had these same issues, it's just that it decided to abandon choices completely (which itself caused a few bugs where people got angry as they were using choices to populate database columns and stuff).

There's no "core developer approved" method of how to do this stuff; we're not all-powerful or all-knowledgeable, and I personally at least have not run into this case (this is why we had such a long alpha and beta period to get feedback, and changed quite a lot of stuff during it).

I can give suggestions, but ultimately it's up to the community to work out good solutions and/or write code to support them. If you want to revert to the old South way of always throwing away choices, that's just a small patch to django-cms away.

comment:16 Changed 5 years ago by John-Scott Atlakson

Thanks again for your feedback. It seems we've been able to fix the migrations in django-cms, so thanks indeed for showing me the way.

comment:17 Changed 5 years ago by John-Scott Atlakson

I would like to contribute a patch for the DROP DEFAULT issue if I can. My previous comments were focused just on the code added @ 2fb1939a9efae24560d41fbb7f6a280a1d2e8d06 The issue with that bit of code appears to be that _alter_field() will almost always output the DROP DEFAULT since:

  • all database backends besides MySQL will pass the not self.skip_default(new_field) check
  • most fields will also pass the new_field.default is not None test since if no default was actually set new_field.default is actually django.db.models.fields.NOT_PROVIDED rather than None.

However, now that I've looked into it a bit more I'm uncertain how to proceed:

Here is the output of sqlmigrate after adding a default to a field:

BEGIN;
ALTER TABLE "accounts_user" ALTER COLUMN "phone" SET DEFAULT '123-123-1234';
ALTER TABLE "accounts_user" ALTER COLUMN "phone" DROP DEFAULT;

COMMIT;

comment:18 Changed 5 years ago by Shai Berger

I just took a short look, and, TBH, I don't quite understand why we're setting and removing defaults around line 545. I see reason to use defaults in the database only when adding new columns; while adding columns can be the result of alter_field (e.g. when changing a M2M to a FK), I think this is not the case being handled here.

The test for new_field.default is not None looks like a bug -- the intention was probably to use self.effective_default(new_field) is not None.

The comments in #23043 are that no default should be left in the database; setting defaults is unavoidable when adding new columns to existing tables. That is also why add_field drops the defaults it adds. In a Django app, defaults are handled in Python, not SQL; if the code tries to insert rows in a way that relies on database defaults, it is most probably a bug -- so it is better to error out.

comment:19 Changed 5 years ago by Shai Berger

Oh, I've just realized that the code has changed somewhat between the time comment:17 was made and my comment:18. As far as I can see, the changes make the use of defaults in this code completely redundant (they sort-of made sense for SQLite, but they don't anymore).

comment:20 Changed 5 years ago by Markus Holtermann

The reason why Django adds a SET DEFAULT and DROP DEFAULT are explained and implemented based on #23609. I'm not sure though where the standalone DROP DEFAULT comes from.

comment:21 Changed 5 years ago by John-Scott Atlakson

The standalone DROP DEFAULT comes from this line in schema.py. Essentially anything that triggers _alter_field will end up passing this test and it will insert the DROP DEFAULT into the output. Perhaps shaib's comment in comment:18 is the fix for this line and that is enough to resolve the only remaining issue in this ticket? Unless generating add/drop as in comment:17 when the field was already not NULL is also undesirable behavior. Both of these seem harmless for the most part, just unexpected behavior that was uncovered while debugging something else.

comment:22 Changed 5 years ago by Tim Graham

More tests in Django's test suite are definitely needed. I can comment out every usage of sql_alter_column_no_default in db/backends/schema.py and there are no test failures in migrations or schema. There was also a suggestion by Claude to check if the default was NOT_PROVIDED: https://code.djangoproject.com/ticket/23719#comment:2. Anyway, if you can send a pull request with a solution and some tests I'll be happy to take a closer look.

comment:23 Changed 4 years ago by Tim Graham

Severity: NormalRelease blocker

This can cause the sequence to be dropped from auto-increment fields as noted in #24058 so I think we should increase the priority of this.

comment:24 Changed 4 years ago by Tim Graham

Has patch: set
Owner: changed from nobody to Tim Graham
Status: newassigned

comment:25 Changed 4 years ago by Tim Graham <timograham@…>

Resolution: fixed
Status: assignedclosed

In ab4f709da4516672b0bd811f2b4d0c4ba9f5b636:

Fixed #23581 -- Prevented extraneous DROP DEFAULT statements.

Thanks john_scott for the report and Markus Holtermann for review.

comment:26 Changed 4 years ago by Tim Graham <timograham@…>

In a9da5dd5b6f722a230a69211ad8e00a20cafcd38:

[1.7.x] Fixed #23581 -- Prevented extraneous DROP DEFAULT statements.

Thanks john_scott for the report and Markus Holtermann for review.

Backport of ab4f709da4516672b0bd811f2b4d0c4ba9f5b636 from master

Note: See TracTickets for help on using tickets.
Back to Top