Opened 3 weeks ago

Last modified 13 days ago

#36754 assigned Bug

Broken migration created when GeneratedField's expression references a foreign key that has been deferred to another initial migration file

Reported by: Ou7law007 Owned by: Ou7law007
Component: Migrations Version: 5.0
Severity: Normal Keywords: autodetector GeneratedField
Cc: Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description

When the autodetector builds CreateModel operations, it defers related fields (FK/M2M) into separate AddField operations. However, GeneratedField expressions are evaluated before those deferred fields exist in the CreateModel operation, causing the generated expression to reference a non‑existent column name. This results in inconsistent or invalid initial migrations on new apps.

This is not easy to reproduce

In a small Django project, the dependency graph between apps is small, so when Django generates initial migrations it produces a single 0001_initial file per app. Because all fields (including FK targets used inside GeneratedField expressions) are created in that one file, Django never attempts to evaluate a GeneratedField expression that references a relationship that does not yet exist.

When the bug happens

In larger projects, Django may split initial migrations into multiple files (0001_initial, 0002_initial, sometimes 0003_initial, etc.).
When this happens:

  • Django may place a GeneratedField into 0001_initial.
  • Django may place the ForeignKey needed by the GeneratedField expression into 0002_initial.
  • Django then tries to serialize the GeneratedField expression in 0001_initial, but the FK field it references does not yet exist.
  • This results in the following error: django.core.exceptions.FieldError: Cannot resolve keyword 'category_id' into field. Choices are: id, ... Here category_id belongs to a foreign key to a model that hasn't been processed yet.

In other words, the bug only appears when Django generates an initial migration before creating the FK that a GeneratedField depends on.
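
The sequence above can be modeled without Django. The following is a minimal sketch, not the autodetector's real code: Field, split_create_model and missing_references are hypothetical names, and the rule "related fields are deferred, everything else stays in CreateModel" is a simplification of the behavior described in this report. The field names are taken from the Item example in this ticket.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """Toy stand-in for a model field (hypothetical, not a Django class)."""

    name: str
    kind: str = "plain"  # "plain", "fk", or "generated"
    refs: tuple = ()  # names that a generated expression refers to

    def exposed_names(self):
        # A foreign key "category" can be referenced as "category" or "category_id".
        if self.kind == "fk":
            return {self.name, f"{self.name}_id"}
        return {self.name}


def split_create_model(fields):
    """Mimic the deferral described above: related fields leave CreateModel."""
    created = [f for f in fields if f.kind != "fk"]
    deferred = [f for f in fields if f.kind == "fk"]
    return created, deferred


def missing_references(created):
    """Return {generated field name: sorted list of names it cannot resolve}."""
    available = set()
    for f in created:
        available |= f.exposed_names()
    problems = {}
    for f in created:
        if f.kind == "generated":
            missing = sorted(set(f.refs) - available)
            if missing:
                problems[f.name] = missing
    return problems


item_fields = [
    Field("id"),
    Field("category", kind="fk"),
    Field("somefield", kind="generated", refs=("category_id", "id")),
]
created, deferred = split_create_model(item_fields)
print([f.name for f in created])    # -> ['id', 'somefield']
print([f.name for f in deferred])   # -> ['category']
print(missing_references(created))  # -> {'somefield': ['category_id']}
```

When the same check is run against the full field list (nothing deferred), it reports no missing names, which matches the observation that single-file initial migrations are not affected.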

Also, note that it only happens on INITIAL migrations, which makes it even harder to reproduce, because you need an existing large project that needs to initialize all its models' migrations for the first time.

Example (copy-pasting this into a small project will not reproduce the issue; see above for why)

app_a:

from django.db import models

class Category(models.Model):
    name = models.CharField(max_length=100)

app_b:

from django.db import models
from django.db.models import F, Value
from django.db.models.functions import Concat

class Item(models.Model):
    category = models.ForeignKey("app_a.Category", on_delete=models.CASCADE)

    # Problematic GeneratedField referring to category_id
    somefield = models.GeneratedField(
        expression=Concat(
            F("category_id"),  # or just category_id; makes no difference, both are affected
            Value("-"),
            F("id"),
        ),
        output_field=models.CharField(max_length=50),
        db_persist=True,
        unique=True,
    )

Resulting migration files:

app_b/0001_initial.py   # creates Item without the category FK
app_b/0002_initial.py   # adds the category FK

Workarounds

  1. Manually merge the migrations, i.e. move the FK field (or other dependencies) from 0002_initial.py into 0001_initial.py so that all fields used by GeneratedField expressions are created together.
  2. Import the user model inside the initial custom user migration (a weird one). This is based on django-cookiecutter, which creates a users app with a custom user model. Its generated myproject/users/migrations/0001_initial.py contains the line import myproject.users.models by default, so cookiecutter projects do not hit this bug unless you delete all migration files after the project has grown and then try to migrate from scratch. At that point the only difference between the old and new initial migration files is the line import myproject.users.models, and adding it back fixes the bug. I also noticed that other apps had up to three 000x_initial.py files, but once I added import myproject.users.models they went down to just one per app.
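
The first workaround can be pictured as a transformation on the operation lists of the two migration files. This is only a sketch under assumptions: the dict shape and the helper fold_add_fields are hypothetical, not Django objects or APIs, and a real edit would also have to keep each migration's dependencies list consistent, which this sketch ignores.

```python
import copy


def fold_add_fields(initial_ops, later_ops, model_name):
    """Move AddField operations for `model_name` into its CreateModel.

    Both arguments are lists of plain dicts standing in for migration
    operations (hypothetical shape). Returns new lists; inputs are untouched.
    """
    initial_ops = copy.deepcopy(initial_ops)
    create = next(
        op
        for op in initial_ops
        if op["op"] == "CreateModel" and op["name"] == model_name
    )
    remaining = []
    for op in later_ops:
        if op["op"] == "AddField" and op["model_name"] == model_name:
            # The deferred field now gets created together with the model.
            create["fields"].append(op["field"])
        else:
            remaining.append(copy.deepcopy(op))
    return initial_ops, remaining


# Toy versions of the two initial migrations for the Item example.
ops_0001 = [{"op": "CreateModel", "name": "Item", "fields": ["id", "somefield"]}]
ops_0002 = [{"op": "AddField", "model_name": "Item", "field": "category"}]

merged_0001, merged_0002 = fold_add_fields(ops_0001, ops_0002, "Item")
print(merged_0001[0]["fields"])  # -> ['id', 'somefield', 'category']
print(merged_0002)               # -> []
```

After the fold, the toy 0002 file has no operations left for Item, which mirrors moving the FK by hand so that every field used by the generated expression is created in the same file.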

In summary: if a generated field (somefield in the example above) references a foreign key or any other related field (category_id in the example above), AND Django happens to create multiple 000x_initial.py migration files for that app (see 1), AND the related field is declared in a later file than the generated field, then the bug occurs because Django processes the generated field but not the field referenced inside it.

I'd also appreciate it if someone could answer this question: is it expected behavior that importing the custom user model (import myproject.users.models) in the users app's initial migration file, i.e. myproject/users/migrations/0001_initial.py, causes fewer initial migration files to be generated for each app? Is this normal? This is what seems to solve the issue (or make it not an issue at all).

  1. https://docs.djangoproject.com/en/5.2/topics/migrations/#:~:text=but%20in%20some%20cases%20of%20complex%20model%20interdependencies%20it%20may%20have%20two%20or%20more

Change History (4)

comment:1 by David Sanders, 3 weeks ago

If this isn't already a duplicate, I coincidentally also just noticed this bug when squashing my migrations today.

Also my workaround was to simply remove the generated field from my CreateModel operation, rerun makemigrations and then move the newly gazetted operation back into the squashed migration.

Last edited 3 weeks ago by David Sanders

comment:2 by Ou7law007, 3 weeks ago (in reply to: comment:1)

Replying to David Sanders:

> If this isn't already a duplicate, I coincidentally also just noticed this bug when squashing my migrations today.
>
> Also my workaround was to simply remove the generated field from my CreateModel operation, rerun makemigrations and then move the newly gazetted operation back into the squashed migration.

OMG thank you. I thought I was crazy. I already fixed it by editing the auto detector (I think): https://github.com/Ou7law007/django/tree/fix-generatedfield-autodetector

Don't mind the '330 commits ahead', it's actually just 1 commit and 1 file that changed. It's a fork of /stable/5.2.x and not main so...

This is the one commit: https://github.com/Ou7law007/django/commit/a8fabc62cfdc52800d18341bfcd073303b6f4e63

Feel free to test it.

I still don't understand why importing the user model in the users app's initial migration file causes Django to create fewer initial migration files.

comment:3 by Jacob Walls, 2 weeks ago

Component: Database layer (models, ORM) → Migrations
Keywords: GeneratedField added; migrations removed
Needs tests: set
Owner: set to Ou7law007
Status: new → assigned
Summary: Bug in GeneratedField when it references a related field (e.g. ForeignKey) with 2 conditions (django happens to create multiple 000x_inital.py for the app && the ForeignKey is first initialized in the later file 000x_inital.py) → Broken migration created when GeneratedField's expression references a foreign key that has been deferred to another initial migration file
Triage Stage: Unreviewed → Accepted
Version: 5.2 → 5.0

Thanks for the report, reproduced at ce36c35e76f82f76cdfa5777456e794d481e5afc and at 5.0.9.

It's all good, but next time, please include some portion of the stacktrace, as it can help clarify that the error manifests at migration time, not makemigrations time.

Reproduced with these models, riffing on AutodetectorTests.test_arrange_for_graph_with_multiple_initial. It's likely that this can be reduced even further.

# testapp/models.py
from django.db import models

class Author(models.Model):
    name = models.CharField(max_length=200)
    book = models.ForeignKey("otherapp.Book", models.CASCADE, related_name="+")

# otherapp/models.py
from django.db import models
from django.db.models.functions import Concat

class Book(models.Model):
    author = models.ForeignKey("testapp.Author", models.CASCADE, related_name="+")
    title = models.CharField(max_length=200)


class Attribution(models.Model):
    author = models.ForeignKey("testapp.Author", models.CASCADE)
    book = models.ForeignKey("otherapp.Book", models.CASCADE)
    author_book_pairs = models.GeneratedField(
        expression=Concat(models.F("author_id"), models.Value("-"), models.F("book_id")),
        output_field=models.CharField(max_length=50),
        db_persist=True,
        unique=True,
    )

./manage.py makemigrations
./manage.py migrate
  File "/Users/jwalls/django/django/core/management/commands/migrate.py", line 354, in handle
    post_migrate_state = executor.migrate(
        targets,
    ...<3 lines>...
        fake_initial=fake_initial,
    )
  File "/Users/jwalls/django/django/db/migrations/executor.py", line 137, in migrate
    state = self._migrate_all_forwards(
        state, plan, full_plan, fake=fake, fake_initial=fake_initial
    )
  File "/Users/jwalls/django/django/db/migrations/executor.py", line 169, in _migrate_all_forwards
    state = self.apply_migration(
        state, migration, fake=fake, fake_initial=fake_initial
    )
  File "/Users/jwalls/django/django/db/migrations/executor.py", line 257, in apply_migration
    state = migration.apply(state, schema_editor)
  File "/Users/jwalls/django/django/db/migrations/migration.py", line 132, in apply
    operation.database_forwards(
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        self.app_label, schema_editor, old_state, project_state
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/Users/jwalls/django/django/db/migrations/operations/models.py", line 100, in database_forwards
    schema_editor.create_model(model)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^
  File "/Users/jwalls/django/django/db/backends/base/schema.py", line 511, in create_model
    sql, params = self.table_sql(model)
                  ~~~~~~~~~~~~~~^^^^^^^
  File "/Users/jwalls/django/django/db/backends/base/schema.py", line 222, in table_sql
    definition, extra_params = self.column_sql(model, field)
                               ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^
  File "/Users/jwalls/django/django/db/backends/base/schema.py", line 392, in column_sql
    " ".join(
    ~~~~~~~~^
        # This appends to the params being returned.
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<7 lines>...
        )
        ^
    ),
    ^
  File "/Users/jwalls/django/django/db/backends/base/schema.py", line 357, in _iter_column_sql
    generated_sql, generated_params = self._column_generated_sql(field)
                                      ~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^
  File "/Users/jwalls/django/django/db/backends/base/schema.py", line 457, in _column_generated_sql
    expression_sql, params = field.generated_sql(self.connection)
                             ~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
  File "/Users/jwalls/django/django/db/models/fields/generated.py", line 58, in generated_sql
    resolved_expression = self.expression.resolve_expression(
        self._query, allow_joins=False
    )
  File "/Users/jwalls/django/django/db/models/expressions.py", line 301, in resolve_expression
    expr.resolve_expression(query, allow_joins, reuse, summarize, for_save)
    ~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jwalls/django/django/db/models/expressions.py", line 301, in resolve_expression
    expr.resolve_expression(query, allow_joins, reuse, summarize, for_save)
    ~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jwalls/django/django/db/models/expressions.py", line 904, in resolve_expression
    return query.resolve_ref(self.name, allow_joins, reuse, summarize)
           ~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jwalls/django/django/db/models/sql/query.py", line 2070, in resolve_ref
    join_info = self.setup_joins(
        field_list, self.get_meta(), self.get_initial_alias(), can_reuse=reuse
    )
  File "/Users/jwalls/django/django/db/models/sql/query.py", line 1920, in setup_joins
    path, final_field, targets, rest = self.names_to_path(
                                       ~~~~~~~~~~~~~~~~~~^
        names[:pivot],
        ^^^^^^^^^^^^^^
    ...<2 lines>...
        fail_on_missing=True,
        ^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/Users/jwalls/django/django/db/models/sql/query.py", line 1825, in names_to_path
    raise FieldError(
    ...<2 lines>...
    )
django.core.exceptions.FieldError: Cannot resolve keyword 'author_id' into field. Choices are: author_book_pairs, id
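
The final line of the traceback can be read as a plain name lookup against the fields the model has at that point. The sketch below is an illustration only: FieldError here is a local stand-in class and resolve_names is a hypothetical helper, not Django's names_to_path. It assumes that the Attribution model created by the first initial migration holds only id and author_book_pairs, as the "Choices are" list suggests.

```python
class FieldError(Exception):
    """Local stand-in for django.core.exceptions.FieldError."""


def resolve_names(names, available):
    """Raise if any referenced name is not among the model's known fields."""
    for name in names:
        if name not in available:
            choices = ", ".join(sorted(available))
            raise FieldError(
                f"Cannot resolve keyword {name!r} into field. "
                f"Choices are: {choices}"
            )
    return list(names)


# Assumed state of Attribution in the first initial migration:
# both foreign keys have been deferred to a later file.
state_0001 = {"id", "author_book_pairs"}

try:
    resolve_names(["author_id", "book_id"], state_0001)
except FieldError as exc:
    print(exc)
```

With both foreign keys present in the set of available names, the same call returns the names unchanged instead of raising.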

> Feel free to test it.

Would you like to submit your solution as a pull request? You could likely riff on AutodetectorTests.test_arrange_for_graph_with_multiple_initial to create a test for this.

Last edited 13 days ago by Jacob Walls

comment:4 by Jacob Walls, 13 days ago

Needs tests: unset
Patch needs improvement: set