Opened 3 weeks ago
Last modified 13 days ago
#36754 assigned Bug
Broken migration created when GeneratedField's expression references a foreign key that has been deferred to another initial migration file
| Reported by: | Ou7law007 | Owned by: | Ou7law007 |
|---|---|---|---|
| Component: | Migrations | Version: | 5.0 |
| Severity: | Normal | Keywords: | autodetector GeneratedField |
| Cc: | | Triage Stage: | Accepted |
| Has patch: | yes | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | yes |
| Easy pickings: | no | UI/UX: | no |
Description
When the autodetector builds CreateModel operations, it defers related fields (FK/M2M) into separate AddField operations. However, GeneratedField expressions are evaluated before those deferred fields exist in the CreateModel operation, causing the generated expression to reference a non‑existent column name. This results in inconsistent or invalid initial migrations on new apps.
This is not easy to reproduce
In a small Django project, the dependency graph between apps is small. When Django detects initial migrations, it produces a single 0001_initial file per app. And since all fields (including FK targets used inside GeneratedField expressions) are created in that one file, Django never attempts to evaluate a GeneratedField expression that references a relationship that does not yet exist.
When the bug happens
In larger projects, Django may split initial migrations into multiple files (0001_initial, 0002_initial, sometimes 0003_initial, etc.).
When this happens:
- Django may place a GeneratedField into 0001_initial.
- Django may place the ForeignKey needed by the GeneratedField expression into 0002_initial.
- Django then tries to serialize the GeneratedField expression in 0001_initial, but the FK field it references does not yet exist.
- This results in the following error:
django.core.exceptions.FieldError: Cannot resolve keyword 'category_id' into field. Choices are: id, ... etc.
with category_id being a foreign key to a model that hasn't been processed yet.
In other words, the bug only appears when Django generates an initial migration before creating the FK that a GeneratedField depends on.
Also, note that it only happens on INITIAL migrations, which makes it even harder to reproduce, because you need an existing large project that needs to initialize all its models' migrations for the first time.
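The ordering problem described above can be pictured with a small pure-Python sketch. It does not use Django: CreateModel, AddField, FieldError and apply_operations here are toy stand-ins invented for illustration, and the eager check of referenced columns is an assumption about the failure mode, not the schema editor's actual code.

```python
from dataclasses import dataclass, field


class FieldError(Exception):
    """Toy stand-in for django.core.exceptions.FieldError."""


@dataclass
class CreateModel:
    model: str
    fields: list
    # Maps a generated column name to the column names its expression reads.
    generated: dict = field(default_factory=dict)


@dataclass
class AddField:
    model: str
    name: str


def apply_operations(operations):
    """Apply toy operations in order, resolving generated expressions eagerly."""
    tables = {}
    for op in operations:
        if isinstance(op, CreateModel):
            columns = set(op.fields) | set(op.generated)
            for refs in op.generated.values():
                for ref in refs:
                    if ref not in columns:
                        choices = ", ".join(sorted(columns))
                        raise FieldError(
                            f"Cannot resolve keyword {ref!r} into field. "
                            f"Choices are: {choices}"
                        )
            tables[op.model] = columns
        elif isinstance(op, AddField):
            tables[op.model].add(op.name)
    return tables


# Hypothetical split: the generated field lands before the FK column it reads.
split_plan = [
    CreateModel("Item", ["id"], {"somefield": ["category_id", "id"]}),
    AddField("Item", "category_id"),
]

# Hypothetical single-file plan: every referenced column is created together.
single_file_plan = [
    CreateModel("Item", ["id", "category_id"], {"somefield": ["category_id", "id"]}),
]
```

Calling apply_operations(split_plan) raises the toy FieldError at the CreateModel step, while apply_operations(single_file_plan) succeeds, which mirrors why small projects with a single 0001_initial file never see the error.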
Example (copy-pasting this into a small project will not reproduce the issue; see above for why)
app_a:
from django.db import models

class Category(models.Model):
    name = models.CharField(max_length=100)
app_b:
from django.db import models
from django.db.models import F, Value
from django.db.models.functions import Concat

class Item(models.Model):
    category = models.ForeignKey("A.Category", on_delete=models.CASCADE)
    # Problematic GeneratedField referring to category_id
    somefield = models.GeneratedField(
        expression=Concat(
            F("category_id"),  # or just category_id, makes no difference, both are bugged
            Value("-"),
            F("id"),
        ),
        output_field=models.CharField(max_length=50),
        db_persist=True,
        unique=True,
    )
B/0001_initial.py # creates Item without the category FK
B/0002_initial.py # adds the category FK
Workarounds
- Manually merge migrations i.e. move the FK field (or other dependencies) from 0002_initial.py into 0001_initial.py to ensure that all fields used by GeneratedField expressions are created together.
- Import the user model inside the initial custom user migration (a weird one). This is based on cookiecutter-django, which creates a users app with a custom user model. Its default initial user migration, myproject/users/migrations/0001_initial.py, contains the line import myproject.users.models, which is why cookiecutter projects do not hit this bug unless you delete all migration files after the project has grown enough and then try to migrate from scratch. At that point, the only difference between the old and new initial migration files is the line import myproject.users.models, and restoring it fixes the bug. I also noticed that other apps had up to three 000x_initial.py files, but once I added import myproject.users.models, they went down to just one for each app.
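The first workaround (manually merging the deferred field back into the first initial migration) can be sketched as a plain list transformation. This is only an illustration: the dict-based operations and the fold_deferred_fields helper are hypothetical and not part of Django. A real hand-edited migration would also need a dependency on the migration that creates the foreign key's target model, which this sketch does not model.

```python
def fold_deferred_fields(operations):
    """Merge each add_field into the create_model for the same model.

    `operations` is a flat, ordered list of dicts standing in for the
    operations of 0001_initial.py followed by those of 0002_initial.py.
    """
    merged = []
    create_ops = {}
    for op in operations:
        if op["op"] == "create_model":
            # Copy so the caller's list of operations is left untouched.
            copy = {**op, "fields": list(op["fields"])}
            create_ops[op["model"]] = copy
            merged.append(copy)
        elif op["op"] == "add_field" and op["model"] in create_ops:
            # Place the column alongside the others, as a manual edit would.
            create_ops[op["model"]]["fields"].append(op["name"])
        else:
            merged.append(op)
    return merged


before = [
    {"op": "create_model", "model": "Item", "fields": ["id", "somefield"]},
    {"op": "add_field", "model": "Item", "name": "category"},
]
after = fold_deferred_fields(before)
```

After folding, the single remaining operation creates Item with id, somefield and category together, so the generated expression and the column it reads are created in the same step.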
In summary: if a generated field (somefield in the example above) references a foreign key or any other related field (category_id in the example above), AND Django happens to create multiple 000x_initial.py migration files for that app (see 1), AND the related field is declared in a later file than the generated field, then the bug happens because Django processes the generated field before the field referenced inside it exists.
I'd also appreciate it if someone could answer this question: is it expected behavior that importing the custom user model (import myproject.users.models) in the users app's initial migration file (myproject/users/migrations/0001_initial.py) causes fewer initial migration files to be generated for each app? Is this normal? This is what seems to solve the issue (or make it not even an issue).
Change History (4)
comment:2 by , 3 weeks ago
Replying to David Sanders:
> If this isn't already a duplicate, I coincidentally also just noticed this bug when squashing my migrations today.
> Also my workaround was to simply remove the generated field from my CreateModel operation, rerun makemigrations and then move the newly gazetted operation back into the squashed migration.
OMG thank you. I thought I was crazy. I already fixed it by editing the auto detector (I think): https://github.com/Ou7law007/django/tree/fix-generatedfield-autodetector
Don't mind the '330 commits ahead', it's actually just 1 commit and 1 file that changed. It's a fork of /stable/5.2.x and not main so...
This is the one commit: https://github.com/Ou7law007/django/commit/a8fabc62cfdc52800d18341bfcd073303b6f4e63
Feel free to test it.
I still don't understand why importing the user model in the users app's initial migration file causes Django to create fewer initial migration files.
comment:3 by , 2 weeks ago
| Component: | Database layer (models, ORM) → Migrations |
|---|---|
| Keywords: | GeneratedField added; migrations removed |
| Needs tests: | set |
| Owner: | set to |
| Status: | new → assigned |
| Summary: | Bug in GeneratedField when it references a related field (e.g. ForeignKey) with 2 conditions (django happens to create multiple 000x_inital.py for the app && the ForeignKey is first initialized in the later file 000x_inital.py) → Broken migration created when GeneratedField's expression references a foreign key that has been deferred to another initial migration file |
| Triage Stage: | Unreviewed → Accepted |
| Version: | 5.2 → 5.0 |
Thanks for the report, reproduced at ce36c35e76f82f76cdfa5777456e794d481e5afc and at 5.0.9.
It's all good, but next time, please include some portion of the stacktrace, as it can help clarify that the error manifests at migration time, not makemigrations time.
Reproduced with these models, riffing on AutodetectorTests.test_arrange_for_graph_with_multiple_initial. It's likely that this can be reduced even further.
# testapp/models.py
from django.db import models

class Author(models.Model):
    name = models.CharField(max_length=200)
    book = models.ForeignKey("otherapp.Book", models.CASCADE, related_name="+")

# otherapp/models.py
from django.db import models
from django.db.models.functions import Concat

class Book(models.Model):
    author = models.ForeignKey("testapp.Author", models.CASCADE, related_name="+")
    title = models.CharField(max_length=200)

class Attribution(models.Model):
    author = models.ForeignKey("testapp.Author", models.CASCADE)
    book = models.ForeignKey("otherapp.Book", models.CASCADE)
    author_book_pairs = models.GeneratedField(
        expression=Concat(models.F("author_id"), models.Value("-"), models.F("book_id")),
        output_field=models.CharField(max_length=50),
        db_persist=True,
        unique=True,
    )

./manage.py makemigrations
./manage.py migrate
  File "/Users/jwalls/django/django/core/management/commands/migrate.py", line 354, in handle
    post_migrate_state = executor.migrate(
        targets,
    ...<3 lines>...
        fake_initial=fake_initial,
    )
  File "/Users/jwalls/django/django/db/migrations/executor.py", line 137, in migrate
    state = self._migrate_all_forwards(
        state, plan, full_plan, fake=fake, fake_initial=fake_initial
    )
  File "/Users/jwalls/django/django/db/migrations/executor.py", line 169, in _migrate_all_forwards
    state = self.apply_migration(
        state, migration, fake=fake, fake_initial=fake_initial
    )
  File "/Users/jwalls/django/django/db/migrations/executor.py", line 257, in apply_migration
    state = migration.apply(state, schema_editor)
  File "/Users/jwalls/django/django/db/migrations/migration.py", line 132, in apply
    operation.database_forwards(
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        self.app_label, schema_editor, old_state, project_state
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/Users/jwalls/django/django/db/migrations/operations/models.py", line 100, in database_forwards
    schema_editor.create_model(model)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^
  File "/Users/jwalls/django/django/db/backends/base/schema.py", line 511, in create_model
    sql, params = self.table_sql(model)
                  ~~~~~~~~~~~~~~^^^^^^^
  File "/Users/jwalls/django/django/db/backends/base/schema.py", line 222, in table_sql
    definition, extra_params = self.column_sql(model, field)
                               ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^
  File "/Users/jwalls/django/django/db/backends/base/schema.py", line 392, in column_sql
    " ".join(
    ~~~~~~~~^
        # This appends to the params being returned.
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<7 lines>...
        )
        ^
    ),
    ^
  File "/Users/jwalls/django/django/db/backends/base/schema.py", line 357, in _iter_column_sql
    generated_sql, generated_params = self._column_generated_sql(field)
                                      ~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^
  File "/Users/jwalls/django/django/db/backends/base/schema.py", line 457, in _column_generated_sql
    expression_sql, params = field.generated_sql(self.connection)
                             ~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
  File "/Users/jwalls/django/django/db/models/fields/generated.py", line 58, in generated_sql
    resolved_expression = self.expression.resolve_expression(
        self._query, allow_joins=False
    )
  File "/Users/jwalls/django/django/db/models/expressions.py", line 301, in resolve_expression
    expr.resolve_expression(query, allow_joins, reuse, summarize, for_save)
    ~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jwalls/django/django/db/models/expressions.py", line 301, in resolve_expression
    expr.resolve_expression(query, allow_joins, reuse, summarize, for_save)
    ~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jwalls/django/django/db/models/expressions.py", line 904, in resolve_expression
    return query.resolve_ref(self.name, allow_joins, reuse, summarize)
           ~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jwalls/django/django/db/models/sql/query.py", line 2070, in resolve_ref
    join_info = self.setup_joins(
        field_list, self.get_meta(), self.get_initial_alias(), can_reuse=reuse
    )
  File "/Users/jwalls/django/django/db/models/sql/query.py", line 1920, in setup_joins
    path, final_field, targets, rest = self.names_to_path(
                                       ~~~~~~~~~~~~~~~~~~^
        names[:pivot],
        ^^^^^^^^^^^^^^
    ...<2 lines>...
        fail_on_missing=True,
        ^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/Users/jwalls/django/django/db/models/sql/query.py", line 1825, in names_to_path
    raise FieldError(
    ...<2 lines>...
    )
django.core.exceptions.FieldError: Cannot resolve keyword 'author_id' into field. Choices are: author_book_pairs, id
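One feature of the reproducer models is that testapp and otherapp point at each other through foreign keys, whereas the two-app example in the description has an edge in one direction only. The sketch below only makes that structural difference visible. The find_cycle helper and the hand-written graphs are illustrative, and treating the cross-app cycle as the trigger for multiple initial migrations is an assumption, not a statement of how the autodetector decides.

```python
def find_cycle(graph):
    """Return one dependency cycle as a list of app labels, or None."""
    visiting, done = [], set()

    def visit(node):
        if node in done:
            return None
        if node in visiting:
            # Walked back onto the current path: report the loop.
            return visiting[visiting.index(node):] + [node]
        visiting.append(node)
        for dep in sorted(graph.get(node, ())):
            cycle = visit(dep)
            if cycle:
                return cycle
        visiting.pop()
        done.add(node)
        return None

    for node in sorted(graph):
        cycle = visit(node)
        if cycle:
            return cycle
    return None


# Cross-app foreign key edges read off the reproducer models.
reproducer = {
    "testapp": {"otherapp"},  # Author.book -> otherapp.Book
    "otherapp": {"testapp"},  # Book.author, Attribution.author -> testapp.Author
}

# The description's example: app_b depends on app_a, not the reverse.
report_example = {"app_b": {"app_a"}, "app_a": set()}
```

Here find_cycle(reproducer) reports a loop between the two apps, while find_cycle(report_example) returns None, which is consistent with the description's remark that the small example does not reproduce on its own.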
> Feel free to test it.
Would you like to submit your solution as a pull request? You could likely riff on AutodetectorTests.test_arrange_for_graph_with_multiple_initial to create a test for this.