Opened 7 years ago
Last modified 2 weeks ago
#28646 new Bug
Migration calls "CREATE INDEX" when one already exists when 'unique' field attribute is added (PostgreSQL)
Reported by: | Hari - 何瑞理 | Owned by: | bcail |
---|---|---|---|
Component: | Migrations | Version: | 1.11 |
Severity: | Normal | Keywords: | postgresql, migration, index, #djangocph |
Cc: | Tomer Chachamu, bcail | Triage Stage: | Accepted |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description (last modified by )
PostgreSQL migration automatically creates an index for fields that set db_index=True
. An example is SlugField
, which sets this property implicitly. Thereafter when the unique=True
property is added to the field the resultant migration script generates an AlterField object to apply this unique attribute. The schema editor then incorrectly detects this new unique=True
attribute to indicate the need to create a like index statement on the field which causes an error as it conflicts with the already existing index.
The offending piece of code seems to be at django/db/backends/postgresql/schema.py:117.
if ((not (old_field.db_index or old_field.unique) and new_field.db_index) or (not old_field.unique and new_field.unique)): like_index_statement = self._create_like_index_sql(model, new_field) if like_index_statement is not None: self.execute(like_index_statement)
If it's changed as:
if (not (old_field.db_index or old_field.unique) and (new_field.db_index or new_field.unique)): like_index_statement = self._create_like_index_sql(model, new_field) if like_index_statement is not None: self.execute(like_index_statement)
this error won't occur.
PostgreSQL 9.5 introduces IF NOT EXISTS
to the CREATE INDEX
statement which if added to the schema template can also address this problem without changing the above logic.
I encountered the problem with SlugField()
which implicitly sets db_index=True
on PostgreSQL 9.4.
Interestingly, I only discovered this when I used django-tenant-schemas
which adds a thin layer on top of the default Database router setting the schema search path before handing over the work to the default router. With a vanilla Django installation using default router, the second call to create a like index does not throw an error. However, upon reviewing the code, the logic does look incorrect. Also issuing the duplicate SQL statement in PostgreSQL console also throws an error.
I'm still investigating to see if this there's more to this than what I just described.
Change History (29)
comment:1 by , 7 years ago
Description: | modified (diff) |
---|
comment:2 by , 7 years ago
comment:3 by , 7 years ago
Resolution: | → worksforme |
---|---|
Status: | new → closed |
I couldn't reproduce this by changing SlugField()
to SlugField(unique=True)
. Perhaps the bug is in django-tenant-schemas. Please reopen if you find that Django is at fault and add more specific steps to reproduce.
comment:4 by , 7 years ago
Resolution: | worksforme |
---|---|
Status: | closed → new |
Django tries to create a like index twice and fails when I try to make existing SlugField a primary key in a manually written migration. The code to reproduce:
import unittest from django.db import connection, migrations, models from django.db.migrations.state import ProjectState from django.test import TestCase class ChangePrimaryKeyTest(TestCase): def test_change_primary_key(self): # Set PostgreSQL messages locale to get error messages in English operation0 = migrations.RunSQL("SET lc_messages = 'C';") # Create a model with two fields operation1 = migrations.CreateModel( 'SimpleModel', [ ("field1", models.SlugField(max_length=20, primary_key=True)), ("field2", models.SlugField(max_length=20)), ], ) # Drop field1 primary key constraint - this doesn't fail operation2 = migrations.AlterField( "SimpleModel", "field1", models.SlugField(max_length=20, primary_key=False), ) # Add a primary key constraint to field2 - this fails operation3 = migrations.AlterField( "SimpleModel", "field2", models.SlugField(max_length=20, primary_key=True), ) project_state = ProjectState() with connection.schema_editor() as editor: new_state = project_state.clone() operation0.database_forwards( "migrtest", editor, project_state, new_state) operation1.state_forwards("migrtest", new_state) operation1.database_forwards( "migrtest", editor, project_state, new_state) project_state, new_state = new_state, new_state.clone() operation2.state_forwards("migrtest", new_state) operation2.database_forwards( "migrtest", editor, project_state, new_state) project_state, new_state = new_state, new_state.clone() operation3.state_forwards("migrtest", new_state) operation3.database_forwards( "migrtest", editor, project_state, new_state)
Error message:
ERROR: test_change_primary_key (migrtest.tests.ChangePrimaryKeyTest) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/python3.6/site-packages/django/db/backends/utils.py", line 65, in execute return self.cursor.execute(sql, params) psycopg2.ProgrammingError: relation "migrtest_simplemodel_field2_972171aa_like" already exists
comment:5 by , 7 years ago
Component: | Database layer (models, ORM) → Migrations |
---|---|
Summary: | Migration calls "CREATE INDEX" when one already exists when 'unique' field attribute is added → Migration calls "CREATE INDEX" when one already exists when 'unique' field attribute is added (PostgreSQL) |
Triage Stage: | Unreviewed → Accepted |
I can reproduce as long as the three operations are in the same migration. The crash doesn't happen if you put the AlterField
operations in a separate migration.
comment:6 by , 7 years ago
Cc: | added |
---|---|
Owner: | changed from | to
Status: | new → assigned |
comment:7 by , 7 years ago
The test case given is incorrect, as Django always uses a fresh schema editor for each migration step:
https://github.com/django/django/blob/master/django/db/migrations/executor.py#L225
This passes and is similar to other cases in migrations/test_operations.py
:
def test_change_primary_key(self): # Create a model with two fields operation1 = migrations.CreateModel( 'SimpleModel', [ ("field1", models.SlugField(max_length=20, primary_key=True)), ("field2", models.SlugField(max_length=20)), ], ) # Drop field1 primary key constraint - this doesn't fail operation2 = migrations.AlterField( "SimpleModel", "field1", models.SlugField(max_length=20, primary_key=False), ) # Add a primary key constraint to field2 - this fails operation3 = migrations.AlterField( "SimpleModel", "field2", models.SlugField(max_length=20, primary_key=True), ) project_state = ProjectState() new_state = project_state.clone() operation1.state_forwards("migrtest", new_state) with connection.schema_editor() as editor: operation1.database_forwards("migrtest", editor, project_state, new_state) project_state, new_state = new_state, new_state.clone() operation2.state_forwards("migrtest", new_state) with connection.schema_editor() as editor: operation2.database_forwards("migrtest", editor, project_state, new_state) project_state, new_state = new_state, new_state.clone() operation3.state_forwards("migrtest", new_state) with connection.schema_editor() as editor: operation3.database_forwards("migrtest", editor, project_state, new_state)
I'm going to try working off the original bug description to reproduce the bug.
Edit: Apparently this is a permitted way of using schema_editor, e.g. in SeparateDatabaseAndState
. This suggests all the test cases should be doubled up (shared schema editor vs separate schema editors). So I'll try fixing the test case given.
comment:8 by , 7 years ago
Has patch: | set |
---|---|
Owner: | removed |
Status: | assigned → new |
I have added a reasonable PR. https://github.com/django/django/pull/9438
The root cause is that schema_editor.create_model
defers the creation of indexes, so at any time you can either have indexes on the database (can be found using schema_editor._constraint_names
), deferred or not at all. https://github.com/django/django/blob/master/django/db/backends/base/schema.py#L300
According to the comment they are deferred for SQLite, so one solution would be letting schema editors override the behaviour - removing it for non-sqlite, or just for postgres. I think it will become more difficult to reason about the schema editor in that case, and if done properly the deferred sql can also be an optimiser, removing redundant index creations and deletions.
Every place that an index is added ought to check whether the index creation is already deferred, and remove the deferred one if so (in favour of an immediate one). Every place that an index is removed needs to check both deferred indexes and actual indexes.
We can probably find more bugs by parameterising the schema and migration tests to try using separate schema_editors (which flushes deferred SQL to the database every step) or sharing schema_editors.
Not all the lines added have a test backing them up but I think it's ready for somebody to have a look at and decide whether the approach is good.
comment:9 by , 7 years ago
Patch needs improvement: | set |
---|
Comments on PR: we have an error in the boolean logic, not correctly distinguishing between the _unique
and primary_key
cases.
(The original suggestion leads to just 3 failures a fix should be simple enough...)
comment:10 by , 7 years ago
Keywords: | #djangocph added |
---|
I'm going to mark this for #djangocph for the sprint in Copenhagen.
Whilst it's right in the heart of the migration framework, I think it should be an easy fix.
Here's the GitHub permalink to the problem if
check: https://github.com/django/django/blob/fb8fd535c0f47cffb4da0c5900f3f66e1ec8d432/django/db/backends/postgresql/schema.py#L124-L126
If you apply the original suggested fix you get a small number of failures (3 I think). These relate to needing to add an index due to a unique
flag being added. That's the last or
in the problem if
.
The test from https://github.com/django/django/pull/9438 checks the new problem behaviour. (Can it live with the tests that fail if you apply the suggested fix?)
That new test fails because the unique
property is essentially _unique or primary_key
, which is too wide. As I said above, using new_field._unique
was enough to make the test pass.
The task here is to go through that and make sure it's correct. Make sure the test is in the right place. Add a comment in the code (if it's needed). Maybe a release note etc.
comment:12 by , 10 months ago
Owner: | set to |
---|---|
Status: | new → assigned |
comment:13 by , 10 months ago
Patch needs improvement: | set |
---|
comment:14 by , 10 months ago
Patch needs improvement: | unset |
---|
comment:15 by , 10 months ago
Patch needs improvement: | set |
---|
comment:16 by , 9 months ago
#35180 was a duplicate. We should add a test for altering CharField(db_index=True, ...)
to the TextField(db_index=True, ...)
.
comment:17 by , 9 months ago
Patch needs improvement: | unset |
---|
Hi Mariusz, I added that test. I also updated the code to use "CREATE INDEX IF NOT EXISTS"... what do you think of going that direction?
comment:18 by , 9 months ago
Patch needs improvement: | set |
---|
comment:19 by , 9 months ago
Patch needs improvement: | unset |
---|
I reverted the "IF NOT EXISTS" changes, and split out the create-new-index and recreate-deleted-index conditions into separate methods.
comment:20 by , 6 months ago
Patch needs improvement: | set |
---|
comment:21 by , 5 months ago
Patch needs improvement: | unset |
---|
comment:22 by , 5 months ago
Patch needs improvement: | set |
---|
comment:23 by , 4 months ago
Patch needs improvement: | unset |
---|
comment:24 by , 4 months ago
Triage Stage: | Accepted → Ready for checkin |
---|
comment:27 by , 4 months ago
Has patch: | unset |
---|---|
Resolution: | fixed |
Status: | closed → new |
comment:28 by , 4 months ago
Triage Stage: | Ready for checkin → Accepted |
---|
That code was last touched in 9356f63a99957f01c14a9788617428a172a29fcb. Your proposal results in some tests failures. Can you write a test for
tests/schema/tests.py
that demonstrates this issue?