Opened 7 years ago

Last modified 7 weeks ago

#28646 new Bug

Migration calls "CREATE INDEX" when one already exists when 'unique' field attribute is added (PostgreSQL)

Reported by: Hari - 何瑞理 Owned by: bcail
Component: Migrations Version: 1.11
Severity: Normal Keywords: postgresql, migration, index, #djangocph
Cc: Tomer Chachamu, bcail Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by Hari - 何瑞理)

PostgreSQL migration automatically creates an index for fields that set db_index=True. An example is SlugField, which sets this property implicitly. Thereafter when the unique=True property is added to the field the resultant migration script generates an AlterField object to apply this unique attribute. The schema editor then incorrectly detects this new unique=True attribute to indicate the need to create a like index statement on the field which causes an error as it conflicts with the already existing index.

The offending piece of code seems to be at django/db/backends/postgresql/schema.py:117.

 if ((not (old_field.db_index or old_field.unique) and new_field.db_index) or
                (not old_field.unique and new_field.unique)):
            like_index_statement = self._create_like_index_sql(model, new_field)
            if like_index_statement is not None:
                self.execute(like_index_statement)

If it's changed as:

 if (not (old_field.db_index or old_field.unique) and (new_field.db_index or new_field.unique)):
            like_index_statement = self._create_like_index_sql(model, new_field)
            if like_index_statement is not None:
                self.execute(like_index_statement)

this error won't occur.

PostgreSQL 9.5 introduces IF NOT EXISTS to the CREATE INDEX statement which if added to the schema template can also address this problem without changing the above logic.

I encountered the problem with SlugField() which implicitly sets db_index=True on PostgreSQL 9.4.

Interestingly, I only discovered this when I used django-tenant-schemas which adds a thin layer on top of the default Database router setting the schema search path before handing over the work to the default router. With a vanilla Django installation using default router, the second call to create a like index does not throw an error. However, upon reviewing the code, the logic does look incorrect. Also issuing the duplicate SQL statement in PostgreSQL console also throws an error.

I'm still investigating to see if this there's more to this than what I just described.

Change History (29)

comment:1 by Hari - 何瑞理, 7 years ago

Description: modified (diff)

comment:2 by Tim Graham, 7 years ago

That code was last touched in 9356f63a99957f01c14a9788617428a172a29fcb. Your proposal results in some tests failures. Can you write a test for tests/schema/tests.py that demonstrates this issue?

comment:3 by Tim Graham, 7 years ago

Resolution: worksforme
Status: newclosed

I couldn't reproduce this by changing SlugField() to SlugField(unique=True). Perhaps the bug is in django-tenant-schemas. Please reopen if you find that Django is at fault and add more specific steps to reproduce.

comment:4 by Ariki, 7 years ago

Resolution: worksforme
Status: closednew

Django tries to create a like index twice and fails when I try to make existing SlugField a primary key in a manually written migration. The code to reproduce:

import unittest

from django.db import connection, migrations, models
from django.db.migrations.state import ProjectState
from django.test import TestCase


class ChangePrimaryKeyTest(TestCase):

    def test_change_primary_key(self):
        # Set PostgreSQL messages locale to get error messages in English
        operation0 = migrations.RunSQL("SET lc_messages = 'C';")
        # Create a model with two fields
        operation1 = migrations.CreateModel(
            'SimpleModel',
            [
                ("field1", models.SlugField(max_length=20, primary_key=True)),
                ("field2", models.SlugField(max_length=20)),
            ],
        )
        # Drop field1 primary key constraint - this doesn't fail
        operation2 = migrations.AlterField(
            "SimpleModel",
            "field1",
            models.SlugField(max_length=20, primary_key=False),
        )
        # Add a primary key constraint to field2 - this fails
        operation3 = migrations.AlterField(
            "SimpleModel",
            "field2",
            models.SlugField(max_length=20, primary_key=True),
        )

        project_state = ProjectState()
        with connection.schema_editor() as editor:
            new_state = project_state.clone()
            operation0.database_forwards(
                "migrtest", editor, project_state, new_state)
            operation1.state_forwards("migrtest", new_state)
            operation1.database_forwards(
                "migrtest", editor, project_state, new_state)
            project_state, new_state = new_state, new_state.clone()
            operation2.state_forwards("migrtest", new_state)
            operation2.database_forwards(
                "migrtest", editor, project_state, new_state)
            project_state, new_state = new_state, new_state.clone()
            operation3.state_forwards("migrtest", new_state)
            operation3.database_forwards(
                "migrtest", editor, project_state, new_state)


Error message:

ERROR: test_change_primary_key (migrtest.tests.ChangePrimaryKeyTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/django/db/backends/utils.py", line 65, in execute
    return self.cursor.execute(sql, params)
psycopg2.ProgrammingError: relation "migrtest_simplemodel_field2_972171aa_like" already exists
Last edited 7 years ago by Ariki (previous) (diff)

comment:5 by Tim Graham, 7 years ago

Component: Database layer (models, ORM)Migrations
Summary: Migration calls "CREATE INDEX" when one already exists when 'unique' field attribute is addedMigration calls "CREATE INDEX" when one already exists when 'unique' field attribute is added (PostgreSQL)
Triage Stage: UnreviewedAccepted

I can reproduce as long as the three operations are in the same migration. The crash doesn't happen if you put the AlterField operations in a separate migration.

comment:6 by Tomer Chachamu, 7 years ago

Cc: Tomer Chachamu added
Owner: changed from nobody to Tomer Chachamu
Status: newassigned

comment:7 by Tomer Chachamu, 7 years ago

The test case given is incorrect, as Django always uses a fresh schema editor for each migration step:

https://github.com/django/django/blob/master/django/db/migrations/executor.py#L225

This passes and is similar to other cases in migrations/test_operations.py:

    def test_change_primary_key(self):
        # Create a model with two fields
        operation1 = migrations.CreateModel(
            'SimpleModel',
            [
                ("field1", models.SlugField(max_length=20, primary_key=True)),
                ("field2", models.SlugField(max_length=20)),
            ],
        )
        # Drop field1 primary key constraint - this doesn't fail
        operation2 = migrations.AlterField(
            "SimpleModel",
            "field1",
            models.SlugField(max_length=20, primary_key=False),
        )
        # Add a primary key constraint to field2 - this fails
        operation3 = migrations.AlterField(
            "SimpleModel",
            "field2",
            models.SlugField(max_length=20, primary_key=True),
        )

        project_state = ProjectState()
        new_state = project_state.clone()
        operation1.state_forwards("migrtest", new_state)
        with connection.schema_editor() as editor:
            operation1.database_forwards("migrtest", editor, project_state, new_state)
        project_state, new_state = new_state, new_state.clone()
        operation2.state_forwards("migrtest", new_state)
        with connection.schema_editor() as editor:
            operation2.database_forwards("migrtest", editor, project_state, new_state)
        project_state, new_state = new_state, new_state.clone()
        operation3.state_forwards("migrtest", new_state)
        with connection.schema_editor() as editor:
            operation3.database_forwards("migrtest", editor, project_state, new_state)

I'm going to try working off the original bug description to reproduce the bug.

Edit: Apparently this is a permitted way of using schema_editor, e.g. in SeparateDatabaseAndState. This suggests all the test cases should be doubled up (shared schema editor vs separate schema editors). So I'll try fixing the test case given.

Last edited 7 years ago by Tomer Chachamu (previous) (diff)

comment:8 by Tomer Chachamu, 7 years ago

Has patch: set
Owner: Tomer Chachamu removed
Status: assignednew

I have added a reasonable PR. https://github.com/django/django/pull/9438

The root cause is that schema_editor.create_model defers the creation of indexes, so at any time you can either have indexes on the database (can be found using schema_editor._constraint_names), deferred or not at all. https://github.com/django/django/blob/master/django/db/backends/base/schema.py#L300

According to the comment they are deferred for SQLite, so one solution would be letting schema editors override the behaviour - removing it for non-sqlite, or just for postgres. I think it will become more difficult to reason about the schema editor in that case, and if done properly the deferred sql can also be an optimiser, removing redundant index creations and deletions.

Every place that an index is added ought to check whether the index creation is already deferred, and remove the deferred one if so (in favour of an immediate one). Every place that an index is removed needs to check both deferred indexes and actual indexes.

We can probably find more bugs by parameterising the schema and migration tests to try using separate schema_editors (which flushes deferred SQL to the database every step) or sharing schema_editors.

Not all the lines added have a test backing them up but I think it's ready for somebody to have a look at and decide whether the approach is good.

comment:9 by Carlton Gibson, 7 years ago

Patch needs improvement: set

Comments on PR: we have an error in the boolean logic, not correctly distinguishing between the _unique and primary_key cases.

Making the check target _unique is enough to avoid the issue:

        if ((not (old_field.db_index or old_field.unique) and new_field.db_index) or
                (not old_field.unique and new_field._unique)):

(The original suggestion leads to just 3 failures a fix should be simple enough...)

Last edited 7 years ago by Carlton Gibson (previous) (diff)

comment:10 by Carlton Gibson, 7 years ago

Keywords: #djangocph added

I'm going to mark this for #djangocph for the sprint in Copenhagen.

Whilst it's right in the heart of the migration framework, I think it should be an easy fix.

Here's the GitHub permalink to the problem if check: https://github.com/django/django/blob/fb8fd535c0f47cffb4da0c5900f3f66e1ec8d432/django/db/backends/postgresql/schema.py#L124-L126

If you apply the original suggested fix you get a small number of failures (3 I think). These relate to needing to add an index due to a unique flag being added. That's the last or in the problem if.

The test from https://github.com/django/django/pull/9438 checks the new problem behaviour. (Can it live with the tests that fail if you apply the suggested fix?)

That new test fails because the unique property is essentially _unique or primary_key, which is too wide. As I said above, using new_field._unique was enough to make the test pass.

The task here is to go through that and make sure it's correct. Make sure the test is in the right place. Add a comment in the code (if it's needed). Maybe a release note etc.

comment:11 by bcail, 11 months ago

Cc: bcail added
Patch needs improvement: unset

comment:12 by bcail, 11 months ago

Owner: set to bcail
Status: newassigned

comment:13 by Mariusz Felisiak, 11 months ago

Patch needs improvement: set

comment:14 by bcail, 11 months ago

Patch needs improvement: unset

comment:15 by Mariusz Felisiak, 11 months ago

Patch needs improvement: set

comment:16 by Mariusz Felisiak, 10 months ago

#35180 was a duplicate. We should add a test for altering CharField(db_index=True, ...) to the TextField(db_index=True, ...).

Last edited 10 months ago by Natalia Bidart (previous) (diff)

comment:17 by bcail, 10 months ago

Patch needs improvement: unset

Hi Mariusz, I added that test. I also updated the code to use "CREATE INDEX IF NOT EXISTS"... what do you think of going that direction?

comment:18 by Mariusz Felisiak, 10 months ago

Patch needs improvement: set

comment:19 by bcail, 10 months ago

Patch needs improvement: unset

I reverted the "IF NOT EXISTS" changes, and split out the create-new-index and recreate-deleted-index conditions into separate methods.

comment:20 by Sarah Boyce, 7 months ago

Patch needs improvement: set

comment:21 by bcail, 6 months ago

Patch needs improvement: unset

comment:22 by Sarah Boyce, 6 months ago

Patch needs improvement: set

comment:23 by bcail, 5 months ago

Patch needs improvement: unset

comment:24 by Sarah Boyce, 5 months ago

Triage Stage: AcceptedReady for checkin

comment:25 by Sarah Boyce <42296566+sarahboyce@…>, 5 months ago

Resolution: fixed
Status: assignedclosed

In 9cf9c79:

Fixed #28646 -- Prevented duplicate index when unique is set to True on PostgreSQL.

comment:26 by Sarah Boyce <42296566+sarahboyce@…>, 5 months ago

In 3dac327:

Reverted "Fixed #28646 -- Prevented duplicate index when unique is set to True on PostgreSQL."

This reverts commit 9cf9c796be8dd53bc3b11355ff39d65c81d7be6d due to a crash on Oracle
as it didn't allow multiple indexes on the same field.

comment:27 by Sarah Boyce, 5 months ago

Has patch: unset
Resolution: fixed
Status: closednew

comment:28 by Sarah Boyce, 5 months ago

Triage Stage: Ready for checkinAccepted

comment:29 by bcail, 7 weeks ago

Has patch: set
Note: See TracTickets for help on using tickets.
Back to Top