Opened 21 months ago

Closed 7 months ago

#33901 closed Bug (fixed)

non-deterministic collations doesn't work with Unique=True on Postgres13.3

Reported by: Ed Chai Owned by: Ed Chai
Component: Migrations Version: dev
Severity: Normal Keywords: collation unique citext
Cc: Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Creating a non-deterministic collation used for case insensitive fields to replace the old approach, CIText, would raise django.db.utils.NotSupportedError: nondeterministic collations are not supported for operator class "varchar_pattern_ops" when the unique=True is also set on the field.Since unique implies the creation of an index despite setting db_index=False explicitly.

here is a sample code to reproduce:

class Migration(migrations.Migration):

    initial = True

    dependencies = [
        migrations.swappable_dependency(settings.AUTH_USER_MODEL),
    ]

    operations = [
        CreateCollation(
            'ci',
            provider='icu',
            locale='und-u-ks-level2',
            deterministic=False
        ),
        migrations.CreateModel(
            name='TestgUser',
            fields=[
                ('id', models.BigAutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
                ('password', models.CharField(max_length=128, verbose_name='password')),
                ('username', models.CharField(db_collation='ci', max_length=30, unique=True, verbose_name='username')),
            ],
            options={
                'verbose_name': 'Test User',
                'verbose_name_plural': 'Test Users',
                'abstract': False,
            },
        ),
    ]

Change History (19)

comment:1 by Simon Charette, 21 months ago

Triage Stage: UnreviewedAccepted
Version: 4.0dev

Closely related if not a duplicate of #24082 but since db_collation is the documented way off the now deprecated contrib.postgres.CIText I think it's worth keeping the two issue distinct for now.

Could we have the Postgres schema editor not create these opclass indexes when a custom db_collation is specified until we get a consensus on #24082?

in reply to:  1 comment:2 by Ed Chai, 21 months ago

Replying to Simon Charette: I found #24082 but filed the issue for the same reason, CIText is deprecated and db_collation is documented way to do this job without any conflicts. Removing the duplicate index is an optimal solution and unique only for constraint seems semantic to me. Although I know some others still need LIKE and regex expression for regular case sensitive CharFeild and Textfield.

Closely related if not a duplicate of #24082 but since db_collation is the documented way off the now deprecated contrib.postgres.CIText I think it's worth keeping the two issue distinct for now.

Could we have the Postgres schema editor not create these opclass indexes when a custom db_collation is specified until we get a consensus on #24082?

in reply to:  1 ; comment:3 by Mariusz Felisiak, 21 months ago

Replying to Simon Charette:

Could we have the Postgres schema editor not create these opclass indexes when a custom db_collation is specified until we get a consensus on #24082?

Sounds good to me. Ed, would you like to prepare a patch?

in reply to:  3 comment:4 by Ed Chai, 21 months ago

Replying to Mariusz Felisiak: Sure, I think i can make a pull request this week.

Replying to Simon Charette:

Could we have the Postgres schema editor not create these opclass indexes when a custom db_collation is specified until we get a consensus on #24082?

Sounds good to me. Ed, would you like to prepare a patch?

comment:5 by Mariusz Felisiak, 21 months ago

Has patch: set
Owner: changed from nobody to Ed Chai
Status: newassigned
Triage Stage: AcceptedReady for checkin

comment:6 by Mariusz Felisiak <felisiak.mariusz@…>, 21 months ago

Resolution: fixed
Status: assignedclosed

In f3f9d03:

Fixed #33901 -- Skipped varchar_pattern_ops/text_pattern_ops index creation when db_collation is set.

comment:7 by Ismael ABBO, 19 months ago

Hello everyone!
First thing first, I am a newbie, using 3.2 LTS (latest 3.2.16) version and Postgres 14.4!
While this has been closed, I am still facing the same issue unless it is just a patch for Django 4.2?

Please help on how to fix it!

Here is the error:

...
Applying auth.0012_alter_user_first_name_max_length... OK
  Applying person.0001_initial...Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
psycopg2.errors.FeatureNotSupported: nondeterministic collations are not supported for operator class "varchar_pattern_ops"


The above exception was the direct cause of the following exception:
...
 File "/usr/local/lib/python3.9/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
django.db.utils.NotSupportedError: nondeterministic collations are not supported for operator class "varchar_pattern_ops"

There is FeatureNotSupported error on psycopg2, I noticed but, if it is the main reason, how was it managed before? Any workaround?

comment:8 by Tim Graham, 18 months ago

Yes, the fix will only be applied for Django 4.2. Since the bug isn't a regression from a previous release or a major bug in a new feature it won't be backported. See our supported versions policy.

in reply to:  8 ; comment:9 by Ismael ABBO, 18 months ago

Replying to Tim Graham: Thanks for that information!
Then, is there any workaround for 3.2.x versions?

Yes, the fix will only be applied for Django 4.2. Since the bug isn't a regression from a previous release or a major bug in a new feature it won't be backported. See our supported versions policy.

comment:10 by Simon Charette, 18 months ago

Ismael, the only way to work around the issue on Django 3.2 would to be subclass the postgres SchemaEditor to override its _create_like_index_sql method and then use the adjusted backend in your DATABASES entries.

You can refer to the documentation on the subject.

in reply to:  9 comment:11 by Ed Chai, 18 months ago

Replying to Ismael ABBO:

Replying to Tim Graham: Thanks for that information!
Then, is there any workaround for 3.2.x versions?

Yes, the fix will only be applied for Django 4.2. Since the bug isn't a regression from a previous release or a major bug in a new feature it won't be backported. See our supported versions policy.

You can keep using CITEXT, or set unique constraints directly via psql terminal.

comment:12 by Ismael ABBO, 18 months ago

Thanks for help !

comment:13 by Petter Friberg, 12 months ago

Not sure if there should be a new ticket here or not but I'm still seeing this on 4.2 when having a relation that points to a unique field with a non-deterministic collation.

For instance, consider if we run the following migration operation:

migrations.CreateModel(
    name="ExtendedUser",
    fields=[
        ('id', models.BigAutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
        ('user', models.OneToOneField(on_delete=django.db.models.deletion.CASCADE, to=settings.AUTH_USER_MODEL, to_field='username'))
    ]
)

Running manage.py sqlmigrate for that operation displays the generated _like index

BEGIN;
--
-- Create model ExtendedUser
--
CREATE TABLE "myapp_extendeduser" ("id" bigint NOT NULL PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY, "user_id" varchar(30) COLLATE "ci" NOT NULL UNIQUE);
ALTER TABLE "myapp_extendeduser" ADD CONSTRAINT "myapp_extendeduser_user_id_<chars>_fk_myapp_user_username" FOREIGN KEY ("user_id") REFERENCES "myapp_user" ("username") DEFERRABLE INITIALLY DEFERRED;
CREATE INDEX "myapp_extendeduser_user_id_<chars>_like" ON "myapp_extendeduser" ("user_id" varchar_pattern_ops);
COMMIT;

(Not sure if there's more to it but) I think the provided patch could be extended with something like below to resolve at least OneToOneField

            # Non-deterministic collations on Postgresql don't support indexes
            # for operator classes varchar_pattern_ops/text_pattern_ops.
            if getattr(field, "db_collation", None) or (field.is_relation and getattr(field.target_field, "db_collation", None)):
                return None
Last edited 12 months ago by Petter Friberg (previous) (diff)

comment:14 by Mariusz Felisiak, 12 months ago

Petter, thanks for the report. If you believe it's an issue in Django, then please create a new ticket in Trac and follow our bug reporting guidelines.

in reply to:  13 comment:15 by Ed Chai, 12 months ago

Replying to Petter Friberg:

Not sure if there should be a new ticket here or not but I'm still seeing this on 4.2 when having a relation that points to a unique field with a non-deterministic collation.

For instance, consider if we run the following migration operation:

migrations.CreateModel(
    name="ExtendedUser",
    fields=[
        ('id', models.BigAutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
        ('user', models.OneToOneField(on_delete=django.db.models.deletion.CASCADE, to=settings.AUTH_USER_MODEL, to_field='username'))
    ]
)

Running manage.py sqlmigrate for that operation displays the generated _like index

BEGIN;
--
-- Create model ExtendedUser
--
CREATE TABLE "myapp_extendeduser" ("id" bigint NOT NULL PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY, "user_id" varchar(30) COLLATE "ci" NOT NULL UNIQUE);
ALTER TABLE "myapp_extendeduser" ADD CONSTRAINT "myapp_extendeduser_user_id_<chars>_fk_myapp_user_username" FOREIGN KEY ("user_id") REFERENCES "myapp_user" ("username") DEFERRABLE INITIALLY DEFERRED;
CREATE INDEX "myapp_extendeduser_user_id_<chars>_like" ON "myapp_extendeduser" ("user_id" varchar_pattern_ops);
COMMIT;

(Not sure if there's more to it but) I think the provided patch could be extended with something like below to resolve at least OneToOneField

            # Non-deterministic collations on Postgresql don't support indexes
            # for operator classes varchar_pattern_ops/text_pattern_ops.
            if getattr(field, "db_collation", None) or (field.is_relation and getattr(field.target_field, "db_collation", None)):
                return None

I think your code is not related to this issue. This ticket only addresses errors that occur when both db_collation and unique are applied on Postgres. CREATE INDEX is the expected operation when you create foreign keys

comment:16 by alfonsrv, 10 months ago

Despite this ticket being closed, I still get an error when attempting to run the SQL migration. Running Django 4.2.3 and Postgres 15.x (also tried 13.x)

I created the collation required collation and attempt to run a migration like this

    operations = [
        CreateCollation(
            "case_insensitive",
            provider="icu",
            locale="und-u-ks-level2",
            deterministic=False,
        ),
    ]
migrations.AlterField(
    model_name='user',
    name='email',
    field=models.EmailField(db_collation='case_insensitive', max_length=254, unique=True, verbose_name='email address'),
),

This results in the following SQL:

BEGIN;
--
-- Alter field email on user
--
ALTER TABLE "core_user" ALTER COLUMN "email" TYPE varchar(254) COLLATE "case_insensitive";
COMMIT;

And the same error when attempting to run python manage.py migrate: django.db.utils.NotSupportedError: nondeterministic collations are not supported for operator class "varchar_pattern_ops". I found that altering the existing field doesn't work, but instead it has to be recreated completely.

Last edited 10 months ago by alfonsrv (previous) (diff)

in reply to:  16 comment:17 by Ed Chai, 10 months ago

Replying to alfonsrv:
The issue is likely caused by a "varchar_pattern_ops" that already existed on your Postgres DB, you can run select * from pg_indexes where tablename='tablename'; in psql terminal to check if it was created. This fix indeed only solves the problem when you first time creating model fields with collation and prevents pattern ops index created. I'd be happy to submit another PR for the migration.

Despite this ticket being closed, I still get an error when attempting to run the SQL migration. Running Django 4.2.3 and Postgres 15.x (also tried 13.x)

I created the collation required collation and attempt to run a migration like this

    operations = [
        CreateCollation(
            "case_insensitive",
            provider="icu",
            locale="und-u-ks-level2",
            deterministic=False,
        ),
    ]
migrations.AlterField(
    model_name='user',
    name='email',
    field=models.EmailField(db_collation='case_insensitive', max_length=254, unique=True, verbose_name='email address'),
),

This results in the following SQL:

BEGIN;
--
-- Alter field email on user
--
ALTER TABLE "core_user" ALTER COLUMN "email" TYPE varchar(254) COLLATE "case_insensitive";
COMMIT;

And the same error when attempting to run python manage.py migrate: django.db.utils.NotSupportedError: nondeterministic collations are not supported for operator class "varchar_pattern_ops". I found that altering the existing field doesn't work, but instead it has to be recreated completely.

comment:18 by Adam, 7 months ago

Resolution: fixed
Status: closednew

the issues can be recreated as follows:

I have an old migration that sets the field with unique=true

I have a new migration that sets the field with unique=true, db_collation="utf8_unicode_ci"

The old migration runs first, and creates a LIKE index for the field. The new migration attempts to run:

django.db.utils.NotSupportedError: nondeterministic collations are not supported for operator class "varchar_pattern_ops"

My work around is writing a DROP INDEX of the LIKE index that is created by the first migration

        migrations.RunSQL(
            sql="DROP INDEX IF EXISTS appname_modelname_fieldname_073c92ac_like;",
            reverse_sql=migrations.RunSQL.noop,
        ),
        migrations.AlterField(
            model_name="modelname",
            name="fieldname",
            field=models.CharField(
                blank=True, db_collation="utf8_unicode_ci", max_length=255, null=True, unique=True
            ),
        ),

But this is quite messy.

in reply to:  18 comment:19 by Mariusz Felisiak, 7 months ago

Resolution: fixed
Status: newclosed

Replying to Adam:

the issues can be recreated as follows:

I have an old migration that sets the field with unique=true

I have a new migration that sets the field with unique=true, db_collation="utf8_unicode_ci"

The old migration runs first, and creates a LIKE index for the field. The new migration attempts to run:

django.db.utils.NotSupportedError: nondeterministic collations are not supported for operator class "varchar_pattern_ops"

My work around is writing a DROP INDEX of the LIKE index that is created by the first migration

        migrations.RunSQL(
            sql="DROP INDEX IF EXISTS appname_modelname_fieldname_073c92ac_like;",
            reverse_sql=migrations.RunSQL.noop,
        ),
        migrations.AlterField(
            model_name="modelname",
            name="fieldname",
            field=models.CharField(
                blank=True, db_collation="utf8_unicode_ci", max_length=255, null=True, unique=True
            ),
        ),

But this is quite messy.

Thanks for the report. In the future, please open new tickets for linked bug reports and don't reopen already closed. I've created #34898.

Note: See TracTickets for help on using tickets.
Back to Top