#34553 closed Bug (fixed)

Can't create CheckConstraint with percent characters in values on postgresql due to broken quoting

Reported by: Thomas Kolar Owned by: Simon Charette
Component: Database layer (models, ORM) Version: 4.2
Severity: Normal Keywords:
Cc: Simon Charette Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

It is currently impossible to create check constraints that reference values containing percent characters (%).

Consider the following model definition:

from django.db import models


class Dolor(models.Model):
    sit = models.CharField(max_length=42069)

    class Meta:
        constraints = (
            models.CheckConstraint(
                check=models.Q(sit="%"),
                name="amet",
            ),
        )

This will indeed result in a constraint being created. However, the percent character will be doubled, and constrain sit to contain %%, not %:

>>> from ipsum.models import *
>>> Dolor.objects.create(sit="%")
Traceback (most recent call last):
 (... snip ...)
psycopg2.errors.CheckViolation: new row for relation "ipsum_dolor" violates check constraint "amet"
DETAIL:  Failing row contains (1, %).


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
 (... snip ...)
django.db.utils.IntegrityError: new row for relation "ipsum_dolor" violates check constraint "amet"
DETAIL:  Failing row contains (1, %).

>>> Dolor.objects.create(sit="%%")
<Dolor: Dolor object (2)>
>>>

This may be a super contrived example, but using an __in lookup in the check with a list of allowed values also does not work (choices are not a suitable alternative here as that isn't enforced at the DB level).

This was likely introduced with the fix for #30484 - here, the schema editor's quote_value method was changed to always double percent characters.

I'd also like to note that creating a CheckConstraint like this DOES work in Django 2.2. Likely, that stopped being the case when #30060 was fixed. I can attest that this is a serious roadblock when moving a project from 2.2 to a modern and supported version of Django.

Honestly, I'm surprised that this doesn't break in way more instances. "Quoting" like this should only be correct specifically for LIKE (and the likes of it).

Suggested fix: remove the percentage character doubling from the general-purpose escaping method, and reintroduce it in a targeted way where needed.

I unfortunately don't know enough about the ORM's internals to make a more detailed suggestion, or know for just how much headache creation I am to apologize. But the current behavior is obviously incorrect.

Additional notes:

  • This affects django 3.2 as well
  • This almost certainly affects the oracle backend as well
  • This works as expected on sqlite3

I have attached a sample django project that reproduces the issue.

Steps to reproduce (require docker installation):

  • Spin up a postgres instance: docker run -e POSTGRES_USER=lorem -e POSTGRES_PASSWORD=lorem -e POSTGRES_DB=lorem -p 5432:5432 postgres
  • Extract lorem.zip
  • Create a virtual environment
  • Install django==4.2.1, psycopg2-binary==2.9.6
  • Activate the virtual environment
  • Extract lorem.zip and change into django directory
  • python -m manage migrate
  • python -m manage shell
  • from ipsum.models import *
  • Dolor.objects.create(sit="%")
    • expected behavior: This should return a model instance
    • actual behavior: exception (as described above)
  • Dolor.objects.create(sit="%%").sit
    • expected behavior: exception (as described above)
    • actual behavior: returns "%%"

Attachments (1)

lorem.zip (6.2 KB ) - added by Thomas Kolar 19 months ago.
sample django project that demonstrates the issue

Download all attachments as: .zip

Change History (11)

by Thomas Kolar, 19 months ago

Attachment: lorem.zip added

sample django project that demonstrates the issue

comment:1 by Thomas Kolar, 19 months ago

I have discovered a very hacky workaround that should cover quite a few use cases, in case anyone else is affected by this and needs a fix ASAP. The workaround is to combine a __contains check with a __length check - a string that contains another string and has the same length must be equal.

__contains with a % IS supported - as it happens, the fact that support for this was added is what presumably caused this issue (see above).

In order to do this, the __length lookup must be registered as described in this stackoverflow answer https://stackoverflow.com/a/68274322 . This must obviously happen before migrations are run.

For obvious reasons, I do not endorse this abstruse tomfoolery as an intended solution.

comment:2 by David Sanders, 19 months ago

Triage Stage: UnreviewedAccepted

Thanks for the report 🏆

I've verified the issue on 59262c294d26d2aa9346284519545c0f988bf353

Version 0, edited 19 months ago by David Sanders (next)

comment:3 by David Sanders, 19 months ago

Looks like regression in a8b3f96f6acfa082f99166e0a1cfb4b0fbc0eace

Edit: actually bisected to commit 42e8cf47c7ee2db238bf91197ea398126c546741

Last edited 19 months ago by David Sanders (previous) (diff)

comment:4 by Natalia Bidart, 19 months ago

Cc: Simon Charette added

comment:5 by Simon Charette, 19 months ago

The usage of quote_value in ddl_references.Statement.__str__ is problematic because the later doubles %. I haven't fully investigated what ought to be done to resolve the issue here but it relates to how most databases don't support parametrized DDL.

The issue here relates to a8b3f96f6acfa082f99166e0a1cfb4b0fbc0eace / #30408 as well. Now that we explicitly pass params=None I suspect that'll we'll simply want to remove the % doubling that takes place in SchemaEditor.quote_value.

comment:6 by Simon Charette, 19 months ago

Owner: changed from nobody to Simon Charette
Status: newassigned

comment:7 by Simon Charette, 19 months ago

Has patch: set

comment:8 by Mariusz Felisiak, 19 months ago

Triage Stage: AcceptedReady for checkin

comment:9 by Mariusz Felisiak <felisiak.mariusz@…>, 19 months ago

In e0f8104a:

Refs #34553 -- Split constraint escaping test in subtests.

This ensures that constraint violations are tested in isolation from
each other as an IntegrityError only ensures a least one constraint is
violated.

For example, the assertion added in 42e8cf4 break both the
name_constraint_rhs and the rebate_constraint constraints and thus
doesn't constitute a proper regression test. Refs #32369.

comment:10 by Mariusz Felisiak <felisiak.mariusz@…>, 19 months ago

Resolution: fixed
Status: assignedclosed

In ffff17d4:

Fixed #34553 -- Fixed improper % escaping of literal in constraints.

Proper escaping of % in string literals used when defining constaints
was attempted (a8b3f96f6) by overriding quote_value of Postgres and
Oracle schema editor. The same approach was used when adding support for
constraints to the MySQL/MariaDB backend (1fc2c70).

Later on it was discovered that this approach was not appropriate and
that a preferable one was to pass params=None when executing the
constraint creation DDL to avoid any form of interpolation in the first
place (42e8cf47).

When the second patch was applied the corrective of the first were not
removed which caused % literals to be unnecessary doubled. This flew
under the radar because the existings test were crafted in a way that
consecutive %% didn't catch regressions.

This commit introduces an extra test for exact lookups which
highlights more adequately % doubling problems but also adjust a
previous
endswith test to cover % doubling problems (%\% -> %%\%%).

Thanks Thomas Kolar for the report.

Refs #32369, #30408, #30593.

Note: See TracTickets for help on using tickets.
Back to Top