Opened 4 years ago

Closed 4 years ago

#31815 closed Bug (fixed)

CheckConstraint() with unicode parameters crashes on PostgreSQL.

Reported by: JSAustin
Owned by: Mariusz Felisiak
Component: Database layer (models, ORM)
Version: dev
Severity: Normal
Keywords: encoding utf-8
Cc: Simon Charette
Triage Stage: Accepted
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description (last modified by JSAustin)

I am trying to create a model with a check constraint on a field whose values include non-ASCII (UTF-8 encoded) characters. Specifically, I'm trying to constrain scientific unit strings. The problematic one is 'µg/mL'. The µ symbol is encoded as b'\xc2\xb5' in UTF-8. When I add this as one of the check constraint values, I get this error:

  File "D:\DjangoSite\venv\lib\site-packages\django\db\backends\postgresql\schema.py", line 38, in quote_value
    return psycopg2.extensions.adapt(value).getquoted().decode()
UnicodeEncodeError: 'latin-1' codec can't encode character '\u03bc' in position 0: ordinal not in range(256)

The migration is:
migrations.AddConstraint(
    model_name='biomarkertestinfo',
    constraint=models.CheckConstraint(
        check=models.Q(unit__in=['μg/mL', 'ng/mL', 'pg/mL', 'U/L', 'mIU/mL', 'kIU/mL']),
        name='constrained biomarker unit choices',
    ),
)

I am running PostgreSQL 12 locally for this, with a psycopg2 connection.
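
A note on the character involved: the traceback names '\u03bc', the Greek small letter mu that appears in the migration. That is a different code point from the micro sign U+00B5, whose UTF-8 bytes are quoted in the description. The stdlib-only sketch below is not part of the ticket; it only illustrates why one of the two look-alike characters fits in latin-1 and the other does not. The helper name `describe` is made up for illustration.

```python
import unicodedata

MICRO_SIGN = "\u00b5"  # 'µ', part of latin-1
GREEK_MU = "\u03bc"    # 'μ', the character named in the traceback


def describe(ch):
    """Return the Unicode name and the UTF-8 / latin-1 encodings of one character."""
    info = {
        "name": unicodedata.name(ch),
        "utf8": ch.encode("utf-8"),
    }
    try:
        info["latin1"] = ch.encode("latin-1")
    except UnicodeEncodeError:
        # Code points above U+00FF have no latin-1 representation.
        info["latin1"] = None
    return info


micro = describe(MICRO_SIGN)
mu = describe(GREEK_MU)

print(micro)
print(mu)
```

Under this reading, any string literal containing a code point above U+00FF would hit the same error when it is encoded as latin-1, regardless of which unit string it came from.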

Attachments (1)

tests-31815.diff (1.6 KB) - added by Mariusz Felisiak 4 years ago.
Regression test.

Change History (6)

comment:1 by JSAustin, 4 years ago

Description: modified (diff)

comment:2 by Mariusz Felisiak, 4 years ago

Cc: Simon Charette added
Summary: changed from "Can't Create PostgreSQL Database Constraints with non latin-1 encodings" to "CheckConstraint() with unicode parameters crashes on PostgreSQL."
Triage Stage: changed from Unreviewed to Accepted
Version: changed from 3.0 to master

Thanks for this ticket. I attached a regression test. The following patch (based on discussion) fixes this issue for me:

diff --git a/django/db/backends/postgresql/schema.py b/django/db/backends/postgresql/schema.py
index 7687c37fe7..b4b6f0dabe 100644
--- a/django/db/backends/postgresql/schema.py
+++ b/django/db/backends/postgresql/schema.py
@@ -39,7 +39,10 @@ class DatabaseSchemaEditor(BaseDatabaseSchemaEditor):
         if isinstance(value, str):
             value = value.replace('%', '%%')
         # getquoted() returns a quoted bytestring of the adapted value.
-        return psycopg2.extensions.adapt(value).getquoted().decode()
+        adapted = psycopg2.extensions.adapt(value)
+        if isinstance(value, str):
+            adapted.encoding = 'utf-8'
+        return adapted.getquoted().decode()
 
     def _field_indexes_sql(self, model, field):
         output = super()._field_indexes_sql(model, field)

Can you confirm that it works for you?

Reproduced at f65454801bfa13fc043fee0aca8f49af41380683.

by Mariusz Felisiak, 4 years ago

Attachment: tests-31815.diff added

Regression test.

comment:3 by Mariusz Felisiak, 4 years ago

Owner: changed from nobody to Mariusz Felisiak
Status: changed from new to assigned

comment:4 by Mariusz Felisiak, 4 years ago

Has patch: set

comment:5 by GitHub <noreply@…>, 4 years ago

Resolution: fixed
Status: changed from assigned to closed

In f4e93919:

Fixed #31815 -- Fixed schema value encoding on PostgreSQL.
