#35336 closed Bug (fixed)
Adding GeneratedField fails with ProgrammingError when using When on CharField
Reported by: | Adrian Garcia | Owned by: | Simon Charette |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 5.0 |
Severity: | Release blocker | Keywords: | postgres, generatedfield, contains |
Cc: | Triage Stage: | Ready for checkin | |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
Forgive me if I am doing something incorrectly, but I cannot for the life of me add a GeneratedField to one of my models. To replicate this error, I created this test model:
from django.db import models class TestModel(models.Model): description = models.TextField()
Running makemigrations/migrate, then modifying the model to add this contains_heck
field:
from django.db import models class TestModel(models.Model): description = models.TextField() contains_heck = models.GeneratedField( expression=models.Case( models.When(description__icontains="HECK", then=models.Value(True)), default=models.Value(False), output_field=models.BooleanField(), ), output_field=models.BooleanField(), db_persist=True, )
Which generates this migration:
# Generated by Django 5.0.3 on 2024-03-27 20:34 from django.db import migrations, models class Migration(migrations.Migration): dependencies = [ ("core", "00001_initial"), ] operations = [ migrations.AddField( model_name="testmodel", name="contains_heck", field=models.GeneratedField( db_persist=True, expression=models.Case( models.When(description__icontains="HECK", then=models.Value(True)), default=models.Value(False), output_field=models.BooleanField() ), output_field=models.BooleanField(), ), ), ]
And after running python manage.py migrate
I get the following error:
Operations to perform: Apply all migrations: admin, auth, consumer, contenttypes, core, db, sessions Running migrations: Applying core.0002_testmodel_contains_heck...Traceback (most recent call last): File "/opt/project/app/manage.py", line 24, in <module> main() File "/opt/project/app/manage.py", line 20, in main execute_from_command_line(sys.argv) File "/usr/local/lib/python3.11/site-packages/django/core/management/__init__.py", line 442, in execute_from_command_line utility.execute() File "/usr/local/lib/python3.11/site-packages/django/core/management/__init__.py", line 436, in execute self.fetch_command(subcommand).run_from_argv(self.argv) File "/usr/local/lib/python3.11/site-packages/django/core/management/base.py", line 413, in run_from_argv self.execute(*args, **cmd_options) File "/usr/local/lib/python3.11/site-packages/django/core/management/base.py", line 459, in execute output = self.handle(*args, **options) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/django/core/management/base.py", line 107, in wrapper res = handle_func(*args, **kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/django/core/management/commands/migrate.py", line 356, in handle post_migrate_state = executor.migrate( ^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/django/db/migrations/executor.py", line 135, in migrate state = self._migrate_all_forwards( ^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/django/db/migrations/executor.py", line 167, in _migrate_all_forwards state = self.apply_migration( ^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/django/db/migrations/executor.py", line 252, in apply_migration state = migration.apply(state, schema_editor) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/django/db/migrations/migration.py", line 132, in apply operation.database_forwards( File "/usr/local/lib/python3.11/site-packages/django/db/migrations/operations/fields.py", line 108, in database_forwards schema_editor.add_field( File "/usr/local/lib/python3.11/site-packages/django/db/backends/base/schema.py", line 750, in add_field self.execute(sql, params) File "/usr/local/lib/python3.11/site-packages/django/db/backends/postgresql/schema.py", line 46, in execute sql = self.connection.ops.compose_sql(str(sql), params) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/django/db/backends/postgresql/operations.py", line 195, in compose_sql return mogrify(sql, params, self.connection) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/django/db/backends/postgresql/psycopg_any.py", line 22, in mogrify return ClientCursor(cursor.connection).mogrify(sql, params) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/psycopg/client_cursor.py", line 40, in mogrify pgq = self._convert_query(query, params) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/psycopg/client_cursor.py", line 79, in _convert_query pgq.convert(query, params) File "/usr/local/lib/python3.11/site-packages/psycopg/_queries.py", line 208, in convert (self.template, self._order, self._parts) = f(bquery, self._encoding) ^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/psycopg/_queries.py", line 242, in _query2pg_client_nocache parts = _split_query(query, encoding, collapse_double_percent=False) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/psycopg/_queries.py", line 388, in _split_query raise e.ProgrammingError( psycopg.ProgrammingError: only '%s', '%b', '%t' are allowed as placeholders, got '%H'
I took a look with the debugger and I think either Django is incorrectly passing the values to psycopg, or psycopg isn't splitting this SQL statement correctly. I say this because when _split_query
tries to split the query, it ends up matching and consuming the first character of each Value:
[ ('ALTER TABLE "core_testmodel" ADD COLUMN "contains_heck" boolean GENERATED ALWAYS AS (CASE WHEN UPPER("description"::text) LIKE UPPER(\'', "<re.Match object; span=(140, 142), match=b'%H'>"), ("ECK", '<re.Match object; span=(145, 147), match=b"%\'">'), (") THEN true ELSE false END) STORED", "None"), ]
Switching to models.When(description__icontains=models.Value("HECK"), ...) yields a similar error except it's complaining about
%' instead of
%H`.
If the contains_heck
field is created at the same time as the rest of the model, everything works in either the description__icontains="HECK"
or description__icontains=models.Value("HECK")
configuration.
Versions:
- Postgres 14.9
- Python 3.11.2
- Django 5.0.3
- psycopg[binary] 3.1.18
Change History (17)
comment:1 by , 7 months ago
Component: | Migrations → Database layer (models, ORM) |
---|---|
Keywords: | contains added; _split_query removed |
Owner: | changed from | to
Severity: | Normal → Release blocker |
Status: | new → assigned |
Triage Stage: | Unreviewed → Accepted |
comment:2 by , 7 months ago
Thanks for looking into this, I'm glad to know that it's a bug and not me doing something incorrectly. I'll use iregex
as a workaround until 5.0.4.
Out of curiosity, why does this issue only present itself when adding a field? Does Django handle SQL generation differently for table creation vs field addition?
comment:3 by , 7 months ago
Has patch: | set |
---|---|
Patch needs improvement: | set |
Out of curiosity, why does this issue only present itself when adding a field? Does Django handle SQL generation differently for table creation vs field addition?
I'm still laying down the details in this MR but the short answer is yes.
The DBAPI Cursor.execute
method can be used in two ways when dealing with parameters. Either (sql: str, params: [])
which will delegate parametrization to the library or (sql: str, params: None)
which won't attempt to parametrize the SQL. This is an important difference and requires special care when dealing with %
literals in parameters as it's the symbol use for parametrization placeholders by most libraries.
Well some backend simply don't support parametrization in some DDL statements such as GENERATED
columns declaration, in other words you can't do
cursor.execute( "ALTER TABLE invoice ADD COLUMN discount_display text GENERATED ALWAYS discount || %s;", ["%"] )
So you must do
cursor.execute( "ALTER TABLE invoice ADD COLUMN discount_display text GENERATED ALWAYS AS discount || '%';", None )
Well in the case of SchemaEditor.create_model
we liberally do one or the other and in your case, because you don't have any other field with a db_default
assuming your use Postgres, it does the right thing.
In the case of SchemaEditor.add_field
however we always do params: list
, which forces parametrization, and thus
cursor.execute( "ALTER TABLE testmodel ADD COLUMN contains_heck text GENERATED ALWAYS AS UPPER(description) LIKE '%HECK%';", [] )
Crashes as there is an attempt to interpolate the two %
while params
is empty.
The schema migration has historically use parametrizable DDL in the past when possible but this has become increasingly complex as more and more features were added (indexes, constraints, db_default
, GeneratedField
) to a point where I don't think we can continue doing so but the tricky part is that we can't make such a change in a bug fix release without risking to introduce further problems. This is made even worst by the fact that objects as Index
and Constraint
subclasses return already interpolated SQL (their as_sql
methods return "CHECK UPPER(description) LIKE '%HECK%'"
instead of ("CHECK UPPER(description) LIKE %s", "%HECK%")
so even if we wanted to use it entirely to the backend until the last minute by using quote_value
ourselves we can't as by that point we have a mixture of interpolated and non-interpolated SQL.
comment:4 by , 7 months ago
Patch needs improvement: | unset |
---|
comment:5 by , 7 months ago
Thanks for the rundown, it really helped me understand what was going on when I followed the execution with a debugger. The ORM is easily my favorite part of Django and it's no surprise that it is complicated as hell, are there plans for a more unified SQL translator in a future release? I'd be interested in following the progress of that.
I tested the fix and it works great, might not even need to use the workaround if 5.0.4 gets released in the next few weeks. Thanks for the speedy patch.
comment:6 by , 7 months ago
are there plans for a more unified SQL translator in a future release
I think the many regressions we've run into in the past releases in escaping DDL (the subset of SQL for defining objects) in the schema editor (the component behind migrations) warrants spending time having a less convoluted approach.
I think that we should experiment with having backends denote whether or not they support parametrized DDL (the ability to call Cursor.execute(ddl: str, params: tuple)
over having the schema editor do the %
formatting itself), make sure all parts of the scheme editor return (str, tuple)
and never simply str
, and then adjust SchemaEditor.execute
to do cursor.execute(ddl, params)
when connection.features.support_parametrized_dll
and do cursor.execute(ddl % map(self.quote_value, params), None)
otherwise.
comment:9 by , 7 months ago
Triage Stage: | Accepted → Ready for checkin |
---|
comment:11 by , 7 months ago
Woohoo, thanks again everyone for all the effort. You managed to have 5.0.4 released before my PR was merged lol.
This can be reproduced on
psycopg2
andpsycopg>=3
and is reminiscent of the series of issues we had to fix withConstraint
's usage of__contains
lookup which require proper%
escaping (#30408, #34553, #32369)tests/schema/tests.py