Opened 3 months ago

Closed 3 months ago

Last modified 3 months ago

#35149 closed Bug (fixed)

JSONField db_default requires wrapping values in json.dumps()

Reported by: David Sanders Owned by: Simon Charette
Component: Database layer (models, ORM) Version: 5.0
Severity: Release blocker Keywords: JSONField db_default
Cc: Simon Charette, Lily Foote, Carlton Gibson Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

This caught me out as I was expecting db_default to accept the same types as what would be set on model instance fields.

tl;dr

This will cause psycopg to raise ProgrammingError: cannot adapt type 'dict'

class Foo(models.Model):
    json = models.JSONField(db_default={"foo": "bar"})

however this works:

class Foo(models.Model):
    json = models.JSONField(db_default=json.dumps({"foo": "bar"}))

At the very least this may be a doc update but I feel like it violates principle of least astonishment to not use the same type that JSONField attrs use. If we do decide to make this consistent it seems like BaseSchemaEditor.db_default_sql() [1] just needs to be aware of the field's type and adapt it correctly – something like calling get_db_prep_value()?

[1] https://github.com/django/django/blob/2005530920e7c290bb1cf7d9ada155250faa81ad/django/db/backends/base/schema.py#L411

Further details

Migration generated without json.dumps:

class Migration(migrations.Migration):
    initial = True

    dependencies = []

    operations = [
        migrations.CreateModel(
            name="Foo",
            fields=[
                (
                    "id",
                    models.BigAutoField(
                        auto_created=True,
                        primary_key=True,
                        serialize=False,
                        verbose_name="ID",
                    ),
                ),
                ("json", models.JSONField(db_default=models.Value({"foo": "bar"}))),
            ],
        ),
    ]

Traceback when applying above migration:

django-sample % dj migrate jsonfield_dbdefault
Operations to perform:
  Apply all migrations: jsonfield_dbdefault
Running migrations:
  Applying jsonfield_dbdefault.0001_initial...Traceback (most recent call last):
  File "/path/to/projects/django-sample/./manage.py", line 22, in <module>
    main()
  File "/path/to/projects/django-sample/./manage.py", line 18, in main
    execute_from_command_line(sys.argv)
  File "/path/to/projects/django/django/core/management/__init__.py", line 442, in execute_from_command_line
    utility.execute()
  File "/path/to/projects/django/django/core/management/__init__.py", line 436, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/path/to/projects/django/django/core/management/base.py", line 413, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/path/to/projects/django/django/core/management/base.py", line 459, in execute
    output = self.handle(*args, **options)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/projects/django/django/core/management/base.py", line 107, in wrapper
    res = handle_func(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/projects/django/django/core/management/commands/migrate.py", line 356, in handle
    post_migrate_state = executor.migrate(
                         ^^^^^^^^^^^^^^^^^
  File "/path/to/projects/django/django/db/migrations/executor.py", line 135, in migrate
    state = self._migrate_all_forwards(
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/projects/django/django/db/migrations/executor.py", line 167, in _migrate_all_forwards
    state = self.apply_migration(
            ^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/projects/django/django/db/migrations/executor.py", line 252, in apply_migration
    state = migration.apply(state, schema_editor)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/projects/django/django/db/migrations/migration.py", line 132, in apply
    operation.database_forwards(
  File "/path/to/projects/django/django/db/migrations/operations/models.py", line 97, in database_forwards
    schema_editor.create_model(model)
  File "/path/to/projects/django/django/db/backends/base/schema.py", line 485, in create_model
    self.execute(sql, params or None)
  File "/path/to/projects/django/django/db/backends/postgresql/schema.py", line 46, in execute
    sql = self.connection.ops.compose_sql(str(sql), params)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/projects/django/django/db/backends/postgresql/operations.py", line 193, in compose_sql
    return mogrify(sql, params, self.connection)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/projects/django/django/db/backends/postgresql/psycopg_any.py", line 22, in mogrify
    return ClientCursor(cursor.connection).mogrify(sql, params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/projects/django-sample/.direnv/python-3.11/lib/python3.11/site-packages/psycopg/client_cursor.py", line 40, in mogrify
    pgq = self._convert_query(query, params)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/projects/django-sample/.direnv/python-3.11/lib/python3.11/site-packages/psycopg/client_cursor.py", line 79, in _convert_query
    pgq.convert(query, params)
  File "/path/to/projects/django-sample/.direnv/python-3.11/lib/python3.11/site-packages/psycopg/_queries.py", line 213, in convert
    self.dump(vars)
  File "/path/to/projects/django-sample/.direnv/python-3.11/lib/python3.11/site-packages/psycopg/_queries.py", line 223, in dump
    self.params = tuple(
                  ^^^^^^
  File "/path/to/projects/django-sample/.direnv/python-3.11/lib/python3.11/site-packages/psycopg/_queries.py", line 224, in <genexpr>
    self._tx.as_literal(p) if p is not None else b"NULL" for p in params
    ^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/projects/django-sample/.direnv/python-3.11/lib/python3.11/site-packages/psycopg/_transform.py", line 198, in as_literal
    dumper = self.get_dumper(obj, PY_TEXT)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/projects/django-sample/.direnv/python-3.11/lib/python3.11/site-packages/psycopg/_transform.py", line 245, in get_dumper
    raise ex from None
  File "/path/to/projects/django-sample/.direnv/python-3.11/lib/python3.11/site-packages/psycopg/_transform.py", line 243, in get_dumper
    dcls = self.adapters.get_dumper(key, format)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/projects/django-sample/.direnv/python-3.11/lib/python3.11/site-packages/psycopg/_adapters_map.py", line 223, in get_dumper
    raise e.ProgrammingError(
psycopg.ProgrammingError: cannot adapt type 'dict' using placeholder '%t' (format: TEXT)

Change History (18)

comment:1 by Simon Charette, 3 months ago

Severity: NormalRelease blocker
Triage Stage: UnreviewedAccepted

This is definitely a bug.

The JSONField could have an explicit encoder assigned that is required to be used for the db_default and it should be considered as well.

Two questions for you David

  1. If you edit the migration to assign an output_field to the Value does it address the issue? e.g. models.JSONField(db_default=models.Value({"foo": "bar"}, JSONField()))
  2. Can you also reproduce on psycopg2?

comment:2 by David Sanders, 3 months ago

  1. Nice, yes editing the migration to define output_field=JSONField() works 👍
  2. Yes reproducible on psycopg2

comment:3 by Simon Charette, 3 months ago

Ok then the solution is likely related IMO.

We could either have the current field being assigned as output_field of the Value at deconstruction time (without db_default to prevent recursions 😅) or, likely more nicely, when db_default associated is generated if we're dealing with a Value and it doesn't have an explicit _output_field then use the current field.

I'm not sure why we wrap db_default in Value at initialization time in the first place. The fact that a provided value doesn't adhere to the expression protocol seems like it could have been a nice way to provide the Value wrapper only when necessary at run time instead.

Some ideas in this branch. Ultimately we'll need to find a way to bind Value._output_field is None to the current field, not wrapping db_default in Value in the first place seems like it makes it easier to achieve but it does cause all migrations generated since 5.0 release that make use of db_default with a literal require a new noop migration.

comment:4 by David Sanders, 3 months ago

I like it 👍

The low impact change I was thinking was to assign field as the output field in db_default_sql() however I feel that centralising this is better.

2 questions: If someone provided supplied an expression without an output_field, and it's not resolvable, should your cached property try to set it?

Also should there be a check to warn if someone supplied an expression with a resolved type not matching the field?

eg currently setting something like foo = models.IntegerField(db_default="asdf") will happily generate a migration but will error when it comes time to migrate.

comment:5 by Natalia Bidart, 3 months ago

Cc: Simon Charette added

Simon, David, thanks for the details so far! Would any of you be available to provide a fix for this?

comment:6 by Simon Charette, 3 months ago

If someone provided supplied an expression without an output_field, and it's not resolvable, should your cached property try to set it?

I feel like implicit output_field assignment only makes sense in a context where we offer no way for the user to set it such as when they provide a non-expression themselves. In all other cases the user can provide an explicit output_field so it feels like we'd be guessing for them.

Also should there be a check to warn if someone supplied an expression with a resolved type not matching the field?

That's a good one. I'd say we might want to consider a check in future versions but not necessarily backport such check to 5.0 and have this work included as part of this ticket?

Simon, David, thanks for the details so far! Would any of you be available to provide a fix for this?

I'm happy to provide a fix myself or let someone familiar with the db_default work do it as well and provide a review. Were you interested in taking the work when you reported the issue David?

comment:7 by David Sanders, 3 months ago

I feel like implicit output_field assignment only makes sense in a context where we offer no way for the user to set it such as when they provide a non-expression themselves. In all other cases the user can provide an explicit output_field so it feels like we'd be guessing for them.

Yep agreed, so thinking more about this:

when they provide a non-expression themselves.

This is handled when you wrap it with Value so all good ✓

in all other cases

What made me wonder was this case foo = JSONField(db_default=Value({'foo': 'bar'})) however if we setup the warning one option may be to include warning about expressions that don't resolve like this.

For complex expressions that don't resolve ideally an output_field should be supplied. Implicit setting output_field isn't helpful here 👍

That's a good one. I'd say we might want to consider a check in future versions but not necessarily backport such check to 5.0 and have this work included as part of this ticket?

Yup not as a backport 👍 As per process will ask on forum before creating a ticket.

Were you interested in taking the work when you reported the issue David?

Not especially 🤷‍♂️ I feel like you've already authored a solution ☺️

comment:8 by Simon Charette, 3 months ago

Owner: changed from nobody to Simon Charette
Status: newassigned

comment:9 by Lily Foote, 3 months ago

Cc: Lily Foote added

comment:10 by Carlton Gibson, 3 months ago

Cc: Carlton Gibson added

comment:11 by Natalia Bidart, 3 months ago

Thank you Simon for grabbing this ticket. Is there anything that we/I can do to help?

comment:12 by Simon Charette, 3 months ago

I should be able to work on it this weekend, this have been a bit hectic at work this week. If someone wants to pick it up to work begins sooner feel free to do so, I'm happy to review the changes.

comment:13 by Simon Charette, 3 months ago

Has patch: set

comment:14 by Mariusz Felisiak, 3 months ago

Triage Stage: AcceptedReady for checkin

comment:15 by Mariusz Felisiak <felisiak.mariusz@…>, 3 months ago

In fe1cb62:

Refs #35149 -- Made equivalent db_default alterations noops.

This allows for an easier transition of preserving the literal nature of
non-compilable db_default.

comment:16 by Mariusz Felisiak <felisiak.mariusz@…>, 3 months ago

Resolution: fixed
Status: assignedclosed

In e67d7d70:

Fixed #35149 -- Fixed crashes of db_default with unresolvable output field.

Field.db_default accepts either literal Python values or compilables
(as_sql) and wrap the former ones in Value internally.

While 1e38f11 added support for automatic resolving of output fields for
types such as str, int, float, and other unambigous ones it's cannot do
so for all types such as dict or even contrib.postgres and contrib.gis
primitives.

When a literal, non-compilable, value is provided it likely make the
most sense to bind its output field to the field its attached to avoid
forcing the user to provide an explicit Value(output_field).

Thanks David Sanders for the report.

comment:17 by Mariusz Felisiak <felisiak.mariusz@…>, 3 months ago

In 914eee1:

[5.0.x] Refs #35149 -- Made equivalent db_default alterations noops.

This allows for an easier transition of preserving the literal nature of
non-compilable db_default.

Backport of fe1cb62f5c3f87fafc4a6b52fee2ccc6c80c41e2 from main

comment:18 by Mariusz Felisiak <felisiak.mariusz@…>, 3 months ago

In 761946f8:

[5.0.x] Fixed #35149 -- Fixed crashes of db_default with unresolvable output field.

Field.db_default accepts either literal Python values or compilables
(as_sql) and wrap the former ones in Value internally.

While 1e38f11 added support for automatic resolving of output fields for
types such as str, int, float, and other unambigous ones it's cannot do
so for all types such as dict or even contrib.postgres and contrib.gis
primitives.

When a literal, non-compilable, value is provided it likely make the
most sense to bind its output field to the field its attached to avoid
forcing the user to provide an explicit Value(output_field).

Thanks David Sanders for the report.

Backport of e67d7d70fa10c06aca36b9057f82054eda45269d from main

Note: See TracTickets for help on using tickets.
Back to Top