Opened 4 years ago

Closed 4 years ago

Last modified 3 years ago

#31973 closed Bug (invalid)

TypeError loading data in JSONField if DB has native JSON support

Reported by: Adam Alton Owned by: nobody
Component: Database layer (models, ORM) Version: 3.1
Severity: Normal Keywords:
Cc: Thiago Bellini Ribeiro, giacomo Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Following today's security release, I've upgraded from Django 3.1 to 3.1.1 with no other code changes, and suddenly I'm getting this error when loading a model instance from the DB:

list(MyModel.objects.all())
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/usr/local/lib/python3.8/site-packages/django/db/models/query.py", line 287, in __iter__
    self._fetch_all()
  File "/usr/local/lib/python3.8/site-packages/django/db/models/query.py", line 1303, in _fetch_all
    self._result_cache = list(self._iterable_class(self))
  File "/usr/local/lib/python3.8/site-packages/django/db/models/query.py", line 70, in __iter__
    for row in compiler.results_iter(results):
  File "/usr/local/lib/python3.8/site-packages/django/db/models/sql/compiler.py", line 1100, in apply_converters
    value = converter(value, expression, connection)
  File "/usr/local/lib/python3.8/site-packages/django/db/models/fields/json.py", line 74, in from_db_value
    return json.loads(value, cls=self.decoder)
  File "/usr/local/lib/python3.8/json/__init__.py", line 341, in loads
    raise TypeError(f'the JSON object must be str, bytes or bytearray, '
TypeError: the JSON object must be str, bytes or bytearray, not list

From looking at the recent changes between 3.0 and 3.1, I'm pretty sure that the problem was introduced in this commit: https://github.com/django/django/commit/0be51d2226fce030ac9ca840535a524f41e9832c

Specifically, in django/db/models/fields/json.py, the removal of this from from_db_value:

    if connection.features.has_native_json_field and self.decoder is None:
            return value

I'm pretty sure that this means that if the value has already been decoded and the DB has got native JSON support, then it continues through this function, attempting to decode the already-decoded data, and therefore gets a TypeError. But note that the exception catching around json.loads only catches json.JSONDecodeError, not TypeError, and hence it dies.

I will see if I can write a test to prove this (or run the existing tests with postgres, which has native JSON support). But I'm reporting this now, as it seems like a fairly serious breaking change in 3.1.1. And maybe someone else who understands the changes in the mentioned commit can fix this faster than I can.

Change History (24)

comment:1 by Mariusz Felisiak, 4 years ago

Can you provide more details? Are you using a custom decoder/encoder? The only reason that I can imagine is that you use json instead of jsonb datatype in your database.

comment:2 by Mariusz Felisiak, 4 years ago

I will see if I can write a test to prove this (or run the existing tests with postgres, which has native JSON support)

The existing tests, which cover also lists in JSONFields, work on PostgreSQL.

comment:3 by Mariusz Felisiak, 4 years ago

Component: UncategorizedDatabase layer (models, ORM)
Resolution: needsinfo
Status: newclosed
Type: UncategorizedBug

comment:4 by Adam Alton, 4 years ago

Thanks for your swift reply, @felixxm. You're right, this is caused by the database having the fields as type json rather than jsonb, even though my DB schema has only ever been altered via Django's migrations.

So (contrary to my original report) there is not a bug in value_from_db. But…

The question now is whether there's a bug in Django's migration logic.

The fields which were getting this error were originally using a custom JSONField, which was a subclass of models.TextField, and so presumably their original DB column type was text. I then changed them to use the native models.JSONField, and got Django to automatically create a migration (which contains a migrations.AlterField operation for each field, converting them to the native models.JSONField).

The weird thing is that that migration converted the DB columns to type json rather than jsonb.

I can fix that in my own DB, no problem. But the remaining question is this: is that behaviour unexpected and should we fix it?

It seems unexpected to me that Django is happily creating the migration and applying it, but that the DB then ends up in an unusable state. Presumably Django should either say "I can't do that change, write your own migration" or it should apply the correct change.

Happy to make a new ticket if the decision is that this is a valid bug.

Last edited 4 years ago by Adam Alton (previous) (diff)

in reply to:  4 comment:5 by Mariusz Felisiak, 4 years ago

The fields which were getting this error were originally using a custom JSONField, which was a subclass of models.TextField, and so presumably their original DB column type was text. I then changed them to use the native models.JSONField, and got Django to automatically create a migration (which contains a migrations.AlterField operation for each field, converting them to the native models.JSONField).

The weird thing is that that migration converted the DB columns to type json rather than jsonb.

Django have never used the json datatype, so it's not possible that it used json type in migrations. Can you provide more detail about this custom JSONField? Was it your own implementation or a field provided by 3rd-party package?

I can fix that in my own DB, no problem. But the remaining question is this: is that behaviour unexpected and should we fix it?

I cannot reproduce it (custom JSONField(TextField) -> models.JSONField works for me). Can you provide a small sample project that reproduces this issue?

comment:6 by Thomas J Bradley, 4 years ago

I’m getting exactly the same problem: a TypeError on my JSONFields. I was upgrading directly from Django 3.0.10 to 3.1.1 and none of the JSONFields load properly. I downgraded to 3.1.0 and they do seem to work properly again.

The problem seems to be present when the data in the JSONField is a bool, list or dict.

I’m not using any custom JSONField stuff, only the built-ins. It does appear that because only the json.JSONDecodeError is being caught as an exception, its missing the case where—because of the DBs built-in JSON support—the data has already been converted.

comment:7 by Mariusz Felisiak, 4 years ago

I’m not using any custom JSONField stuff, only the built-ins. It does appear that because only the json.JSONDecodeError is being caught as an exception, its missing the case where—because of the DBs built-in JSON support—the data has already been converted.

The data shouldn't be converted because Django uses a custom stub converter. Please provide a sample project that reproduces this issue.

comment:8 by Agris Ameriks, 4 years ago

The example code that shows broken functionality is here: https://code.djangoproject.com/ticket/31956#comment:8

I think that JSONField should return the dict either way and it was working fine in 3.1.

Previously fixed ordering bug was related to "with a custom decoder", but in code example I'm not specifying different decoder.

I think that this will be "blocker" for users to be able to upgrade to next versions if they depend on JSONField being converted to dict automatically.

comment:9 by Mariusz Felisiak, 4 years ago

Agris, that's a different issue, see #31991.

comment:10 by Thiago Bellini Ribeiro, 4 years ago

I too am suffering from this issue.

I don't use any custom stuff on my JSONFields and also don't manipulate it in any weird way, I just save/load lists/dicts from them.

They were all from postgresql's contrib package before and I changed them to models.JSONField when I migrated to 3.1. Upgrading to 3.1.1 gives me the exact error described here even when just opening the django's admin interface for the model. Took a long time for me to figure it out and the error went away when I downgrade django to 3.1.

comment:11 by Thiago Bellini Ribeiro, 4 years ago

Cc: Thiago Bellini Ribeiro added

comment:12 by Mariusz Felisiak, 4 years ago

Thiago, using JSONField's in admin works for me. Can you provide a sample project that reproduces this issue?

in reply to:  12 comment:13 by Thiago Bellini Ribeiro, 4 years ago

Replying to felixxm:

Thiago, using JSONField's in admin works for me. Can you provide a sample project that reproduces this issue?

OMG I cannot reproduce the issue anymore. I really had this happening at the day after the 3.1.1's release where I spent some time trying to understand what was wrong, and after downgrading to 3.1.0 it worked fine again.

Would like to say that "well, it is working now so it is fine", but the fact that I'm sure that something did indeed happen that day and other people here in this ticket says that something also happened makes me unsure..

comment:14 by Adam Alton, 4 years ago

I've now dug further into what happened in my case.

It turns out that I actually wrote a custom migration when I moved from the custom JSONField (which subclassed models.TextField) to the built-in models.JSONField. I think I must have underestimated how clever Django would be in doing that conversion, and so I wrote a custom migrations.RunSQL, assuming that Django wouldn't do the right conversion for me. My custom migration did something like this:

ALTER TABLE my_table ALTER COLUMN my_field TYPE JSON USING my_field::JSON;

I'd totally forgotten about this (and failed to see it when I looked through my migrations before I raised this ticket).

So in my case, I'm pretty sure that the problem was my own fault for doing my_field::JSON instead of my_field::JSONB. (And for assuming that Django wouldn't be clever enough to do this for me!) Sorry for the false alarm on this ticket, and thank you @felixxm for your help.

I hope that this info helps the other people who've got this error to get to the bottom of what's going on in their case. But whatever it is, my guess is that it's not a bug in Django!

comment:15 by Mariusz Felisiak, 4 years ago

Adam, thanks for sharing details.

comment:16 by Mariusz Felisiak, 4 years ago

Resolution: needsinfoinvalid

comment:17 by Thomas J Bradley, 4 years ago

This is a general comment for other people who many run into this problem—I diagnosed my problem with a lot of dependency checks.

My problem was specifically related to django-db-geventpool—at the time it wasn’t compatible with the new JSONField.
https://github.com/jneight/django-db-geventpool

Just today ran into this exact same problem again because a reversion in the most recent version of django-db-geventpool, v3.2.4 which is not compatible with Django/3.1—I’ll have to wait to upgrade this dependency until after their next release.

comment:18 by Andrew, 4 years ago

I don't think this issue is invalid. I have the exact same reproducible problem on a new django project with 3.1.0 vs 3.1.1.

To fix it in the raw sql, you can append ::json to field you are querying

class Table(models.Model):
    data = models.JSONField()  # jsonb column

select
    data::json  as data_returned_as_a_dict,
    data::jsonb as data_broken,
    data        as data_also_broken
from
    app_table

3.1.1 really did introduce issues with jsonb fields for no good reason that I can see, since simply lying about the type works just fine!

Last edited 4 years ago by Andrew (previous) (diff)

comment:19 by giacomo, 4 years ago

Cc: giacomo added

I'm also experiencing this error on Django >= 3.1.1 when using jsonb_array_length on a JSONField that contains a list:

class MyModel(models.Model):
    field = JSONField(default=list)

>>> MyModel.objects.annotate(field_len=Func(F('field'), function='jsonb_array_length'))

Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/home/ubuntu/text-annotation/lib/python3.8/site-packages/django/db/models/query.py", line 263, in __repr__
    data = list(self[:REPR_OUTPUT_SIZE + 1])
  File "/home/ubuntu/text-annotation/lib/python3.8/site-packages/django/db/models/query.py", line 287, in __iter__
    self._fetch_all()
  File "/home/ubuntu/text-annotation/lib/python3.8/site-packages/django/db/models/query.py", line 1308, in _fetch_all
    self._result_cache = list(self._iterable_class(self))
  File "/home/ubuntu/text-annotation/lib/python3.8/site-packages/django/db/models/query.py", line 70, in __iter__
    for row in compiler.results_iter(results):
  File "/home/ubuntu/text-annotation/lib/python3.8/site-packages/django/db/models/sql/compiler.py", line 1100, in apply_converters
    value = converter(value, expression, connection)
  File "/home/ubuntu/text-annotation/lib/python3.8/site-packages/django/db/models/fields/json.py", line 83, in from_db_value
    return json.loads(value, cls=self.decoder)
  File "/home/ubuntu/.pyenv/versions/3.8.6/lib/python3.8/json/__init__.py", line 341, in loads
    raise TypeError(f'the JSON object must be str, bytes or bytearray, '
TypeError: the JSON object must be str, bytes or bytearray, not int

Things that I checked after finding this ticket:

  • the JSONField was "converted" from the JSONField of the old Postgres contrib package (afaik no actual migration happened)
  • in my table, the corresponding field was correctly set to jsonb
  • downgrading to Django 3.1 makes it work

I've resorted to using raw SQL for this query, as suggested by @Andrew.

in reply to:  19 ; comment:20 by Mariusz Felisiak, 4 years ago

Replying to giacomo:

I'm also experiencing this error on Django >= 3.1.1 when using jsonb_array_length on a JSONField that contains a list:

class MyModel(models.Model):
    field = JSONField(default=list)

>>> MyModel.objects.annotate(field_len=Func(F('field'), function='jsonb_array_length'))

Do you really want to get JSON as a result of JSONB_ARRAY_LENGTH()? I think you missed output_field, e.g.

MyModel.objects.annotate(field_len=Func(F('field'), function='jsonb_array_length', output_field=IntegerField()))

works fine.

in reply to:  20 comment:21 by giacomo, 4 years ago

Replying to Mariusz Felisiak:

Do you really want to get JSON as a result of JSONB_ARRAY_LENGTH()? I think you missed output_field, e.g.

MyModel.objects.annotate(field_len=Func(F('field'), function='jsonb_array_length', output_field=IntegerField()))

works fine.

Thanks, I missed that part of the Func documentation and I was also confused because it stopped working in a minor version. The documentation could be improved, though: output_field is not included in the class attributes and is only mentioned at the end without an example...

comment:22 by Garry Polley, 3 years ago

I'm not sure of the best spot to put this information. I upgraded our code base from Django 2.2.x to 3.2.x . In doing that upgrade I got a similar error seen above:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/celery/app/trace.py", line 412, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/celery/app/trace.py", line 704, in __protected_call__
    return self.run(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/celery/app/autoretry.py", line 50, in run
    raise task.retry(exc=exc, **retry_kwargs)
  File "/usr/local/lib/python3.8/site-packages/celery/app/task.py", line 706, in retry
    raise_with_context(exc)
  File "/usr/local/lib/python3.8/site-packages/celery/app/autoretry.py", line 35, in run
    return task._orig_run(*args, **kwargs)
OUR CODE
    for preference in get_queryset_page(
  File "/usr/local/lib/python3.8/site-packages/django/db/models/query.py", line 280, in __iter__
    self._fetch_all()
  File "/usr/local/lib/python3.8/site-packages/django/db/models/query.py", line 1324, in _fetch_all
    self._result_cache = list(self._iterable_class(self))
  File "/usr/local/lib/python3.8/site-packages/django/db/models/query.py", line 68, in __iter__
    for row in compiler.results_iter(results):
  File "/usr/local/lib/python3.8/site-packages/django/db/models/sql/compiler.py", line 1122, in apply_converters
    value = converter(value, expression, connection)
  File "/usr/local/lib/python3.8/site-packages/django/db/models/fields/json.py", line 83, in from_db_value
    return json.loads(value, cls=self.decoder)
  File "/usr/local/lib/python3.8/json/__init__.py", line 341, in loads
    raise TypeError(f'the JSON object must be str, bytes or bytearray, '
TypeError: the JSON object must be str, bytes or bytearray, not dict

Our database uses the json datatype in Postgres rather than jsonb. The database is not managed by the Django application, so we cannot change the datatype.

The odd part here is that it is a dictionary coming out of the postgres backend. I think whenever this change went in https://github.com/django/django/commit/0be51d2226fce030ac9ca840535a524f41e9832c it changed how existing json types were working.

The solution I'm going with right now is to use the current JSONField implementation and do check before the json.loads call to see if value is a dictionary type.

class JSONFieldJSONType(JSONField):
    """
    Custom JSON field because our postgres uses json type and not jsonb.

    Details on these changes within Django can be seen here:

    * https://code.djangoproject.com/ticket/31973
    * https://code.djangoproject.com/ticket/31956#comment:8

    PR that changed behavior for regular json type:
    https://github.com/django/django/commit/0be51d2226fce030ac9ca840535a524f41e9832c

    """

    def from_db_value(self, value, expression, connection):
        if value is None:
            return value
        # Some backends (SQLite at least) extract non-string values in their
        # SQL datatypes.
        if isinstance(expression, KeyTransform) and not isinstance(value, str):
            return value
        try:
            # Custom implementation for how our data comes out of our postgres
            # connection.
            if isinstance(value, dict):
                data_value = self.get_prep_value(value)
            else:
                data_value = value
            return json.loads(data_value, cls=self.decoder)
        except json.JSONDecodeError:
            return value
Last edited 3 years ago by Garry Polley (previous) (diff)

comment:23 by Mariusz Felisiak, 3 years ago

Garry, using json data type was never supported, see #32135.

comment:24 by morpheism, 3 years ago

FWIW, I have also encountered this error when upgrading a site from Django 2.2 to Django 3.2. Perhaps the PostgreSQL json column type was never intentionally supported, but it appears to have worked until 3.1.1.

+1 Garry for the workaround.

Note: See TracTickets for help on using tickets.
Back to Top