#31973 closed Bug (invalid)
TypeError loading data in JSONField if DB has native JSON support
Reported by: | Adam Alton | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 3.1 |
Severity: | Normal | Keywords: | |
Cc: | Thiago Bellini Ribeiro, giacomo | Triage Stage: | Unreviewed |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
Following today's security release, I've upgraded from Django 3.1 to 3.1.1 with no other code changes, and suddenly I'm getting this error when loading a model instance from the DB:
list(MyModel.objects.all()) Traceback (most recent call last): File "<console>", line 1, in <module> File "/usr/local/lib/python3.8/site-packages/django/db/models/query.py", line 287, in __iter__ self._fetch_all() File "/usr/local/lib/python3.8/site-packages/django/db/models/query.py", line 1303, in _fetch_all self._result_cache = list(self._iterable_class(self)) File "/usr/local/lib/python3.8/site-packages/django/db/models/query.py", line 70, in __iter__ for row in compiler.results_iter(results): File "/usr/local/lib/python3.8/site-packages/django/db/models/sql/compiler.py", line 1100, in apply_converters value = converter(value, expression, connection) File "/usr/local/lib/python3.8/site-packages/django/db/models/fields/json.py", line 74, in from_db_value return json.loads(value, cls=self.decoder) File "/usr/local/lib/python3.8/json/__init__.py", line 341, in loads raise TypeError(f'the JSON object must be str, bytes or bytearray, ' TypeError: the JSON object must be str, bytes or bytearray, not list
From looking at the recent changes between 3.0 and 3.1, I'm pretty sure that the problem was introduced in this commit: https://github.com/django/django/commit/0be51d2226fce030ac9ca840535a524f41e9832c
Specifically, in django/db/models/fields/json.py, the removal of this from from_db_value
:
if connection.features.has_native_json_field and self.decoder is None: return value
I'm pretty sure that this means that if the value has already been decoded and the DB has got native JSON support, then it continues through this function, attempting to decode the already-decoded data, and therefore gets a TypeError
. But note that the exception catching around json.loads
only catches json.JSONDecodeError
, not TypeError
, and hence it dies.
I will see if I can write a test to prove this (or run the existing tests with postgres, which has native JSON support). But I'm reporting this now, as it seems like a fairly serious breaking change in 3.1.1. And maybe someone else who understands the changes in the mentioned commit can fix this faster than I can.
Change History (24)
comment:1 by , 4 years ago
comment:2 by , 4 years ago
I will see if I can write a test to prove this (or run the existing tests with postgres, which has native JSON support)
The existing tests, which cover also lists in JSONField
s, work on PostgreSQL.
comment:3 by , 4 years ago
Component: | Uncategorized → Database layer (models, ORM) |
---|---|
Resolution: | → needsinfo |
Status: | new → closed |
Type: | Uncategorized → Bug |
follow-up: 5 comment:4 by , 4 years ago
Thanks for your swift reply, @felixxm. You're right, this is caused by the database having the fields as type json
rather than jsonb
, even though my DB schema has only ever been altered via Django's migrations.
So (contrary to my original report) there is not a bug in value_from_db
. But…
The question now is whether there's a bug in Django's migration logic.
The fields which were getting this error were originally using a custom JSONField
, which was a subclass of models.TextField
, and so presumably their original DB column type was text
. I then changed them to use the native models.JSONField
, and got Django to automatically create a migration (which contains a migrations.AlterField
operation for each field, converting them to the native models.JSONField
).
The weird thing is that that migration converted the DB columns to type json
rather than jsonb
.
I can fix that in my own DB, no problem. But the remaining question is this: is that behaviour unexpected and should we fix it?
It seems unexpected to me that Django is happily creating the migration and applying it, but that the DB then ends up in an unusable state. Presumably Django should either say "I can't do that change, write your own migration" or it should apply the correct change.
Happy to make a new ticket if the decision is that this is a valid bug.
comment:5 by , 4 years ago
The fields which were getting this error were originally using a custom
JSONField
, which was a subclass ofmodels.TextField
, and so presumably their original DB column type wastext
. I then changed them to use the nativemodels.JSONField
, and got Django to automatically create a migration (which contains amigrations.AlterField
operation for each field, converting them to the nativemodels.JSONField
).
The weird thing is that that migration converted the DB columns to type
json
rather thanjsonb
.
Django have never used the json
datatype, so it's not possible that it used json
type in migrations. Can you provide more detail about this custom JSONField
? Was it your own implementation or a field provided by 3rd-party package?
I can fix that in my own DB, no problem. But the remaining question is this: is that behaviour unexpected and should we fix it?
I cannot reproduce it (custom JSONField(TextField)
-> models.JSONField
works for me). Can you provide a small sample project that reproduces this issue?
comment:6 by , 4 years ago
I’m getting exactly the same problem: a TypeError
on my JSONFields. I was upgrading directly from Django 3.0.10
to 3.1.1
and none of the JSONFields load properly. I downgraded to 3.1.0
and they do seem to work properly again.
The problem seems to be present when the data in the JSONField is a bool
, list
or dict
.
I’m not using any custom JSONField stuff, only the built-ins. It does appear that because only the json.JSONDecodeError
is being caught as an exception, its missing the case where—because of the DBs built-in JSON support—the data has already been converted.
comment:7 by , 4 years ago
I’m not using any custom JSONField stuff, only the built-ins. It does appear that because only the json.JSONDecodeError is being caught as an exception, its missing the case where—because of the DBs built-in JSON support—the data has already been converted.
The data shouldn't be converted because Django uses a custom stub converter. Please provide a sample project that reproduces this issue.
comment:8 by , 4 years ago
The example code that shows broken functionality is here: https://code.djangoproject.com/ticket/31956#comment:8
I think that JSONField should return the dict either way and it was working fine in 3.1.
Previously fixed ordering bug was related to "with a custom decoder", but in code example I'm not specifying different decoder.
I think that this will be "blocker" for users to be able to upgrade to next versions if they depend on JSONField being converted to dict automatically.
comment:10 by , 4 years ago
I too am suffering from this issue.
I don't use any custom stuff on my JSONFields and also don't manipulate it in any weird way, I just save/load lists/dicts from them.
They were all from postgresql's contrib package before and I changed them to models.JSONField when I migrated to 3.1. Upgrading to 3.1.1 gives me the exact error described here even when just opening the django's admin interface for the model. Took a long time for me to figure it out and the error went away when I downgrade django to 3.1.
comment:11 by , 4 years ago
Cc: | added |
---|
follow-up: 13 comment:12 by , 4 years ago
Thiago, using JSONField
's in admin works for me. Can you provide a sample project that reproduces this issue?
comment:13 by , 4 years ago
Replying to felixxm:
Thiago, using
JSONField
's in admin works for me. Can you provide a sample project that reproduces this issue?
OMG I cannot reproduce the issue anymore. I really had this happening at the day after the 3.1.1's release where I spent some time trying to understand what was wrong, and after downgrading to 3.1.0 it worked fine again.
Would like to say that "well, it is working now so it is fine", but the fact that I'm sure that something did indeed happen that day and other people here in this ticket says that something also happened makes me unsure..
comment:14 by , 4 years ago
I've now dug further into what happened in my case.
It turns out that I actually wrote a custom migration when I moved from the custom JSONField
(which subclassed models.TextField
) to the built-in models.JSONField
. I think I must have underestimated how clever Django would be in doing that conversion, and so I wrote a custom migrations.RunSQL
, assuming that Django wouldn't do the right conversion for me. My custom migration did something like this:
ALTER TABLE my_table ALTER COLUMN my_field TYPE JSON USING my_field::JSON;
I'd totally forgotten about this (and failed to see it when I looked through my migrations before I raised this ticket).
So in my case, I'm pretty sure that the problem was my own fault for doing my_field::JSON
instead of my_field::JSONB
. (And for assuming that Django wouldn't be clever enough to do this for me!) Sorry for the false alarm on this ticket, and thank you @felixxm for your help.
I hope that this info helps the other people who've got this error to get to the bottom of what's going on in their case. But whatever it is, my guess is that it's not a bug in Django!
comment:16 by , 4 years ago
Resolution: | needsinfo → invalid |
---|
comment:17 by , 4 years ago
This is a general comment for other people who many run into this problem—I diagnosed my problem with a lot of dependency checks.
My problem was specifically related to django-db-geventpool
—at the time it wasn’t compatible with the new JSONField.
https://github.com/jneight/django-db-geventpool
Just today ran into this exact same problem again because a reversion in the most recent version of django-db-geventpool
, v3.2.4
which is not compatible with Django/3.1—I’ll have to wait to upgrade this dependency until after their next release.
comment:18 by , 4 years ago
I don't think this issue is invalid. I have the exact same reproducible problem on a new django project with 3.1.0 vs 3.1.1.
To fix it in the raw sql, you can append ::json
to field you are querying!
3.1.1 really did introduce issues with jsonb
fields for no good reason that I can see, since simply lying about the type works just fine!
follow-up: 20 comment:19 by , 4 years ago
Cc: | added |
---|
I'm also experiencing this error on Django >= 3.1.1 when using jsonb_array_length
on a JSONField
that contains a list:
class MyModel(models.Model): field = JSONField(default=list) >>> MyModel.objects.annotate(field_len=Func(F('field'), function='jsonb_array_length')) Traceback (most recent call last): File "<console>", line 1, in <module> File "/home/ubuntu/text-annotation/lib/python3.8/site-packages/django/db/models/query.py", line 263, in __repr__ data = list(self[:REPR_OUTPUT_SIZE + 1]) File "/home/ubuntu/text-annotation/lib/python3.8/site-packages/django/db/models/query.py", line 287, in __iter__ self._fetch_all() File "/home/ubuntu/text-annotation/lib/python3.8/site-packages/django/db/models/query.py", line 1308, in _fetch_all self._result_cache = list(self._iterable_class(self)) File "/home/ubuntu/text-annotation/lib/python3.8/site-packages/django/db/models/query.py", line 70, in __iter__ for row in compiler.results_iter(results): File "/home/ubuntu/text-annotation/lib/python3.8/site-packages/django/db/models/sql/compiler.py", line 1100, in apply_converters value = converter(value, expression, connection) File "/home/ubuntu/text-annotation/lib/python3.8/site-packages/django/db/models/fields/json.py", line 83, in from_db_value return json.loads(value, cls=self.decoder) File "/home/ubuntu/.pyenv/versions/3.8.6/lib/python3.8/json/__init__.py", line 341, in loads raise TypeError(f'the JSON object must be str, bytes or bytearray, ' TypeError: the JSON object must be str, bytes or bytearray, not int
Things that I checked after finding this ticket:
- the
JSONField
was "converted" from theJSONField
of the old Postgres contrib package (afaik no actual migration happened) - in my table, the corresponding field was correctly set to
jsonb
- downgrading to Django 3.1 makes it work
I've resorted to using raw SQL for this query, as suggested by @Andrew.
follow-up: 21 comment:20 by , 4 years ago
Replying to giacomo:
I'm also experiencing this error on Django >= 3.1.1 when using
jsonb_array_length
on aJSONField
that contains a list:
class MyModel(models.Model): field = JSONField(default=list) >>> MyModel.objects.annotate(field_len=Func(F('field'), function='jsonb_array_length'))
Do you really want to get JSON
as a result of JSONB_ARRAY_LENGTH()
? I think you missed output_field
, e.g.
MyModel.objects.annotate(field_len=Func(F('field'), function='jsonb_array_length', output_field=IntegerField()))
works fine.
comment:21 by , 4 years ago
Replying to Mariusz Felisiak:
Do you really want to get
JSON
as a result ofJSONB_ARRAY_LENGTH()
? I think you missedoutput_field
, e.g.
MyModel.objects.annotate(field_len=Func(F('field'), function='jsonb_array_length', output_field=IntegerField()))works fine.
Thanks, I missed that part of the Func
documentation and I was also confused because it stopped working in a minor version. The documentation could be improved, though: output_field
is not included in the class attributes and is only mentioned at the end without an example...
comment:22 by , 3 years ago
I'm not sure of the best spot to put this information. I upgraded our code base from Django 2.2.x to 3.2.x . In doing that upgrade I got a similar error seen above:
Traceback (most recent call last): File "/usr/local/lib/python3.8/site-packages/celery/app/trace.py", line 412, in trace_task R = retval = fun(*args, **kwargs) File "/usr/local/lib/python3.8/site-packages/celery/app/trace.py", line 704, in __protected_call__ return self.run(*args, **kwargs) File "/usr/local/lib/python3.8/site-packages/celery/app/autoretry.py", line 50, in run raise task.retry(exc=exc, **retry_kwargs) File "/usr/local/lib/python3.8/site-packages/celery/app/task.py", line 706, in retry raise_with_context(exc) File "/usr/local/lib/python3.8/site-packages/celery/app/autoretry.py", line 35, in run return task._orig_run(*args, **kwargs) OUR CODE for preference in get_queryset_page( File "/usr/local/lib/python3.8/site-packages/django/db/models/query.py", line 280, in __iter__ self._fetch_all() File "/usr/local/lib/python3.8/site-packages/django/db/models/query.py", line 1324, in _fetch_all self._result_cache = list(self._iterable_class(self)) File "/usr/local/lib/python3.8/site-packages/django/db/models/query.py", line 68, in __iter__ for row in compiler.results_iter(results): File "/usr/local/lib/python3.8/site-packages/django/db/models/sql/compiler.py", line 1122, in apply_converters value = converter(value, expression, connection) File "/usr/local/lib/python3.8/site-packages/django/db/models/fields/json.py", line 83, in from_db_value return json.loads(value, cls=self.decoder) File "/usr/local/lib/python3.8/json/__init__.py", line 341, in loads raise TypeError(f'the JSON object must be str, bytes or bytearray, ' TypeError: the JSON object must be str, bytes or bytearray, not dict
Our database uses the json
datatype in Postgres rather than jsonb
. The database is not managed by the Django application, so we cannot change the datatype.
The odd part here is that it is a dictionary coming out of the postgres backend. I think whenever this change went in https://github.com/django/django/commit/0be51d2226fce030ac9ca840535a524f41e9832c it changed how existing json
types were working.
The solution I'm going with right now is to use the current JSONField
implementation and do check before the json.loads
call to see if value
is a dictionary type.
class JSONFieldJSONType(JSONField): """ Custom JSON field because our postgres uses json type and not jsonb. Details on these changes within Django can be seen here: * https://code.djangoproject.com/ticket/31973 * https://code.djangoproject.com/ticket/31956#comment:8 PR that changed behavior for regular json type: https://github.com/django/django/commit/0be51d2226fce030ac9ca840535a524f41e9832c """ def from_db_value(self, value, expression, connection): if value is None: return value # Some backends (SQLite at least) extract non-string values in their # SQL datatypes. if isinstance(expression, KeyTransform) and not isinstance(value, str): return value try: # Custom implementation for how our data comes out of our postgres # connection. if isinstance(value, dict): data_value = self.get_prep_value(value) else: data_value = value return json.loads(data_value, cls=self.decoder) except json.JSONDecodeError: return value
comment:24 by , 3 years ago
FWIW, I have also encountered this error when upgrading a site from Django 2.2 to Django 3.2. Perhaps the PostgreSQL json
column type was never intentionally supported, but it appears to have worked until 3.1.1.
+1 Garry for the workaround.
Can you provide more details? Are you using a custom decoder/encoder? The only reason that I can imagine is that you use
json
instead ofjsonb
datatype in your database.