Opened 4 years ago

Closed 4 years ago

Last modified 4 years ago

#32135 closed Uncategorized (invalid)

JSONField db_type `jsonb`==>`json` for sequential data no longer works

Reported by: Michael Anuzis | Owned by: nobody
Component: Database layer (models, ORM) | Version: 3.1
Severity: Normal | Keywords:
Cc: | Triage Stage: Unreviewed
Has patch: no | Needs documentation: no
Needs tests: no | Patch needs improvement: no
Easy pickings: no | UI/UX: no

Description

Issue: django.contrib.postgres.fields.JSONField in Django 3.0.6 allows overriding db_type from jsonb to json, which preserves the order of items in nested OrderedDicts (ref: https://stackoverflow.com/a/57941588/1377763). Django 3.1.2 appears to force the use of jsonb (ref: https://docs.djangoproject.com/en/3.1/ref/contrib/postgres/fields/#jsonfield, "JSONField uses jsonb"), which loses the order of items in the JSON. It apparently provides no alternative for retaining the order and does not support the previous workaround.

Steps to reproduce:
1) Create a model using a JSONField and attempt to override jsonb to json using the RawJSONField example in the stackoverflow link above.
2) Initialize an instance of the model with data in the JSONField involving OrderedDicts (example below you can copy/paste)
3) Attempt to retrieve the model, read, and/or update the field. For example, via MyJSONModel.objects.get(user=user)

Error:

  • TypeError: the JSON object must be str, bytes or bytearray, not dict
  • Exception location: /usr/lib/python3.8/json/__init__.py, line 341, in loads

Since the RawJSONField resulting from the above workaround appears to require a string (akin to storing JSON in a TextField), I attempted to resolve this by manually converting the field's data via json.loads() and json.dumps() at each interaction with the field. This resulted in continued errors of the same type and did not appear to have a clear fix. I've abandoned the 3.0==>3.1 upgrade for now, because the ability to retain the sequence of data in a JSONField is critical, the 3.1 documentation doesn't describe how to force jsonb ==> json, and the workarounds suggested on stackoverflow do not transition smoothly between versions.
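The TypeError reported above is the message json.loads() raises when it is handed a value that is already a Python object rather than JSON text. The following is a minimal standard-library sketch, assuming for illustration that the database driver has already decoded the json column into a dict before a second decoding step runs; the sample data and variable names are hypothetical.

```python
import json

# Assumption for illustration: the driver has already turned the stored JSON
# into a Python dict before the field's own decoding step runs.
already_decoded = {"BODY1_SLUG": {"name": "BODY1_NAME", "n_videos": 19}}

try:
    json.loads(already_decoded)  # attempt to decode a second time
except TypeError as exc:
    # json.loads() only accepts str, bytes or bytearray input.
    error_message = str(exc)
    print(error_message)
```

Under that assumption, calling json.loads() by hand on values read from the field repeats the same double decoding, which would be consistent with the errors persisting.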

Example JSON data structure for reproducing bug:
from collections import OrderedDict

progress = OrderedDict([
    ('BODY1_SLUG', {'name': 'BODY1_NAME', 'n_videos': 19, 'kbranches': OrderedDict([
        ('BRANCH1_SLUG', {'name': 'BRANCH1_NAME', 'n_videos': 7, 'vids_duration_mins': 20, 'completed': 0, 'concepts': OrderedDict([
            ('CONCEPT1_SLUG', {'name': 'CONCEPT1_NAME', 'videos': [{'name': 'A', 'url': 'UA', 'duration_in_secs': 120, 'viewed': 1}, {...}]}),
            ('CONCEPT2_SLUG', {'name': 'CONCEPT2_NAME', 'videos': [{'name': 'B', 'url': 'UB', 'duration_in_secs': 120, 'viewed': 0}, {...}]}),
        ])})
    ])})
]) # Initially provided with indentation, but the bug report parsing converted to block quotes.

Change History (5)

comment:1 by Michael Anuzis, 4 years ago

Summary: Django3.1.2 JSONField bug breaks backward compatibility of db_type `jsonb`==>`json` preventing upgrade from Django3.0.6 → 3.0.6==>3.1.2 JSONField db_type `jsonb`==>`json` for storing sequential data breaks backwards compatibility

comment:2 by Michael Anuzis, 4 years ago

Update: I see this may be a duplicate of (or at least closely related to) https://code.djangoproject.com/ticket/31973 and https://code.djangoproject.com/ticket/32111

In these existing tickets, the bugs are closed with the assertion "It looks that you use json instead of jsonb datatype in your database, which is not supported."

Is there a reason json was permitted via workaround in 3.0 and appears forcefully disabled in 3.1? Is there another option suggested for Django applications that need to preserve the sequence of similar data structures, and where a format like JSON appears to be the cleanest & simplest option for doing so?

Context: I'm not an expert on json or jsonb and didn't know the difference existed until I discovered a bug where jsonb loses the sequence of data deliberately stored via OrderedDict. I'm not the only person with this need looking for a solution online, and the stackoverflow workaround mentioned earlier appeared to solve the issue cleanly.

In my case, I don't need to query or filter against any subfields within the JSON via Django's ORM. One potential workaround would be to use a TextField and manually process every interaction with json.loads() and json.dumps(): the workaround is only needed in one place in the app, and while preserving sequence is critically important, it's a low-frequency interaction and performance would be fine. It'd be nice if Django could support preserving the sequence of objects stored via JSON, but if version 3.1 and later strictly forbid a workaround that worked up to 3.0, it would help to know so I can put in the work now, switch this instance to a TextField, and move on to 3.1. I'd appreciate any guidance you can provide.
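The TextField approach described above can be sketched with the standard library alone. This is only an illustration of the serialization step, not a Django model: the helper names dump_progress and load_progress and the sample data are hypothetical, and the text returned by dump_progress stands in for whatever would be stored in the TextField.

```python
import json
from collections import OrderedDict


def dump_progress(data):
    """Serialize to the JSON text that would be stored in the text column."""
    # json.dumps() writes keys in the mapping's own iteration order
    # (sort_keys defaults to False).
    return json.dumps(data)


def load_progress(text):
    """Parse stored JSON text, rebuilding every JSON object as an OrderedDict."""
    return json.loads(text, object_pairs_hook=OrderedDict)


# Keys are deliberately not in alphabetical order.
progress = OrderedDict([
    ("zeta", {"name": "Z"}),
    ("alpha", OrderedDict([("second", 2), ("first", 1)])),
])

stored = dump_progress(progress)
restored = load_progress(stored)

print(list(restored))
print(list(restored["alpha"]))
```

Because the value is stored as plain text, the database never reorders the keys; the trade-off, as noted above, is that the ORM cannot query inside the JSON.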

Version 5, edited 4 years ago by Michael Anuzis

comment:3 by Michael Anuzis, 4 years ago

Summary: 3.0.6==>3.1.2 JSONField db_type `jsonb`==>`json` for storing sequential data breaks backwards compatibility → 3.0.6==>3.1.2 JSONField db_type `jsonb`==>`json` for sequential data no longer works

comment:4 by Mariusz Felisiak, 4 years ago

Component: Uncategorized → Database layer (models, ORM)
Resolution: → invalid
Status: new → closed
Summary: 3.0.6==>3.1.2 JSONField db_type `jsonb`==>`json` for sequential data no longer works → JSONField db_type `jsonb`==>`json` for sequential data no longer works

"Is there a reason json was permitted via workaround in 3.0 and forcefully disabled in 3.1?"

Using the json data type was never supported. django.contrib.postgres.fields.JSONField in Django < 3.1 also uses jsonb (see related #30476).

comment:5 by Sage Abdullah, 4 years ago

I'd like to note that, if you still want to use json, you can register a handler for the connection_created signal. The handler should register a stub loader for json by calling psycopg2.extras.register_default_json(conn_or_curs=connection, loads=lambda x: x). See the registration for JSONB here. Otherwise, you would have to override from_db_value so that it just returns the value, because it has already been decoded by psycopg2.
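The second option, overriding from_db_value, can be illustrated without Django installed. The sketch below is a hypothetical mixin (the name PassThroughJSONMixin is invented here, not part of Django or psycopg2); it only assumes the from_db_value(self, value, expression, connection) signature that Django model fields use, and that the driver has already decoded the column.

```python
from collections import OrderedDict


class PassThroughJSONMixin:
    """Hypothetical mixin: skip the field's own JSON decoding step.

    Assumes the database driver has already decoded the json column, so the
    value is handed back unchanged instead of being passed to json.loads().
    """

    def from_db_value(self, value, expression, connection):
        # Return exactly what the driver produced; no second decode.
        return value


# Stand-alone demonstration with a value shaped like a decoded json column.
driver_value = OrderedDict([("b", 1), ("a", 2)])
result = PassThroughJSONMixin().from_db_value(driver_value, None, None)
print(result is driver_value)
```

In a project, such a mixin would presumably be combined with a JSONField subclass whose db_type() returns 'json', as in the RawJSONField workaround referenced in the description; that combination is an assumption and is not exercised by this sketch.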
