Opened 3 months ago

Closed 3 months ago

Last modified 3 months ago

#35525 closed New feature (wontfix)

Django error with parsing JSONField

Reported by: Nirjal Paudel Owned by: nobody
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords: Models, fields, encoding, decoding
Cc: Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Hello, I have come across with error in django whilst parsing JSON fields in Django.
This happened in Django==5.2.
Database I used: Postgres 14

So I had a django JSONField table with column called logs which is a json field [ Not jsonb field ]. By default it the error followed as


>>> Synclogs.objects.last().logs
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/Users/nirjalpaudel/Programming/test/django_new/.venv/lib/python3.12/site-packages/django/db/models/manager.py", line 87, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/nirjalpaudel/Programming/test/django_new/.venv/lib/python3.12/site-packages/django/db/models/query.py", line 1110, in last
    for obj in queryset[:1]:
  File "/Users/nirjalpaudel/Programming/test/django_new/.venv/lib/python3.12/site-packages/django/db/models/query.py", line 400, in __iter__
    self._fetch_all()
  File "/Users/nirjalpaudel/Programming/test/django_new/.venv/lib/python3.12/site-packages/django/db/models/query.py", line 1928, in _fetch_all
    self._result_cache = list(self._iterable_class(self))
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/nirjalpaudel/Programming/test/django_new/.venv/lib/python3.12/site-packages/django/db/models/query.py", line 123, in __iter__
    for row in compiler.results_iter(results):
  File "/Users/nirjalpaudel/Programming/test/django_new/.venv/lib/python3.12/site-packages/django/db/models/sql/compiler.py", line 1500, in apply_converters
    value = converter(value, expression, connection)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/nirjalpaudel/Programming/test/django_new/.venv/lib/python3.12/site-packages/django/db/models/fields/json.py", line 94, in from_db_value
    return json.loads(value, cls=self.decoder)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/json/__init__.py", line 339, in loads
    raise TypeError(f'the JSON object must be str, bytes or bytearray, '
TypeError: the JSON object must be str, bytes or bytearray, not list

Now, I come to see in the manual and code of postgres, using some kind of custom encoder and decoder will have helped the code, I followed using this

    class SyncLogField(models.JSONField):
        description = "A custom made json field for parsing and storing logs"

        def __init__(self, *args, **kwargs):
            super().__init__(*args, **kwargs)

        def from_db_value(self, value, expression, connection):
            if value is None:
                return value
            ## Here I was skipping the value.
            return value

And then I tried psycopg2 to see how does it interact with the postgres library and turns out that for JSONField, I print the original type of query parsed by the psyopg2 and tried to emulate what JSONField in django would do. Here is the code I tried to emulate django behaviour

import psycopg2
import json

# Database connection parameters
DB_NAME = "testname"
DB_USER = "testuser"
DB_PASSWORD = "testpassword"
DB_HOST = "localhost"
DB_PORT = "5437"

# Connect to the PostgreSQL database
try:
    connection = psycopg2.connect(
        dbname=DB_NAME,
        user=DB_USER,
        password=DB_PASSWORD,
        host=DB_HOST,
        port=DB_PORT
    )
    cursor = connection.cursor()

    # Perform the query to get the last log entry in ascending order by id
    query = "SELECT logs FROM synclogs ORDER BY id ASC LIMIT 1"
    cursor.execute(query)

    # Fetch the result
    result = cursor.fetchone()

    if result:
        # Get the original log
        log = result[0]
        
        # Print the type of the original log
        print("Type of original log:", type(log))

        # Parse the JSON log if it's a string
        if isinstance(log, str):
            log_data = json.loads(log)
            print("Last log entry:", log_data)
        else:
            print("Log is not a JSON string:", log)
    else:
        print("No logs found in the synclogs table.")

except Exception as e:
    print("An error occurred while connecting to the database or querying:", str(e))
finally:
    if connection:
        cursor.close()
        connection.close()

}}]

The result I got is that the json content got from db by psycopg2 is not string but is a list, so I had to skip the value entirely in my custom serializer. 

Instead of doing this, why don't we check if the content/value parsed from db is list or dict, by default python types for json
If yes - Don't do anything
If no and is a string instance - then perform json.loads() to it

I think that if we do this changes in database, we will have a JSONField that will work for both JSON and JSONB column in postgres

Change History (2)

comment:1 by Simon Charette, 3 months ago

Resolution: wontfix
Status: newclosed

So I had a django JSONField table with column called logs which is a json field [ Not jsonb field ].

Django doesn't support pointing fields designed to work with certain database data types to others, JSONField is not exception.

Instead of doing this, why don't we check if the content/value parsed from db is list or dict, by default python types for json
If yes - Don't do anything
If no and is a string instance - then perform json.loads() to it

If you need this feature simply subclass JSONField and override it's from_db_value method. Note that if you use this approach you won't be able to implement support for the decoder option as this requires the decoding to be performed by the field itself and not from psycopg.

in reply to:  1 comment:2 by Nirjal Paudel, 3 months ago

Hi, thank-you for replying. Can you explain to me the from_db_value part, as I think the default decoder option I mean just making an exception for JSONField with json type and psycopg2. How about the two code here. Sorry if this is stupid to ask

class DjangoJSONEncoder(json.JSONEncoder):
    """
    JSONEncoder subclass that knows how to encode date/time, decimal types, and
    UUIDs.
    """

    def default(self, o):
        # See "Date Time String Format" in the ECMA-262 specification.
        if isinstance(o, datetime.datetime):
            r = o.isoformat()
            if o.microsecond:
                r = r[:23] + r[26:]
            if r.endswith("+00:00"):
                r = r.removesuffix("+00:00") + "Z"
            return r
        elif isinstance(o, datetime.date):
            return o.isoformat()
        elif isinstance(o, datetime.time):
            if is_aware(o):
                raise ValueError("JSON can't represent timezone-aware times.")
            r = o.isoformat()
            if o.microsecond:
                r = r[:12]
            return r
        elif isinstance(o, datetime.timedelta):
            return duration_iso_string(o)
        elif isinstance(o, (decimal.Decimal, uuid.UUID, Promise)):
            return str(o)
        # how about adding 2 conditions here 
       # one for list and dict to put it here
        elif isinstance(o, list):
            return o
        elif isinstance(o, dict):
            return o
        else:
            return super().default(o)

Replying to Simon Charette:

So I had a django JSONField table with column called logs which is a json field [ Not jsonb field ].

Django doesn't support pointing fields designed to work with certain database data types to others, JSONField is not exception.

Instead of doing this, why don't we check if the content/value parsed from db is list or dict, by default python types for json
If yes - Don't do anything
If no and is a string instance - then perform json.loads() to it

If you need this feature simply subclass JSONField and override it's from_db_value method. Note that if you use this approach you won't be able to implement support for the decoder option as this requires the decoding to be performed by the field itself and not from psycopg.

Note: See TracTickets for help on using tickets.
Back to Top