Opened 6 years ago

Closed 6 years ago

Last modified 6 years ago

#29147 closed Bug (invalid)

Postgres JSONField missing to_python

Reported by: Javier Buzzi Owned by: Williams Mendez
Component: contrib.postgres Version: 1.9
Severity: Normal Keywords:
Cc: Triage Stage: Unreviewed
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: yes UI/UX: no

Description (last modified by Javier Buzzi)

Every other field implements a to_python except for JSONField:

https://github.com/django/django/blob/master/django/contrib/postgres/fields/hstore.py#L38
https://github.com/django/django/blob/master/django/contrib/postgres/fields/array.py#L102
https://github.com/django/django/blob/master/django/contrib/postgres/fields/ranges.py#L47

JSONField defaults to the value, meaning if you pass '{"object": "value"}' you get back '{"object": "value"}'. The same example in HStoreField, you pass in '{"object": "value"}' you get back {"object": "value"} -- makes the world of difference.

PR. ​https://github.com/django/django/pull/9801

Change History (17)

comment:1 by Williams Mendez, 6 years ago

Owner: set to Williams Mendez
Status: newassigned

comment:2 by Tim Graham, 6 years ago

Resolution: invalid
Status: assignedclosed

I don't think that change is correct because a string is valid JSON -- to treat it differently would break backwards compatibility.

comment:3 by Williams Mendez, 6 years ago

At first I thought the same, but when I try to store a String into a postgres 'jsonb' field it crashes. Same when I try to fetch a string as a json:

will_test=# select 'hello'::json ;
ERROR:  invalid input syntax for type json
LINE 1: select 'hello'::json ;
               ^
DETAIL:  Token "hello" is invalid.
CONTEXT:  JSON data, line 1: hello


will_test=# select '{}'::json ;
 json
------
 {}
(1 row)

I'll do some research on that topic.

comment:4 by Williams Mendez, 6 years ago

Yeah, it's totally right:

will_test=# select '"hello"'::json ;
  json
---------
 "hello"
(1 row)

comment:5 by Javier Buzzi, 6 years ago

As i understood it was that to_python() is the representation of the data in python, postgres could save it and do with it what it wants; store it binary/urlencode it/ encrypt it for all it wants, but from a python perspective that data to me, needs to be represented as a dict/simple type (if it is "hello") -- i still don't understand why this is invalid.

$ echo '"hello"' | docker run --rm -i python:3.6 python -c 'import json, sys; print(json.load(sys.stdin))'
hello
$ echo '{"hello": "hi"}' | docker run --rm -i python:3.6 python -c 'import json, sys; print(json.load(sys.stdin))'
{'hello': 'hi'}
$ echo '{"hello": "hi"}' | docker run --rm -i python:2.7 python -c 'import json, sys; print(json.load(sys.stdin))'
{u'hello': u'hi'}
$ echo '"hello"' | docker run --rm -i python:2.7 python -c 'import json, sys; print(json.load(sys.stdin))'
hello
Last edited 6 years ago by Javier Buzzi (previous) (diff)

comment:6 by Tim Graham, 6 years ago

Can you be more clear about what the unexpected behavior is? (i.e. give a failing test for tests/postgres_tests/test_json.py)

comment:7 by Javier Buzzi, 6 years ago

I added this (copy and paste from hstore.py) and it doenst break anything

--- a/django/contrib/postgres/fields/jsonb.py
+++ b/django/contrib/postgres/fields/jsonb.py
@@ -41,6 +41,14 @@ class JSONField(CheckFieldDefaultMixin, Field):
         self.encoder = encoder
         super().__init__(verbose_name, name, **kwargs)
 
+    def to_python(self, value):
+        if isinstance(value, str):
+            value = json.loads(value)
+        return value
+
+    def value_to_string(self, obj):
+        return json.dumps(self.value_from_object(obj))
+
     def db_type(self, connection):
         return 'jsonb'

ran the tests:

$ /django/tests/runtests.py postgres_tests.test_json --failfast --settings=mysettings
Testing against Django installed in '/usr/local/lib/python3.6/site-packages/Django-2.1.dev20180313024248-py3.6.egg/django' with up to 4 processes
Creating test database for alias 'default'...
Cloning test database for alias 'default'...
Cloning test database for alias 'default'...
Cloning test database for alias 'default'...
Cloning test database for alias 'default'...
System check identified no issues (0 silenced).
.......................................................
----------------------------------------------------------------------
Ran 55 tests in 0.118s

OK
Destroying test database for alias 'default'...
Destroying test database for alias 'default'...
Destroying test database for alias 'default'...
Destroying test database for alias 'default'...
Destroying test database for alias 'default'...

Here are all the steps on how i got it to work:

$ docker run --rm -it python:3.6.4 bash
apt-get update
git clone https://github.com/django/django.git
cd django
sed -i "s/^exit 101$/exit 0/" /usr/sbin/policy-rc.d
apt-get install -y libmemcached-dev postgresql vim postgresql-contrib
python setup.py install
pip install -rtests/requirements/py3.txt -rtests/requirements/postgres.txt
echo "SECRET_KEY = 'hellooooo'
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': 'django',
        'USER': 'django',
        'PASSWORD': 'django',
        'HOST': 'localhost',
        'PORT': 5432,
    }
}" > tests/mysettings.py
echo "local   all         all                         trust" >> /etc/postgresql/9.4/main/pg_hba.conf 
echo "diff --git a/django/contrib/postgres/fields/jsonb.py b/django/contrib/postgres/fields/jsonb.py
index 3c27607..49b65ef 100644
--- a/django/contrib/postgres/fields/jsonb.py
+++ b/django/contrib/postgres/fields/jsonb.py
@@ -41,6 +41,14 @@ class JSONField(CheckFieldDefaultMixin, Field):
         self.encoder = encoder
         super().__init__(verbose_name, name, **kwargs)
 
+    def to_python(self, value):
+        if isinstance(value, str):
+            value = json.loads(value)
+        return value
+
+    def value_to_string(self, obj):
+        return json.dumps(self.value_from_object(obj))
+
     def db_type(self, connection):
         return 'jsonb'
 
" > mychanges.diff
patch django/contrib/postgres/fields/jsonb.py mychanges.diff
service postgresql start
echo "set the password to 'django'"
su postgres -c 'createuser --superuser django -P'
/django/tests/runtests.py postgres_tests.test_json --failfast --settings=mysettings

Last edited 6 years ago by Javier Buzzi (previous) (diff)

comment:8 by Tim Graham, 6 years ago

What behavior changes as a result of the patch?

comment:9 by Javier Buzzi, 6 years ago

@Tim, nothing that i can see.

We need this for Aloe, we have a string table to load data for our tests and we rely heavily on the field's ability to turn a raw/string value to a python value.

Example:

>> from django.db.models import BooleanField, CharField, NumberField
>> from django.contrib.postgres import fields
>> CharField.to_python(None, 'hello')
'hello'
>> IntegerField.to_python(None, '123')
123
>> BooleanField.to_python(None, 'True')
True
>> fields.HStoreField.to_python(None, '{"hello": "world"}')
{'hello': 'world'}

The only thing that breaks that pattern in the entire lib, is JSONField:

>> fields.JSONField.to_python(None, '{"hello": "world"}')
'{"hello": "world"}'
Last edited 6 years ago by Javier Buzzi (previous) (diff)

comment:10 by Tim Graham, 6 years ago

I believe your proposal is backwards incompatible. Assuming that a string should be parsed as a JSON dict rather than treating a string as a string could be undesired. Your patch results in a test failure of postgres_tests.test_json.TestSerialization.test_dumping.

comment:11 by Javier Buzzi, 6 years ago

Please explain, i don't see that on my side. (using the same setup from above)

# /django/tests/runtests.py postgres_tests.test_json.TestSerialization.test_dumping --failfast --settings=mysettings
Testing against Django installed in '/usr/local/lib/python3.6/site-packages/Django-2.1.dev20180320161010-py3.6.egg/django' with up to 4 processes
Creating test database for alias 'default'...
System check identified no issues (0 silenced).
.
----------------------------------------------------------------------
Ran 1 test in 0.004s

OK
Destroying test database for alias 'default'...

comment:12 by Tim Graham, 6 years ago

In your patch, I think you missed the JSONField.value_to_string() that's already defined. It overrides the one you added since it's declared below yours. The test failure happens if you remove the existing value_to_string() method in favor of yours. Anyway, if you provide a pull request that passes all tests and includes a test for the new behavior, I'll take a closer look to see if this is suitable.

comment:13 by Javier Buzzi, 6 years ago

Thanks Tim.

Here it is: https://github.com/django/django/pull/9801

Update: ..clearly i'm running these tests incorrectly.. i cannot seem to reproduce it locally.

Last edited 6 years ago by Javier Buzzi (previous) (diff)

comment:14 by Javier Buzzi, 6 years ago

@Tim, can you look at the PR ​https://github.com/django/django/pull/9801

Any comments, concerns, and/or changes, will be appreciated.

comment:15 by Javier Buzzi, 6 years ago

Description: modified (diff)
Has patch: set

comment:16 by Tim Graham <timograham@…>, 6 years ago

In 623139b:

Refs #29147 --- Added JSONField serialization tests.

comment:17 by Tim Graham, 6 years ago

The tests committed above demonstrate why this proposal is backwards incompatible. I think if your application makes the assumption you've articulated about your data, you'll have to add your own parsing.

Note: See TracTickets for help on using tickets.
Back to Top