Opened 15 years ago

Closed 15 years ago

Last modified 15 years ago

#11742 closed (invalid)

simplejson loads not always unicode

Reported by: kwek Owned by: nobody
Component: Uncategorized Version: 1.1
Severity: Keywords:
Cc: mjbroek@… Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

In our project we import simplejson from Django, since it provides a convenient wrapper that uses either the json module built into Python or a separately installed simplejson. After upgrading to Django 1.1, I noticed the following difference.

Here is the output from Python without simplejson installed. Note that loads() always returns unicode.

{{{
>>> from django.utils import simplejson
>>> simplejson.loads('"test"', encoding='utf-8')
u'test'
>>> simplejson.loads('"test"')
u'test'
}}}

Now here is the output with simplejson installed. Note that loads() returns a plain byte string and not unicode, as some would expect.
{{{
>>> from django.utils import simplejson
>>> simplejson.loads('"test"')
'test'
>>> simplejson.loads('"test"', encoding='utf-8')
'test'
}}}

And here is the output of the Python 2.6 json module (always unicode too):
{{{
>>> import json
>>> json.loads('"tuut"')
u'tuut'
}}}

So is this by design, or did this inconsistency sneak in with the new release? In the meantime we will just import json from Python directly (2.6), but I thought I'd mention it anyway.

Change History (3)

comment:1 by Karen Tracey, 15 years ago

Resolution: invalid
Status: new → closed

It didn't exactly sneak in: a deliberate decision was made as to which simplejson implementation to use; see r9707.

The fact that simplejson is inconsistent here is a simplejson issue, see: http://code.google.com/p/simplejson/issues/detail?id=40

The behavior you are describing is due to an optimization, and the response to people reporting it appears to be that if you consistently want unicode back you should consistently feed unicode in.
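
The advice above (feed text in to get text out) can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses the standard-library `json` module as a stand-in for `django.utils.simplejson`, and `loads_text` is a hypothetical helper name, not something Django or simplejson provides.

```python
import json  # stdlib module, standing in here for django.utils.simplejson


def loads_text(data, encoding="utf-8", **kwargs):
    """Decode byte input to text, then parse it as JSON.

    loads_text is a hypothetical helper, not part of Django or simplejson.
    Extra keyword arguments (e.g. object_hook) are passed through to loads().
    """
    if isinstance(data, bytes):  # on Python 2, bytes is an alias for str
        data = data.decode(encoding)
    return json.loads(data, **kwargs)


result = loads_text(b'"test"')
print(repr(result))
```

Because the input is always decoded before parsing, the parser only ever sees text, so the byte-string shortcut described in this ticket should not come into play.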

comment:2 by kwek, 15 years ago

I now cast the input for loads() to unicode() and all is good (even with simplejson). Thanks!

comment:3 by kwek, 15 years ago

Actually, this is what did the trick for me:

data = simplejson.loads(data.decode('utf-8'), object_hook=json_object_hook)