Opened 7 years ago

Closed 7 years ago

Last modified 3 years ago

#7990 closed (fixed)

serializers should use StringIO and not cStringIO

Reported by: anonymous
Owned by: nobody
Component: Core (Other)
Version: master
Severity:
Keywords: cstringio serializers unicode
Cc:
Triage Stage: Accepted
Has patch: no
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings:
UI/UX:

Description

There is a slight difference between cStringIO and StringIO that makes cStringIO unusable for unicode data:

>>> import StringIO
>>> import cStringIO
>>> s1 = StringIO.StringIO(u'unicode text')
>>> s2 = cStringIO.StringIO(u'unicode text')
>>> s1.read()
u'unicode text'
>>> s2.read()
'u\x00\x00\x00n\x00\x00\x00i\x00\x00\x00c\x00\x00\x00o\x00\x00\x00d\x00\x00\x00e\x00\x00\x00 \x00\x00\x00t\x00\x00\x00e\x00\x00\x00x\x00\x00\x00t\x00\x00\x00'

This makes serializers such as json or yaml unusable for unicode data.
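
The byte pattern in the s2.read() output above is consistent with cStringIO exposing the in-memory buffer of the unicode object instead of its characters. A minimal sketch in Python 3 (where neither module exists), assuming the reporter ran a wide (UCS-4), little-endian Python 2 build, reproduces the same bytes by encoding the text as UTF-32-LE:

```python
# Sketch only: Python 3 has no cStringIO, so this mimics the observed
# output rather than calling the module. Assumption: the reporter's
# interpreter stored unicode as 4-byte little-endian code units.
text = u'unicode text'

# Each character becomes four bytes, low byte first.
raw = text.encode('utf-32-le')

print(len(text), len(raw))
print(raw[:8])

# Decoding with the same codec recovers the original text.
assert raw.decode('utf-32-le') == text
```

On a narrow (UCS-2) build the same effect would presumably show two bytes per character instead of four.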

Change History (4)

comment:1 Changed 7 years ago by Simon Greenhill

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Triage Stage changed from Unreviewed to Accepted

comment:2 Changed 7 years ago by mtredinnick

The particular problem reported here is only present in Python 2.3 and 2.4. More significantly, however, even in Python 2.5, cStringIO cannot handle non-ASCII unicode strings. This is a documented limitation of the module, so we should change it.
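
The Python 2.5 limitation mentioned above comes down to an implicit ASCII conversion of the unicode input. A hedged sketch in Python 3, using the ascii codec directly as a stand-in for that conversion (the helper name fits_in_ascii is hypothetical and not part of any library):

```python
def fits_in_ascii(text):
    """Return True if text survives an ASCII-only conversion."""
    try:
        text.encode('ascii')
    except UnicodeEncodeError:
        # At least one character falls outside the 7-bit ASCII range.
        return False
    return True

print(fits_in_ascii(u'unicode text'))   # plain ASCII content
print(fits_in_ascii(u'caf\xe9'))        # contains a non-ASCII character
```

Text that only ever contains ASCII would pass through unnoticed, which may be why the problem is easy to miss in testing.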

comment:3 Changed 7 years ago by russellm

  • Resolution set to fixed
  • Status changed from new to closed

(In [8151]) Fixed #7990 -- Modified serializers to use StringIO, rather than cStringIO, due to potential unicode issues.
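
A rough sketch of the idea behind the fix, written for Python 3, where io.StringIO plays the role of the pure-Python StringIO: a serializer that writes to a text-mode in-memory stream can carry non-ASCII characters through unchanged. serialize_to_string is a hypothetical helper for illustration only and is not the code from changeset [8151].

```python
import io
import json

def serialize_to_string(objects):
    """Hypothetical helper: dump objects as JSON into a text buffer."""
    stream = io.StringIO()  # text-mode buffer, accepts any unicode text
    # ensure_ascii=False writes non-ASCII characters as-is instead of
    # escaping them, so the stream itself has to cope with unicode.
    json.dump(objects, stream, ensure_ascii=False)
    return stream.getvalue()

result = serialize_to_string([{"name": u"Zo\xeb"}])
print(result)
```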

comment:4 Changed 3 years ago by jacob

  • milestone 1.0 beta deleted
