Opened 10 years ago

Closed 10 years ago

#3314 closed defect (fixed)

[patch] smart_unicode throws UnicodeDecodeError when got a instance with utf-8 encoded string

Reported by: nesh <nesh [at] studioquattro [dot] co [dot] yu>
Owned by: Adrian Holovaty
Component: Forms
Version:
Severity: normal
Keywords:
Cc: nesh@…
Triage Stage: Accepted
Has patch: yes
Needs documentation: no
Needs tests: yes
Patch needs improvement: yes
Easy pickings:
UI/UX:

Description

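When smart_unicode gets an instance instead of a string, it uses unicode(str(s)) to convert it to a string. If str(s) returns a UTF-8 encoded string containing non-ASCII characters, this raises a UnicodeDecodeError, because unicode() decodes with the default ASCII codec when no encoding is given.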
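
To illustrate the failure mode, here is a minimal sketch. It is a Python 3 analogue, not the newforms code: the implicit ASCII decoding done by Python 2's unicode(str(s)) is modelled with an explicit .decode("ascii"). The names Label, as_bytes, decode_implicit and decode_with_charset are hypothetical and exist only for this illustration.

```python
class Label(object):
    """Hypothetical object whose byte representation is UTF-8 encoded."""

    def as_bytes(self):
        # Stands in for a Python 2 __str__ that returns a UTF-8 byte string.
        return u"\u0161".encode("utf-8")


def decode_implicit(raw):
    # Models Python 2's unicode(byte_string): decoding with the ASCII default.
    return raw.decode("ascii")


def decode_with_charset(raw, charset="utf-8"):
    # Models unicode(byte_string, charset): decoding with an explicit charset.
    return raw.decode(charset)


raw = Label().as_bytes()

try:
    decode_implicit(raw)
    failed = False
except UnicodeDecodeError:
    # Non-ASCII bytes cannot be decoded with the ASCII codec.
    failed = True

decoded = decode_with_charset(raw)
```

With the explicit charset the same bytes decode cleanly, which is why passing a charset to unicode() avoids the error.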

Attachments (3)

util.diff (591 bytes) - added by nesh <nesh [at] studioquattro [dot] co [dot] yu> 10 years ago.
fast fix
form_utils.diff (590 bytes) - added by nesh <nesh [at] studioquattro [dot] co [dot] yu> 10 years ago.
smart_unicode patch
forms_test.diff (893 bytes) - added by nesh <nesh [at] studioquattro [dot] co [dot] yu> 10 years ago.
test case

Change History (9)

Changed 10 years ago by nesh <nesh [at] studioquattro [dot] co [dot] yu>

Attachment: util.diff added

fast fix

comment:1 Changed 10 years ago by nesh <nesh [at] studioquattro [dot] co [dot] yu>

Summary: smart_unicode throws UnicodeDecodeError when got a instance with utf-8 encoded string → [patch] smart_unicode throws UnicodeDecodeError when got a instance with utf-8 encoded string

comment:2 Changed 10 years ago by mir@…

Needs tests: set
Patch needs improvement: set
Triage Stage: Unreviewed → Accepted

Hi Nesh--thanks for spotting this! There is one thing I don't understand about your patch: why do you first try to decode from ASCII and only use the DEFAULT_CHARSET when that fails? I'd simply do it like this:

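from django.conf import settings

def smart_unicode(s):
    if not isinstance(s, basestring):
        # Not a string at all: take its str() form and decode it.
        s = unicode(str(s), settings.DEFAULT_CHARSET)
    elif not isinstance(s, unicode):
        # A byte string: decode it with the same charset.
        s = unicode(s, settings.DEFAULT_CHARSET)
    return s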

comment:3 Changed 10 years ago by nesh <nesh [at] studioquattro [dot] co [dot] yu>

Yes, that was my first idea, but I'm not sure why str is used in the first place, so I wrapped this in a try...except block as a quick fix.

Also, if the str call is not essential, then we can simply use:

def smart_unicode(s):
    if not isinstance(s, unicode):
        s = unicode(str(s), settings.DEFAULT_CHARSET)
    return s

Regarding the tests, I'll try to add some and send an updated patch (with tests) during the day.

Changed 10 years ago by nesh <nesh [at] studioquattro [dot] co [dot] yu>

Attachment: form_utils.diff added

smart_unicode patch

Changed 10 years ago by nesh <nesh [at] studioquattro [dot] co [dot] yu>

Attachment: forms_test.diff added

test case

comment:4 Changed 10 years ago by nesh <nesh [at] studioquattro [dot] co [dot] yu>

The patch is fixed. I also added a special case for objects that implement __unicode__, plus a test case.

Sorry for the duplicate form_utils.diff attachment.
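
The attached patch is not reproduced in this ticket, so the following is only a rough sketch of the idea described above, not the committed code. It is written for Python 3, where str plays the role of Python 2's unicode and bytes the role of Python 2's str. The names smart_text_sketch, DEFAULT_CHARSET, Named, Encoded and Plain are hypothetical, and using __bytes__ to model a Python 2 __str__ that returns encoded bytes is an assumption made for this illustration.

```python
DEFAULT_CHARSET = "utf-8"  # assumption: stand-in for settings.DEFAULT_CHARSET


def smart_text_sketch(s, charset=DEFAULT_CHARSET):
    """Return s as text, decoding byte strings with an explicit charset."""
    if isinstance(s, str):
        # Already text (Python 2: a unicode object); nothing to do.
        return s
    if isinstance(s, bytes):
        # Byte string (Python 2: str); decode with the configured charset.
        return s.decode(charset)
    if hasattr(s, "__unicode__"):
        # Special case: the object can produce text itself.
        return s.__unicode__()
    if hasattr(s, "__bytes__"):
        # Models a Python 2 str(s) call that returns encoded bytes.
        return bytes(s).decode(charset)
    return str(s)


class Named(object):
    """Hypothetical object that implements __unicode__."""

    def __unicode__(self):
        return u"\u0161 text"


class Encoded(object):
    """Hypothetical object whose byte form is UTF-8 encoded."""

    def __bytes__(self):
        return u"\u0161 bytes".encode("utf-8")


class Plain(object):
    """Hypothetical object with only a text representation."""

    def __str__(self):
        return "plain"


results = [smart_text_sketch(obj) for obj in (Named(), Encoded(), Plain())]
```

Checking for __unicode__ before falling back to the byte form means an object that can already produce text is never pushed through an encode and decode round trip.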

comment:5 Changed 10 years ago by mir@…

#3403 marked as duplicate

comment:6 Changed 10 years ago by Adrian Holovaty

Resolution: fixed
Status: new → closed

(In [4522]) Fixed #3314 -- Fixed a bug in newforms smart_unicode. Thanks for the patch, nesh
