Opened 8 years ago

Closed 8 years ago

#3314 closed defect (fixed)

[patch] smart_unicode throws UnicodeDecodeError when got a instance with utf-8 encoded string

Reported by: nesh <nesh [at] studioquattro [dot] co [dot] yu>
Owned by: adrian
Component: Forms
Version:
Severity: normal
Keywords:
Cc: nesh@…
Triage Stage: Accepted
Has patch: yes
Needs documentation: no
Needs tests: yes
Patch needs improvement: yes
Easy pickings:
UI/UX:

Description

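When smart_unicode gets an instance instead of a string, it uses unicode(str(s)) to convert it to a string. With no encoding argument, unicode() decodes using Python's default encoding (ASCII unless reconfigured), so if the instance's str() returns a UTF-8 encoded string containing non-ASCII characters, you will get a UnicodeDecodeError.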
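
The failure can be reproduced without Django. The sketch below is a Python 3 analogue, not the original Python 2 code: `bytes` stands in for the Python 2 byte string returned by `str(s)`, and `.decode("ascii")` stands in for the implicit ASCII decoding that `unicode(...)` performs when no encoding is given. The class name `Label` is hypothetical.

```python
class Label:
    """Hypothetical object whose byte representation is UTF-8 encoded."""

    def __init__(self, text):
        self.text = text

    def __bytes__(self):
        # Plays the role of Python 2's __str__, which returned a byte string.
        return self.text.encode("utf-8")


raw = bytes(Label("caf\u00e9"))

# Decoding as ASCII, like unicode(str(s)) on Python 2, fails on any
# non-ASCII byte.
try:
    raw.decode("ascii")
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# Decoding with the charset the bytes were actually written in succeeds.
decoded = raw.decode("utf-8")
```

Passing the charset explicitly is the step that both proposals in the comments below have in common.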

Attachments (3)

util.diff (591 bytes) - added by nesh <nesh [at] studioquattro [dot] co [dot] yu> 8 years ago.
fast fix
form_utils.diff (590 bytes) - added by nesh <nesh [at] studioquattro [dot] co [dot] yu> 8 years ago.
smart_unicode patch
forms_test.diff (893 bytes) - added by nesh <nesh [at] studioquattro [dot] co [dot] yu> 8 years ago.
test case

Change History (9)

Changed 8 years ago by nesh <nesh [at] studioquattro [dot] co [dot] yu>

fast fix

comment:1 Changed 8 years ago by nesh <nesh [at] studioquattro [dot] co [dot] yu>

  • Summary changed from smart_unicode throws UnicodeDecodeError when got a instance with utf-8 encoded string to [patch] smart_unicode throws UnicodeDecodeError when got a instance with utf-8 encoded string

comment:2 Changed 8 years ago by mir@…

  • Needs tests set
  • Patch needs improvement set
  • Triage Stage changed from Unreviewed to Accepted

Hi Nesh--thanks for spotting this! I don't understand one thing about your patch: why do you first try to decode from ASCII and only use DEFAULT_CHARSET when that fails? I'd simply do it like this:

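from django.conf import settings

def smart_unicode(s):
    if not isinstance(s, basestring):
        # Arbitrary object: take its byte-string form, then decode it.
        s = unicode(str(s), settings.DEFAULT_CHARSET)
    elif not isinstance(s, unicode):
        # Plain byte string: decode it with the project charset.
        s = unicode(s, settings.DEFAULT_CHARSET)
    return s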

comment:3 Changed 8 years ago by nesh <nesh [at] studioquattro [dot] co [dot] yu>

Yes, that was my first idea, but I'm not sure why str is used in the first place, so I wrapped this in a try...except block as a quick fix.

Also, if the str call is not essential, then we can simply use:

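from django.conf import settings

def smart_unicode(s):
    # Anything that is not already unicode goes through str() and is
    # decoded with the project charset.
    if not isinstance(s, unicode):
        s = unicode(str(s), settings.DEFAULT_CHARSET)
    return s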

Regarding the tests, I'll try to add some and send an updated patch (with tests) during the day.

Changed 8 years ago by nesh <nesh [at] studioquattro [dot] co [dot] yu>

smart_unicode patch

Changed 8 years ago by nesh <nesh [at] studioquattro [dot] co [dot] yu>

test case

comment:4 Changed 8 years ago by nesh <nesh [at] studioquattro [dot] co [dot] yu>

The patch is fixed. I also added a special case for objects that implement __unicode__, plus the test case.

Sorry for the double form_utils.diff attachment.
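
The attached patch is not reproduced on this page, so what follows is only a sketch of the behaviour described in this comment, written as a Python 3 analogue rather than the actual change. Everything named here is an assumption for illustration: `DEFAULT_CHARSET` stands in for `settings.DEFAULT_CHARSET`, `smart_text` is a hypothetical name, a custom `__str__` plays the role of Python 2's `__unicode__`, and `__bytes__` plays the role of Python 2's `__str__`.

```python
DEFAULT_CHARSET = "utf-8"  # assumption: stand-in for settings.DEFAULT_CHARSET


def smart_text(s, charset=DEFAULT_CHARSET):
    """Return s as text, decoding byte strings with an explicit charset."""
    if isinstance(s, str):
        # Already text, like a Python 2 unicode object.
        return s
    if isinstance(s, bytes):
        # Byte string: decode explicitly instead of relying on ASCII.
        return s.decode(charset)
    if type(s).__str__ is not object.__str__:
        # Special case: the object supplies its own text representation.
        return str(s)
    if hasattr(type(s), "__bytes__"):
        # Object that only offers bytes: decode them with the charset.
        return bytes(s).decode(charset)
    # Fallback for everything else, such as numbers.
    return str(s)


class WithBytes:
    """Hypothetical object that only exposes UTF-8 encoded bytes."""

    def __bytes__(self):
        return "caf\u00e9".encode("utf-8")


class WithText:
    """Hypothetical object that provides text directly."""

    def __str__(self):
        return "na\u00efve"


from_bytes = smart_text(WithBytes())
from_text = smart_text(WithText())
```

A test case along these lines would assert that both objects convert without raising UnicodeDecodeError and that non-ASCII characters survive the round trip.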

comment:5 Changed 8 years ago by mir@…

#3403 marked as duplicate

comment:6 Changed 8 years ago by adrian

  • Resolution set to fixed
  • Status changed from new to closed

(In [4522]) Fixed #3314 -- Fixed a bug in newforms smart_unicode. Thanks for the patch, nesh
