Django

Ticket #3314 (closed: fixed)

Opened 2 years ago

Last modified 2 years ago

[patch] smart_unicode throws UnicodeDecodeError when given an instance with a utf-8 encoded string

Reported by: nesh <nesh [at] studioquattro [dot] co [dot] yu>
Assigned to: adrian
Milestone:
Component: Forms
Version:
Keywords:
Cc: nesh@studioquattro.co.yu
Triage Stage: Accepted
Has patch: 1
Needs documentation: 0
Needs tests: 1
Patch needs improvement: 1

Description

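When smart_unicode is given an instance instead of a string, it uses unicode(str(s)) to convert it. If str() on the instance returns a utf-8 encoded string containing non-ASCII characters, you get a UnicodeDecodeError, because unicode() without an encoding argument decodes with the default codec (ASCII unless it has been changed).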
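
To make the failure concrete, here is a minimal sketch. It is written for Python 3, not the Python 2 this ticket targets, so it is an analogy rather than the original code: `bytes.decode("ascii")` stands in for Python 2's `unicode(str(s))`, and the `Person` class, `naive_smart_unicode`, and `decode_failed` are hypothetical names invented for the illustration.

```python
# Python 3 analogy of the Python 2 failure described above (an assumption,
# not the Django code). In Python 2, unicode(str(obj)) decoded the byte
# string with the default codec, which is ASCII unless reconfigured.

class Person:
    """Hypothetical object whose byte representation is UTF-8 encoded."""

    def __init__(self, name):
        self.name = name

    def __bytes__(self):
        # Plays the role of a Python 2 __str__ that returns UTF-8 bytes.
        return self.name.encode("utf-8")


def naive_smart_unicode(obj):
    # Mirrors unicode(str(s)): decode as ASCII, with no charset argument.
    return bytes(obj).decode("ascii")


def decode_failed(obj):
    """Return True if the naive conversion raises UnicodeDecodeError."""
    try:
        naive_smart_unicode(obj)
    except UnicodeDecodeError:
        return True
    return False
```

With this sketch, an ASCII-only name converts cleanly, while a name containing a non-ASCII character such as "\u00e9" triggers the error.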

Attachments

util.diff (0.6 kB) - added by nesh <nesh [at] studioquattro [dot] co [dot] yu> on 01/17/07 07:06:07.
fast fix
form_utils.diff (0.6 kB) - added by nesh <nesh [at] studioquattro [dot] co [dot] yu> on 01/30/07 04:33:20.
smart_unicode patch
forms_test.diff (0.9 kB) - added by nesh <nesh [at] studioquattro [dot] co [dot] yu> on 01/30/07 04:34:08.
test case

Change History

01/17/07 07:06:07 changed by nesh <nesh [at] studioquattro [dot] co [dot] yu>

  • attachment util.diff added.

fast fix

01/17/07 07:06:28 changed by nesh <nesh [at] studioquattro [dot] co [dot] yu>

  • summary changed from smart_unicode throws UnicodeDecodeError when given an instance with a utf-8 encoded string to [patch] smart_unicode throws UnicodeDecodeError when given an instance with a utf-8 encoded string.

01/17/07 17:39:59 changed by mir@noris.de

  • needs_better_patch set to 1.
  • needs_tests set to 1.
  • stage changed from Unreviewed to Accepted.

Hi Nesh, thanks for spotting this! I don't understand one thing about your patch: why do you first try to decode from ASCII and only use the DEFAULT_CHARSET when that fails? I'd simply do it like this:

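from django.conf import settings

def smart_unicode(s):
    if not isinstance(s, basestring):
        # Arbitrary object: take str() of it, then decode with the site charset.
        s = unicode(str(s), settings.DEFAULT_CHARSET)
    elif not isinstance(s, unicode):
        # Plain byte string: decode with the site charset.
        s = unicode(s, settings.DEFAULT_CHARSET)
    return s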
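
One possible reading of the question above is that the ASCII attempt is redundant whenever the configured charset is an ASCII superset such as utf-8. The following Python 3 sketch compares the two strategies under that assumption. `DEFAULT_CHARSET` here is a local stand-in for `settings.DEFAULT_CHARSET`, and both function names are hypothetical; the actual contents of util.diff are not shown in this ticket.

```python
# Python 3 sketch, assuming an ASCII-compatible charset (utf-8).
DEFAULT_CHARSET = "utf-8"  # stand-in for settings.DEFAULT_CHARSET


def decode_ascii_first(data, charset=DEFAULT_CHARSET):
    # Strategy as described in the comment above: try ASCII, then fall back.
    try:
        return data.decode("ascii")
    except UnicodeDecodeError:
        return data.decode(charset)


def decode_direct(data, charset=DEFAULT_CHARSET):
    # Simpler strategy: always decode with the configured charset.
    return data.decode(charset)


# One pure-ASCII sample and one UTF-8 sample with a non-ASCII character.
samples = [b"plain ascii", "caf\u00e9".encode("utf-8")]
results_match = all(decode_ascii_first(d) == decode_direct(d) for d in samples)
```

For these samples both strategies return the same text. That would not hold for a charset that is not ASCII-compatible, which is outside what this sketch covers.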

01/18/07 02:54:53 changed by nesh <nesh [at] studioquattro [dot] co [dot] yu>

Yes, that was my first idea, but I wasn't sure why str is used in the first place, so I wrapped it in a try...except block as a quick fix.

Also, if the str call is not essential, then we can simply use:

def smart_unicode(s):
    if not isinstance(s, unicode):
        s = unicode(str(s), settings.DEFAULT_CHARSET)
    return s

Regarding the tests, I'll try to add some and send an updated patch (with tests) during the day.

01/30/07 04:33:20 changed by nesh <nesh [at] studioquattro [dot] co [dot] yu>

  • attachment form_utils.diff added.

smart_unicode patch

01/30/07 04:34:08 changed by nesh <nesh [at] studioquattro [dot] co [dot] yu>

  • attachment forms_test.diff added.

test case

01/30/07 04:41:32 changed by nesh <nesh [at] studioquattro [dot] co [dot] yu>

The patch is fixed. I also added a special case for objects that implement __unicode__, and the test case.

Sorry for the duplicate form_utils.diff attachment.
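
For readers without the attachment, the idea of special-casing `__unicode__` can be sketched as follows. This is a Python 3 approximation under stated assumptions, not the contents of form_utils.diff or of the committed change: Python 3 has no `unicode` builtin, so the sketch calls a `__unicode__` method directly, uses `__bytes__` in place of a Python 2 `__str__` that returns bytes, and uses a local `DEFAULT_CHARSET` in place of `settings.DEFAULT_CHARSET`. The two example classes are hypothetical.

```python
# Python 3 approximation of the dispatch order (an assumption, not the patch).
DEFAULT_CHARSET = "utf-8"  # stand-in for settings.DEFAULT_CHARSET


def smart_unicode(s, charset=DEFAULT_CHARSET):
    if isinstance(s, str):
        return s                          # already text (Python 2: unicode)
    if isinstance(s, bytes):
        return s.decode(charset)          # byte string (Python 2: str)
    if hasattr(s, "__unicode__"):
        return s.__unicode__()            # object supplies its own text form
    if hasattr(s, "__bytes__"):
        return bytes(s).decode(charset)   # decode the object's byte form
    return str(s)                         # anything else, e.g. numbers


class WithUnicode:
    """Hypothetical object that returns text directly."""

    def __unicode__(self):
        return "caf\u00e9"


class WithBytes:
    """Hypothetical object that returns UTF-8 encoded bytes."""

    def __bytes__(self):
        return "caf\u00e9".encode("utf-8")
```

Both example objects convert to the same text, and no step decodes with the ASCII default.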

01/30/07 14:20:38 changed by mir@noris.de

#3403 marked as duplicate

02/14/07 22:13:03 changed by adrian

  • status changed from new to closed.
  • resolution set to fixed.

(In [4522]) Fixed #3314 -- Fixed a bug in newforms smart_unicode. Thanks for the patch, nesh

