Opened 18 years ago
Closed 18 years ago
#3314 closed defect (fixed)
[patch] smart_unicode throws UnicodeDecodeError when got a instance with utf-8 encoded string
Reported by: | nesh <nesh [at] studioquattro [dot] co [dot] yu> | Owned by: | Adrian Holovaty |
---|---|---|---|
Component: | Forms | Version: | |
Severity: | normal | Keywords: | |
Cc: | nesh@… | Triage Stage: | Accepted |
Has patch: | yes | Needs documentation: | no |
Needs tests: | yes | Patch needs improvement: | yes |
Easy pickings: | no | UI/UX: | no |
Description
When smart_unicode gets an instance instead of a string, it uses unicode(str(s))
to convert it to a string, but if the instance returns a UTF-8
encoded string, you will get a UnicodeDecodeError.
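The failure can be sketched in modern Python, with some assumptions. Python 2's unicode(str(s)) decodes with the interpreter's default encoding, which is ASCII. In the Python 3 sketch below, bytes(obj) stands in for str(s) and an explicit .decode("ascii") stands in for the implicit decode. The Label class is hypothetical and is not part of Django.

```python
class Label:
    """Hypothetical object whose byte representation is UTF-8 encoded."""

    def __init__(self, text):
        self.text = text

    def __bytes__(self):
        return self.text.encode("utf-8")


# Plays the role of str(s) in the Python 2 code from the ticket.
raw = bytes(Label("Ni\u0161"))

try:
    # Plays the role of the implicit ASCII decode in unicode(str(s)).
    raw.decode("ascii")
    failed = False
except UnicodeDecodeError:
    failed = True

# Decoding with the charset the bytes were actually encoded in succeeds.
decoded = raw.decode("utf-8")
```

Here the ASCII decode raises UnicodeDecodeError because the bytes contain a non-ASCII character, while the UTF-8 decode recovers the original text.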
Attachments (3)
Change History (9)
comment:1 by , 18 years ago
Summary: | smart_unicode throws UnicodeDecodeError when got a instance with utf-8 encoded string → [patch] smart_unicode throws UnicodeDecodeError when got a instance with utf-8 encoded string |
---|
comment:2 by , 18 years ago
Needs tests: | set |
---|---|
Patch needs improvement: | set |
Triage Stage: | Unreviewed → Accepted |
Hi Nesh, thanks for spotting this! I don't understand one thing about your patch: why do you first try to decode from ASCII and only use DEFAULT_CHARSET when that fails? I'd simply do it like this:
def smart_unicode(s):
    if not isinstance(s, basestring):
        s = unicode(str(s), settings.DEFAULT_CHARSET)
    elif not isinstance(s, unicode):
        s = unicode(s, settings.DEFAULT_CHARSET)
    return s
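The reviewer's question can be illustrated with a small Python 3 sketch. It assumes DEFAULT_CHARSET is UTF-8 (or another ASCII-compatible charset), in which case trying ASCII first cannot change the result, because every valid ASCII byte string decodes to the same text under UTF-8. The function names and the DEFAULT_CHARSET constant are hypothetical stand-ins, not the code from the attached patch.

```python
DEFAULT_CHARSET = "utf-8"  # assumption: stands in for settings.DEFAULT_CHARSET


def decode_with_fallback(raw, charset=DEFAULT_CHARSET):
    """Try ASCII first and fall back to the configured charset."""
    try:
        return raw.decode("ascii")
    except UnicodeDecodeError:
        return raw.decode(charset)


def decode_direct(raw, charset=DEFAULT_CHARSET):
    """Decode with the configured charset straight away."""
    return raw.decode(charset)


samples = [b"plain", b"Ni\xc5\xa1", b""]
same = all(decode_with_fallback(r) == decode_direct(r) for r in samples)
```

For these samples both functions agree, which suggests the extra try block is redundant under the stated assumption. With a charset that is not ASCII-compatible, the two approaches could differ.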
comment:3 by , 18 years ago
Yes, that was my first idea, but I'm not sure why str is used in the first
place, so I wrapped this in a try...except block as a quick fix.
Also, if the str call is not essential, then we can simply use:
def smart_unicode(s):
    if not isinstance(s, unicode):
        s = unicode(str(s), settings.DEFAULT_CHARSET)
    return s
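For readers on Python 3, where unicode and basestring no longer exist, the same idea might look roughly like the sketch below. It is an illustration under assumptions, not Django's implementation: smart_text and DEFAULT_CHARSET are hypothetical names, and the __bytes__ check is one possible counterpart to calling str(s) on an instance in Python 2.

```python
DEFAULT_CHARSET = "utf-8"  # assumption: stands in for settings.DEFAULT_CHARSET


def smart_text(s, charset=DEFAULT_CHARSET):
    """Return s as text, decoding any byte data with an explicit charset."""
    if isinstance(s, str):
        return s  # already text, nothing to decode
    if isinstance(s, bytes):
        return s.decode(charset)
    if hasattr(s, "__bytes__"):
        # The object offers an encoded representation; decode it explicitly
        # instead of relying on an implicit default encoding.
        return bytes(s).decode(charset)
    return str(s)  # plain objects such as numbers


class Label:
    """Hypothetical object that exposes UTF-8 encoded bytes."""

    def __bytes__(self):
        return "Ni\u0161".encode("utf-8")


text_from_instance = smart_text(Label())
text_from_bytes = smart_text(b"Ni\xc5\xa1")
text_from_number = smart_text(42)
```

The key design point carried over from the ticket is that the charset is always passed explicitly, so no decode step depends on the interpreter's default encoding.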
Regarding the tests, I'll try to add some and send an updated patch (with tests) during the day.
comment:4 by , 18 years ago
The patch is fixed. I also added a special case for objects that implement __unicode__,
plus a test case.
Sorry for the duplicate form_utils.diff attachment.
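The attached patch is not reproduced in this ticket, so the following is only a guess at the shape of the __unicode__ special case and its test, written as a Python 3 sketch. In Python 3, __unicode__ is an ordinary method name with no special meaning, so the check is done with hasattr. The names smart_text, City, and SmartTextTests are hypothetical.

```python
import unittest

DEFAULT_CHARSET = "utf-8"  # assumption: stands in for settings.DEFAULT_CHARSET


def smart_text(s, charset=DEFAULT_CHARSET):
    """Return s as text, preferring an object's own __unicode__ method."""
    if isinstance(s, str):
        return s
    if hasattr(s, "__unicode__"):
        # Special case: the object already knows how to produce text.
        return s.__unicode__()
    if isinstance(s, bytes):
        return s.decode(charset)
    return str(s)


class City:
    """Hypothetical object that implements __unicode__."""

    def __init__(self, name):
        self.name = name

    def __unicode__(self):
        return self.name


class SmartTextTests(unittest.TestCase):
    def test_unicode_method_is_used(self):
        self.assertEqual(smart_text(City("Ni\u0161")), "Ni\u0161")

    def test_bytes_are_decoded_with_charset(self):
        self.assertEqual(smart_text(b"Ni\xc5\xa1"), "Ni\u0161")

    def test_text_passes_through(self):
        self.assertEqual(smart_text("plain"), "plain")
```

The tests can be run with the standard unittest runner, for example by loading SmartTextTests into a suite.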
comment:6 by , 18 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
fast fix