#25730 closed Cleanup/optimization (fixed)
base.Model __str__ sometimes returns unicode on Python 2
Reported by: | Kevin Turner | Owned by: | Simon Charette |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | dev |
Severity: | Normal | Keywords: | |
Cc: | Triage Stage: | Ready for checkin | |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | yes | UI/UX: | no |
Description (last modified by )
base.py contains from __future__ import unicode_literals, so the "" that Model.__str__ returns is a unicode object in the case where hasattr(self, '__unicode__') is False.
There's a force_text().encode() on the other case; probably both cases should use smart_str.
Discovered under Django 1.7 but this looks to still be the case in the master branch.
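A rough sketch of the idea, not the actual patch: a default __str__ that always returns the interpreter's native str type. to_native_str and Plain are hypothetical names made up for illustration (to_native_str only approximates the role smart_str is suggested for above), and nothing here imports Django.

```python
import sys

PY2 = sys.version_info[0] == 2


def to_native_str(value, encoding="utf-8"):
    """Return value as the interpreter's native str type.

    Hypothetical helper: the result is bytes on Python 2 and text on Python 3.
    """
    if PY2:
        # Native str is bytes here, so text has to be encoded.
        return value if isinstance(value, bytes) else value.encode(encoding)
    # Native str is text here, so bytes have to be decoded.
    return value.decode(encoding) if isinstance(value, bytes) else value


class Plain(object):
    """Hypothetical class that defines no __unicode__ method."""

    def __str__(self):
        # The u prefix mimics what unicode_literals does to a bare literal.
        return to_native_str(u"%s object" % self.__class__.__name__)
```

With this, type(Plain().__str__()) is str on either major version, which is the property the ticket asks for.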
Change History (9)
comment:1 by , 9 years ago
Description: | modified (diff) |
---|
comment:2 by , 9 years ago
Resolution: | → worksforme |
---|---|
Status: | new → closed |
comment:3 by , 9 years ago
Resolution: | worksforme |
---|---|
Status: | closed → new |
Triage Stage: | Unreviewed → Accepted |
Type: | Bug → Cleanup/optimization |
Version: | 1.8 → master |
Ahh, I think I see what you mean by unicode string returned.
The default __str__ implementation returns u'%s object' % self.__class__.__name__ if not hasattr(self, '__unicode__').
It wouldn't hurt to simply prefix the default format with a b, but I don't think it's worth a backport. Most calls to __str__ are made implicitly through str(), which automatically tries to encode('ascii') when fed unicode.
This isn't really an issue because self.__class__.__name__ can only contain ASCII characters on Python 2, and thus the returned string is always successfully coerced to a bytestring when required, unless I'm missing something.
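To make the coercion argument concrete, here is a small model of that implicit step, written so it runs on Python 3. py2_implicit_str is a hypothetical helper that only imitates the encode('ascii') call Python 2 performs inside str(); the class name "Article" is made up.

```python
def py2_implicit_str(text):
    """Imitate Python 2's str(<unicode>): encode the text as ASCII."""
    return text.encode("ascii")


# An ASCII-only class name survives the implicit coercion.
ascii_result = py2_implicit_str(u"%s object" % "Article")

# Non-ASCII text does not; the name below is u"bär object".
try:
    py2_implicit_str(u"b\u00e4r object")
    non_ascii_coerced = True
except UnicodeEncodeError:
    non_ascii_coerced = False
```

This only covers the str() path. It says nothing about % formatting that mixes the returned unicode with bytestrings.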
comment:4 by , 9 years ago
Returning the wrong type will gum things up even if the content of that response is only ascii characters, as in this example.
Those format operations aren't always in places where you can change them over to all use __unicode__, because they're embedded in things like TestCase messages, which is where I encountered it.
Having the test-failure message fail to format itself and crash is pretty annoying when you're trying to write or fix a test.
(edited to fix URL of example)
comment:5 by , 9 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
Didn't know about Python 2 behavior in this case.
b'%s' % u'foo' # works
b'%s' % u'bär' # works
b'%s' % u'bär'.encode('utf-8') # works
b'%s - %s' % (u'foo', u'bär'.encode('utf-8')) # fails
It looks like this issue has existed since we added Python 3 support in Django 1.5.
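For anyone without a Python 2 interpreter at hand, the four cases above can be approximated on Python 3 with a simplified model. py2_style_format is a hypothetical function, not a real API. It assumes that Python 2 promotes the whole operation to unicode as soon as one argument is unicode, decoding the format string and any bytestring arguments as ASCII.

```python
def py2_style_format(fmt, args):
    """Simplified model of Python 2's bytestring % args with mixed types."""
    if not isinstance(args, tuple):
        args = (args,)
    if not any(isinstance(arg, str) for arg in args):
        # Only bytestrings involved: the result stays a bytestring.
        return fmt % args
    # A text argument promotes everything to text via an ASCII decode.
    promoted = tuple(
        arg.decode("ascii") if isinstance(arg, bytes) else arg
        for arg in args
    )
    return fmt.decode("ascii") % promoted


bar = u"b\u00e4r"  # u"bär"

only_text = py2_style_format(b"%s", bar)                   # text result
only_bytes = py2_style_format(b"%s", bar.encode("utf-8"))  # bytes result

try:
    py2_style_format(b"%s - %s", (u"foo", bar.encode("utf-8")))
    mixed_failed = False
except UnicodeDecodeError:
    # The UTF-8 bytes cannot be decoded as ASCII during promotion.
    mixed_failed = True
```

In this model the last case fails because the unicode argument forces an ASCII decode of the UTF-8 bytestring next to it, which matches the failure pattern listed above.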
comment:7 by , 9 years ago
Triage Stage: | Accepted → Ready for checkin |
---|
Hi @keturn,
Is this just something you assumed from reading the source?
Could you provide a failing test case? From my understanding, force_text().encode('utf-8') returns a utf-8 encoded bytestring.