Opened 8 years ago

Closed 8 years ago

Last modified 8 years ago

#25730 closed Cleanup/optimization (fixed)

base.Model __str__ sometimes returns unicode on Python 2

Reported by: Kevin Turner
Owned by: Simon Charette
Component: Database layer (models, ORM)
Version: dev
Severity: Normal
Keywords:
Cc:
Triage Stage: Ready for checkin
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: yes
UI/UX: no

Description (last modified by Kevin Turner)

base.py contains from __future__ import unicode_literals

so the "" that Model.__str__ returns is a unicode object in the case where hasattr(self, '__unicode__') is False.

There's a force_text().encode() on the other case; probably both cases should use smart_str.

Discovered under Django 1.7 but this looks to still be the case in the master branch.

Change History (9)

comment:1 by Kevin Turner, 8 years ago

Description: modified (diff)

comment:2 by Simon Charette, 8 years ago

Resolution: worksforme
Status: new → closed

Hi @keturn,

Is this just something you assumed from reading the source?

Could you provide a failing test case? From my understanding, force_text().encode('utf-8') returns a UTF-8 encoded bytestring.

comment:3 by Simon Charette, 8 years ago

Resolution: worksforme
Status: closed → new
Triage Stage: Unreviewed → Accepted
Type: Bug → Cleanup/optimization
Version: 1.8 → master

Ahh I think I see what you mean by unicode string returned.

The default __str__ implementation returns u'%s object' % self.__class__.__name__ if not hasattr(self, '__unicode__').

It wouldn't hurt to simply prefix the default format with a b but I don't think it's worth a backport. Most calls to __str__ are made implicitly through str() which automatically tries to encode('ascii') when fed unicode.

This isn't really an issue because self.__class__.__name__ can only contain ASCII characters on Python 2, so the returned string is always successfully coerced to a bytestring when required, unless I'm missing something.
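A minimal sketch of the default being discussed, using a hypothetical stand-in class (ModelSketch) rather than Django's actual Model. The u'' prefix mimics what unicode_literals does to the literal in base.py, and wrapping the result in str() is one assumed way to get the native string type on both Python versions. It is not necessarily what the eventual patch does.

```python
import sys

PY2 = sys.version_info[0] == 2


class ModelSketch(object):
    """Hypothetical stand-in for a model base class; not Django's code."""

    def __str__(self):
        if PY2 and hasattr(self, '__unicode__'):
            # Python 2 only: encode the text form to a UTF-8 bytestring.
            return unicode(self).encode('utf-8')  # noqa: F821
        # The u'' literal is text on both versions. str() converts it to
        # the native string type: a bytestring on Python 2, text on Python 3.
        return str(u'%s object' % self.__class__.__name__)


class Article(ModelSketch):
    """Hypothetical subclass that defines no __unicode__."""


default_repr = Article().__str__()
```

A b prefix on the literal would also give a bytestring on Python 2, but on Python 3 it would produce bytes, which __str__ is not allowed to return.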

Last edited 8 years ago by Simon Charette

comment:4 by Kevin Turner, 8 years ago

Returning the wrong type will gum things up even if the content of that response is only ascii characters, as in this example.

Those format operations aren't always in places where you can change them over to all use __unicode__, because they're embedded in things like TestCase messages, which is where I encountered it.

Having the test-failure message fail to format itself and crash is pretty annoying when you're trying to write or fix a test.

(edited to fix URL of example)

Last edited 8 years ago by Kevin Turner

comment:5 by Simon Charette, 8 years ago

Owner: changed from nobody to Simon Charette
Status: new → assigned

Didn't know about Python 2 behavior in this case.

b'%s' % u'foo' # works

b'%s' % u'bär' # works

b'%s' % u'bär'.encode('utf-8') # works

b'%s - %s' % (u'foo', u'bär'.encode('utf-8')) # fails

It looks like this issue has existed since we added Python 3 support in Django 1.5.
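The four cases above rely on Python 2's implicit coercion, which Python 3 no longer performs. As a rough model only, the hypothetical helper below (py2_percent_format, not part of Python or Django) imitates that rule with Python 3's bytes and str: once any argument is text, the whole format is redone as text, and every bytestring is decoded as ASCII.

```python
def py2_percent_format(template, args):
    """Approximate Python 2's `str % args` with Python 3 bytes and str."""
    if any(isinstance(arg, str) for arg in args):
        # One text argument makes Python 2 redo the format as unicode,
        # decoding the template and each bytestring argument as ASCII.
        text_args = tuple(
            arg.decode('ascii') if isinstance(arg, bytes) else arg
            for arg in args
        )
        return template.decode('ascii') % text_args
    # With only bytestrings, no implicit decoding takes place.
    return template % args


BAER = 'b\u00e4r'  # the text 'bär' from the examples above

only_text = py2_percent_format(b'%s', (BAER,))
only_bytes = py2_percent_format(b'%s', (BAER.encode('utf-8'),))

try:
    py2_percent_format(b'%s - %s', ('foo', BAER.encode('utf-8')))
    mixed_failed = False
except UnicodeDecodeError:
    # The UTF-8 bytes of 'bär' are not valid ASCII, so decoding fails.
    mixed_failed = True
```

Under this model the failure depends only on a text value and a non-ASCII bytestring meeting in the same format call, which is why an ASCII-only unicode return value from __str__ can still break a test-failure message.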

comment:6 by Simon Charette, 8 years ago

Has patch: set

Created a PR.

comment:7 by Tim Graham, 8 years ago

Triage Stage: Accepted → Ready for checkin

comment:8 by Simon Charette <charette.s@…>, 8 years ago

Resolution: fixed
Status: assigned → closed

In 4cd5d84:

Fixed #25730 -- Made Model.__str__() always return str instances.

Thanks to Kevin Turner for the report and Tim for the review.

comment:9 by Simon Charette <charette.s@…>, 8 years ago

In 946e7679:

[1.9.x] Fixed #25730 -- Made Model.__str__() always return str instances.

Thanks to Kevin Turner for the report and Tim for the review.

Backport of 4cd5d846d4a0a62bba6edf435a8ae9c6dcfacb43 from master
