Opened 16 years ago

Closed 16 years ago

#7412 closed (invalid)

i18n crash on non-ASCII (UTF-8 encoded) docstrings

Reported by: AV <av0000@…>
Owned by: nobody
Component: contrib.admin
Version: dev
Severity:
Keywords: i18n utf8 unicode
Cc: av0000@…
Triage Stage: Unreviewed
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description

A Unicode decoding error occurs in i18n.py when a source file contains non-ASCII docstrings.

Example:

Navigate to yoursite/documentation/models using the following example model (the source file is UTF-8 encoded):

# -*- coding: utf-8 -*-
# ...
class TestModel(models.Model):
    """
    Test description with non-ASCII symbols.
    АаБбВвГг
    """

Workaround #1: use the u"""....""" format for docstrings. This does not work on TAGS, though, because the tag parser is different :(

Workaround/patch #2: patch django/templatetags/i18n.py so that TranslateNode.render calls translation.ugettext(smart_unicode(value)) instead of translation.ugettext(value).

Attachments (1)

i18n.diff (826 bytes) - added by AV <av0000@…> 16 years ago.

Change History (4)

by AV <av0000@…>, 16 years ago

Attachment: i18n.diff added

comment:1 by Simon Greenhill, 16 years ago

Triage Stage: Unreviewed → Ready for checkin

comment:2 by Marc Garcia, 16 years ago

Triage Stage: Ready for checkin → Unreviewed

I can't reproduce the error. Adding non-ASCII characters to the model's docstrings doesn't make Django raise any encoding error. Could you provide more information about this issue, please?

comment:3 by Karen Tracey <kmtracey@…>, 16 years ago

Resolution: invalid
Status: new → closed

I was able to recreate the problem using the TestModel exactly as shown. However, I don't believe that is a valid test case. Per PEP 257 (http://www.python.org/dev/peps/pep-0257/):

For Unicode docstrings, use u"""Unicode triple-quoted strings""".

That is, the model should be specified like so:

# -*- coding: utf-8 -*-
# ...
class TestModel(models.Model):
    u"""
    Test description with non-ASCII symbols.
    АаБбВвГг
    """

I believe that is the correct way to avoid the problem, since the model's __doc__ attribute will then be a Unicode object instead of a bytestring. Django can't really fix up a bytestring returned by __doc__ after the fact because the correct encoding is buried in the 'coding:' specification on the first line of the file containing the Model definition. The patch provided probably seemed to work because the system default locale encoding matched the file encoding, but you cannot count on that. I did verify that the recommended way of specifying the docstring works correctly on newforms-admin.
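
To illustrate why a bytestring docstring cannot be reliably repaired after the fact, here is a minimal standalone sketch (not Django code). The class name TestDoc, the helper decode_doc, and the choice of latin-1 as the "wrong" guess are assumptions made for illustration only. raw_doc stands in for the bytes that a non-Unicode __doc__ would hold when the source file is UTF-8 encoded.

```python
# -*- coding: utf-8 -*-

class TestDoc(object):
    u"""
    Test description with non-ASCII symbols.
    АаБбВвГг
    """

# With the u prefix, __doc__ is already text, so no decoding guess is needed.
doc_is_text = isinstance(TestDoc.__doc__, type(u""))

# Stand-in for a bytestring docstring read from a UTF-8 encoded source file.
raw_doc = u"АаБбВвГг".encode("utf-8")


def decode_doc(raw, assumed_encoding):
    """Decode raw bytes with a guessed encoding; return None if decoding fails."""
    try:
        return raw.decode(assumed_encoding)
    except UnicodeDecodeError:
        return None


matching = decode_doc(raw_doc, "utf-8")       # guess happens to match the file
ascii_guess = decode_doc(raw_doc, "ascii")    # fails outright
wrong_guess = decode_doc(raw_doc, "latin-1")  # succeeds, but the text is garbage
```

Only the guess that happens to match the file's real encoding recovers the original text. The other two either fail or silently produce the wrong characters, which is why decoding at the point of use only appears to work on systems where the guess lines up with the 'coding:' line.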