Opened 7 years ago

Closed 7 years ago

#7412 closed (invalid)

i18n crash on non-ASCII (UTF-8 encoded) docstrings

Reported by: AV <av0000@…> Owned by: nobody
Component: contrib.admin Version: master
Severity: Keywords: i18n utf8 unicode
Cc: av0000@… Triage Stage: Unreviewed
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: UI/UX:

Description

A Unicode decoding error occurs in i18n.py when a source file contains non-ASCII docstrings.

Example:

Navigate to yoursite/documentation/models using the following model example (the source file is UTF-8 encoded):

# -*- coding: utf-8 -*-
# ...
class TestModel(models.Model):
    """
    Test description with non-ASCII symbols.
    АаБбВвГг
    """

Workaround #1: use the u"""....""" format, but not on TAGS; the tag parser is different :(

Workaround/patch #2: patch django/templatetags/i18n.py so that TranslateNode.render calls translation.ugettext(smart_unicode(value)) instead of translation.ugettext(value)

Attachments (1)

i18n.diff (826 bytes) - added by AV <av0000@…> 7 years ago.

Change History (4)

Changed 7 years ago by AV <av0000@…>

comment:1 Changed 7 years ago by Simon Greenhill

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Triage Stage changed from Unreviewed to Ready for checkin

comment:2 Changed 7 years ago by garcia_marc

  • Triage Stage changed from Ready for checkin to Unreviewed

I can't reproduce the error. Adding non-ASCII characters to the docstrings on the model doesn't make Django raise any encoding error. Could you provide more information about this issue, please?

comment:3 Changed 7 years ago by Karen Tracey <kmtracey@…>

  • Resolution set to invalid
  • Status changed from new to closed

I was able to recreate the problem using the TestModel exactly as shown. However, I don't believe that is a valid test case. Per PEP 257 (http://www.python.org/dev/peps/pep-0257/):

For Unicode docstrings, use u"""Unicode triple-quoted strings""".

That is, the model should be specified like so:

# -*- coding: utf-8 -*-
# ...
class TestModel(models.Model):
    u"""
    Test description with non-ASCII symbols.
    АаБбВвГг
    """

I believe that is the correct way to avoid the problem, since the model's __doc__ attribute will then be a Unicode object instead of a bytestring. Django can't really fix up a bytestring returned by __doc__ after the fact because the correct encoding is buried in the 'coding:' specification on the first line of the file containing the Model definition. The patch provided probably seemed to work because the system default locale encoding matched the file encoding, but you cannot count on that. I did verify that the recommended way of specifying the docstring works correctly on newforms-admin.
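
To illustrate why the encoding of a bytestring cannot be recovered after the fact, here is a minimal sketch. It is written for Python 3, where bytes stands in for the Python 2 bytestring held by __doc__. The helper try_decode and the variable names are hypothetical and exist only for this illustration; they are not part of Django or of the attached patch.

```python
# Sketch only: `bytes` plays the role of a Python 2 bytestring docstring.
doc_text = "АаБбВвГг"
raw_doc = doc_text.encode("utf-8")  # the bytes a utf-8 source file would hold


def try_decode(raw, encoding):
    """Hypothetical helper: return decoded text, or None if decoding fails."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


# Decoding as ASCII fails, because every byte here is above 0x7F.
ascii_result = try_decode(raw_doc, "ascii")

# Decoding with the encoding named on the file's 'coding:' line works.
utf8_result = try_decode(raw_doc, "utf-8")

# Decoding with a different single-byte encoding raises no error,
# but silently produces the wrong characters.
latin1_result = try_decode(raw_doc, "latin-1")

print(ascii_result is None)
print(utf8_result == doc_text)
print(latin1_result == doc_text)
```

Nothing in the bytes themselves says which of these decodings is the right one, which is why a guess that happens to match the file encoding can appear to work on one system and fail, or silently corrupt the text, on another.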
