Opened 15 years ago

Closed 15 years ago

#5657 closed (fixed)

[patch] urlize breaks when string.letters is changed by the locale

Reported by: Andrew Stoneman <astoneman@…>
Owned by: nobody
Component: Uncategorized
Version: dev
Severity:
Keywords: sprintdec01
Cc:
Triage Stage: Ready for checkin
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no


utils.html.urlize() depends on string.letters being implicitly convertible to unicode. When the locale module changes string.letters, however, it can contain non-ASCII characters, which causes very mysterious UnicodeDecodeError exceptions on sites that use the function. I've included a patch that makes it use string.ascii_letters instead, which is not affected by the locale.
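A minimal sketch of the idea behind the patch, not the actual Django code: build the character set from string.ascii_letters so that it never depends on the locale. The names SAFE_LEADING_CHARS and starts_with_safe_char are hypothetical and used only for illustration.

```python
import string

# Hypothetical constant: a fixed alphabet that locale changes cannot alter.
SAFE_LEADING_CHARS = string.ascii_letters + string.digits


def starts_with_safe_char(text):
    """Return True if text begins with an ASCII letter or digit."""
    return bool(text) and text[0] in SAFE_LEADING_CHARS
```

Because the constant is pure ASCII, membership checks against it behave the same regardless of the server's locale setting.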

Attachments (2)

urlize_ascii_letters.patch (856 bytes) - added by Andrew Stoneman <astoneman@…> 15 years ago.
patch to use ascii_letters instead of letters
urlize_with_ascii_plus_unittests.patch (3.1 KB) - added by shaleh 15 years ago.
updated patch, adds unittests


Change History (7)

Changed 15 years ago by Andrew Stoneman <astoneman@…>

Attachment: urlize_ascii_letters.patch added

patch to use ascii_letters instead of letters

comment:1 Changed 15 years ago by Andrew Stoneman <astoneman@…>

A sample session to show the problem:

>>> import locale
>>> import string
>>> from django.utils.html import urlize
>>> string.letters
>>> locale.setlocale(locale.LC_ALL, 'de_DE')
>>> string.letters
>>> urlize('abc')
Traceback (most recent call last):
  File "<console>", line 1, in ?
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/django/utils/", line 129, in wrapper
    return func(*args, **kwargs)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/django/utils/", line 82, in urlize
    if middle.startswith('www.') or ('@' not in middle and not middle.startswith('http://') and \
UnicodeDecodeError: 'ascii' codec can't decode byte 0xaa in position 52: ordinal not in range(128)
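
The failure mode in the traceback can be approximated on a current Python with explicit byte strings. This is a sketch under a stated assumption: the locale-modified string.letters is taken to be the 52 ASCII letters followed by extra single-byte letters. Only the byte 0xaa and its position come from the traceback; the rest of the value is not shown in the ticket. The helper try_ascii_decode is hypothetical.

```python
import string

# Assumption: 52 ASCII letters followed by one non-ASCII byte (0xaa), which
# mirrors the byte and position reported in the traceback above.
simulated_letters = string.ascii_letters.encode("ascii") + b"\xaa"


def try_ascii_decode(data):
    """Decode data as ASCII; return (text, None) or (None, error)."""
    try:
        return data.decode("ascii"), None
    except UnicodeDecodeError as exc:
        return None, exc


_, error = try_ascii_decode(simulated_letters)
print(error)
```

Here error.start is 52, the first byte past the ASCII letters, which is why a pure-ASCII constant avoids the problem.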

comment:2 Changed 15 years ago by Simon G <dev@…>

Needs tests: set
Triage Stage: Unreviewed → Ready for checkin

Andrew - this looks good, but can we get a regression test (like your example above)?

Changed 15 years ago by shaleh

updated patch, adds unittests

comment:3 Changed 15 years ago by shaleh

Keywords: sprintdec01 added
Needs tests: unset

unittests added

comment:4 Changed 15 years ago by Malcolm Tredinnick

Unfortunately, the test case isn't sufficiently portable (for example, on my Ubuntu laptop, it fails with an "invalid locale" error). Rather than worrying too much about lots of different installation situations, I'm just going to commit the core patch. It's correct and I can live without tests for this small change.
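
One way to sidestep the "invalid locale" failure is to skip the test when the locale is not installed. The following is only a sketch of that approach, not the test from the attached patch: the class name is hypothetical, and it checks string.ascii_letters directly rather than calling urlize.

```python
import locale
import string
import unittest

EXPECTED = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"


class AsciiLettersLocaleTest(unittest.TestCase):
    def setUp(self):
        # Remember the current locale so the test leaves no global side effects.
        self.saved = locale.setlocale(locale.LC_ALL)

    def tearDown(self):
        locale.setlocale(locale.LC_ALL, self.saved)

    def test_ascii_letters_unchanged(self):
        try:
            locale.setlocale(locale.LC_ALL, "de_DE")
        except locale.Error:
            # Not every system ships this locale; skip rather than fail.
            self.skipTest("locale 'de_DE' is not installed on this system")
        self.assertEqual(string.ascii_letters, EXPECTED)
```

The trade-off is that on machines without the locale the test reports a skip and verifies nothing.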

comment:5 Changed 15 years ago by Malcolm Tredinnick

Resolution: fixed
Status: new → closed

(In [6856]) Fixed #5657 -- Use string.ascii_letters instead of string.letters in the urlize
filter to ensure consistent (and correct) results no matter what the server's
locale setting might be. Thanks, Andrew Stoneman.
