Opened 17 years ago
Closed 17 years ago
#5657 closed (fixed)
[patch] urlize breaks when string.letters is changed by the locale
Reported by: | | Owned by: | nobody |
---|---|---|---|
Component: | Uncategorized | Version: | dev |
Severity: | | Keywords: | sprintdec01 |
Cc: | | Triage Stage: | Ready for checkin |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
utils.html.urlize() depends on string.letters being automatically convertible to unicode. When the locale module changes string.letters, however, it can contain non-ASCII characters, which causes hard-to-diagnose UnicodeDecodeError exceptions on sites that use the function. I've included a patch that makes it use string.ascii_letters instead, which does not change with the locale.
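As a rough illustration of the idea behind the patch (not the patch itself), the sketch below relies on string.ascii_letters, which is a fixed constant regardless of locale. The helper name starts_with_ascii_letter is hypothetical and exists only for this example; it is written for Python 3, where string.letters no longer exists.

```python
import string

# string.ascii_letters is a fixed constant; locale.setlocale() does not alter it.
ASCII_LETTERS = string.ascii_letters


def starts_with_ascii_letter(text):
    """Hypothetical helper: True if text begins with an ASCII letter."""
    return bool(text) and text[0] in ASCII_LETTERS


# Every character in the constant is plain ASCII, so mixing it with
# unicode text never requires decoding non-ASCII bytes.
only_ascii = all(ord(ch) < 128 for ch in ASCII_LETTERS)
```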
Attachments (2)
Change History (7)
by , 17 years ago
Attachment: | urlize_ascii_letters.patch added |
---|
comment:1 by , 17 years ago
A sample session to show the problem:
>>> import locale
>>> import string
>>> from django.utils.html import urlize
>>> string.letters
'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
>>> locale.setlocale(locale.LC_ALL, 'de_DE')
'de_DE'
>>> string.letters
'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz\xaa\xb5\xba\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff'
>>> urlize('abc')
Traceback (most recent call last):
  File "<console>", line 1, in ?
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/django/utils/functional.py", line 129, in wrapper
    return func(*args, **kwargs)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/django/utils/html.py", line 82, in urlize
    if middle.startswith('www.') or ('@' not in middle and not middle.startswith('http://') and \
UnicodeDecodeError: 'ascii' codec can't decode byte 0xaa in position 52: ordinal not in range(128)
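The failure in the session above can be reproduced in isolation, without Django or a German locale. The following is a minimal Python 3 sketch, assuming a stand-in byte string (52 ASCII letters followed by the byte 0xaa) in place of the locale-modified string.letters. Python 2 performed the ASCII decode implicitly when a byte string met a unicode string; here the decode is written out explicitly. The names locale_letters and try_ascii_decode are made up for this example.

```python
import string

# Stand-in for the locale-modified string.letters:
# 52 ASCII letters followed by one Latin-1 byte (0xaa).
locale_letters = string.ascii_letters.encode("ascii") + b"\xaa"


def try_ascii_decode(raw):
    """Return (text, None) on success or (None, error) if raw is not pure ASCII."""
    try:
        return raw.decode("ascii"), None
    except UnicodeDecodeError as exc:
        return None, exc


decoded, error = try_ascii_decode(locale_letters)
# error.start is the offset of the first undecodable byte,
# which is the position reported in the traceback.
```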
comment:2 by , 17 years ago
Needs tests: | set |
---|---|
Triage Stage: | Unreviewed → Ready for checkin |
Andrew - this looks good, but can we get a regression test (like your example above)?
by , 17 years ago
Attachment: | urlize_with_ascii_plus_unittests.patch added |
---|
Updated patch; adds unit tests.
comment:4 by , 17 years ago
Unfortunately, the test case isn't sufficiently portable (for example, on my Ubuntu laptop, it fails with an "invalid locale" error). Rather than worrying too much about lots of different installation situations, I'm just going to commit the core patch. It's correct and I can live without tests for this small change.
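For reference, one way to sidestep the "invalid locale" failure is to skip the test when the locale is not installed. The sketch below is not the test from the attached patch and was not committed; it only checks string.ascii_letters (so it runs without Django), it switches LC_CTYPE rather than LC_ALL so the original setting is simple to restore, and the class name is invented for illustration.

```python
import locale
import string
import unittest


class AsciiLettersLocaleTest(unittest.TestCase):
    """Hypothetical regression test; skipped where de_DE is unavailable."""

    def setUp(self):
        # Remember the current LC_CTYPE setting and restore it afterwards.
        saved = locale.setlocale(locale.LC_CTYPE)
        self.addCleanup(locale.setlocale, locale.LC_CTYPE, saved)

    def test_ascii_letters_ignore_locale(self):
        try:
            locale.setlocale(locale.LC_CTYPE, "de_DE")
        except locale.Error:
            self.skipTest("de_DE locale is not installed")
        # The constant must stay pure ASCII after the locale switch.
        self.assertTrue(all(ord(ch) < 128 for ch in string.ascii_letters))
        self.assertEqual(len(string.ascii_letters), 52)
```

Running it with `python -m unittest` reports the test as skipped, not failed, on machines without the de_DE locale.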
comment:5 by , 17 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
patch to use ascii_letters instead of letters