#11522 closed (fixed)
UnicodeEncodeError on redirect to non-ASCII Location
Reported by: | Ilya Semenov | Owned by: | nobody |
---|---|---|---|
Component: | HTTP handling | Version: | dev |
Severity: | Keywords: | ||
Cc: | yoanblanc, oldium.pro@… | Triage Stage: | Accepted |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | yes |
Easy pickings: | no | UI/UX: | no |
Description
If a view for a URL which contains unicode characters returns a HttpResponseRedirect to a non-absolute URL, django crashes.
Example,
def myview(request): print request.path # '/info/Интеграция_CMS/' return HttpResponseRedirect('?edit=1')
gives the following error message:
URI: '/info/\xd0\x98\xd0\xbd\xd1\x82\xd0\xb5\xd0\xb3\xd1\x80\xd0\xb0\xd1\x86\xd0\xb8\xd1\x8f_CMS/' ... Traceback (most recent call last): ... File "/usr/lib/python2.5/site-packages/django/http/utils.py", line 20, in fix_location_header response['Location'] = request.build_absolute_uri(response['Location']) File "/usr/lib/python2.5/site-packages/django/http/__init__.py", line 314, in __setitem__ header, value = self._convert_to_ascii(header, value) File "/usr/lib/python2.5/site-packages/django/http/__init__.py", line 306, in _convert_to_ascii yield value.encode('us-ascii') UnicodeEncodeError: 'ascii' codec can't encode characters in position 28-37: ordinal not in range(128), HTTP response headers must be in US-ASCII format
The cause of the problem is that #10267 was incorrectly fixed by [10539], leaving many other cases open. The call to iri_to_uri should be moved from HttpResponseRedirect?.init (and other places where it had been copy-pasted) deeper to django.utils.http.fix_location_header.
Apart of my problem, it was reported in #10267 that django.views.generic.simple.redirect_to and django.contrib.redirects also suffer from the similar encoding issue, which would also be resolved once fix_location_header is fixed appropriately.
For those Google strangers who are interested in a workaround, you can do as follows:
def myview(request): print request.path # '/info/Интеграция_CMS/' return HttpResponseRedirect('?edit=1') # crashes return HttpResponseRedirect(request.path + '?edit=1') # doesn't crash
Attachments (2)
Change History (29)
comment:1 by , 15 years ago
comment:2 by , 15 years ago
Ok, 11638 was closed.
It's not necessarily relative path. Imagine what would be if you redirect to a file in russian utf-8:
# -*- coding: utf-8 -*-
path = '/var/www/filez/site_media/upload/alecs/отпуск-2009-ОРС.pdf'.decode('utf8')
return HttpResponseRedirect(path)
UnicodeEncodeError at /file/365d13241be1e1a3ac00523a5deddec4f577e01f813ded539bcbac32
('ascii', u'/var/www/filez/site_media/upload/alecs/\u043e\u0442\u043f\u0443\u0441\u043a-2009-\u041e\u0420\u0421.pdf', 39, 45, 'ordinal not in range(128)')
comment:3 by , 15 years ago
BlindHunter, would you mind using the wiki formatting - and, what is more important, the "Preview" button? Your text is hardly readable and if I were a Django developer, I would just disregard it.
comment:4 by , 15 years ago
It seems to me that you well-formed message is disregarded for more than two weeks ;)
All the peculiarities and moments are important for a real developer. I'll try to use wikiformatting in my future posts.
comment:5 by , 15 years ago
Update on this ticket:
1) The proposed workaround does not work anymore in Django 1.1. Use the following:
def myview(request): print request.path # '/info/Интеграция/' return HttpResponseRedirect('?edit=1') # crashes return HttpResponseRedirect(request.path + '?edit=1') # still crashes return HttpResponseRedirect(request.build_absolute_uri('?edit=1')) # doesn't crash, but violates DRY principle
2) I disregard my words about "moving the call to iri_to_uri blah-blah-blah..", they don't make sense, as that's what [10539] was actually about.
3) Still, I consider [10539] to be the cause of the problem. I can see two ways to fix the issue:
a) a special case in HttpResponse.setitem to handle 'Location' header properly (attached)
b) practically revert [10539], but instead of calling iri_to_uri in each and every case (as it was before [10539]), call fix_location_header instead. That is worse, as fix_location_header will be called twice then.
comment:6 by , 15 years ago
Perhaps (a) can be improved to apply fix_location_header logic, and then we'll don't need the fix_location_header middleware at all.
comment:7 by , 15 years ago
Patch needs improvement: | set |
---|
comment:8 by , 15 years ago
Has patch: | set |
---|
comment:9 by , 15 years ago
The problem is in value.encode('us-ascii') - this codec can't encode my url as it contains a file named in russian. A serious bug :(
In order to skip encoding check you can override in HttpResponseRedirect method _convert_to_ascii:
def _convert_to_ascii(self, *values): """Converts all values to ascii strings.""" for value in values: if isinstance(value, unicode): try: value = value.encode('us-ascii') except: pass else: value = str(value) if '\n' in value or '\r' in value: raise BadHeaderError("Header values can't contain newlines (got %r)" % (value)) yield value
comment:10 by , 15 years ago
Triage Stage: | Unreviewed → Accepted |
---|
by , 15 years ago
Attachment: | ticket11522_puny_code.patch added |
---|
Patch to support IRI in redirections.
comment:11 by , 15 years ago
A small patch of mine that enables doing redirection using unicode IRI like Mr. Snowman
>>> HttpResponseRedirect(u"http://müller.de/") Location: http://xn--mller-kva.de/
I also think that URLField should accept such IRI, maybe we can create a new one and call it IRIField.
comment:12 by , 15 years ago
Cc: | added |
---|
comment:13 by , 15 years ago
The problem is in value.encode('us-ascii')
The problem is NOT in value.encode. That is absolutely correct to encode the response headers in ASCII as required by RFC1915. The problem is that the URL in "Location" field is not automatically urlencoded, as anyone would expect.
comment:14 by , 15 years ago
Unicode in the bug title should read non-ascii. Remember, it's possible to encode urls in other encodings besides Unicode.
comment:15 by , 15 years ago
IanLewis, you are mistaken. Unicode is a character set, not an encoding. That doesn't make sense to encode urls in ... Unicode. URLs can be encoded in UTF8, CP1251 or any other encoding which are all mappings from a character set (Unicode) to particular byte strings. (Getting into details, URLs are actually encoded twice -- first from Unicode to byte strings, then from byte strings to lower-ASCII strings using the %XX notation).
This ticket title mentions Unicode URLs and I consider that to be perfectly fine.
comment:16 by , 15 years ago
Summary: | Crash on redirect to a relative URL if request.path is unicode → UnicodeEncodeError on redirect to non-ASCII Location |
---|
comment:17 by , 15 years ago
Closing #11921 as a dupe. Removed 'relative' from the summary since that was overly specific. Also, put non-ASCII in the summary since the problem is not use of Unicode but rather the presence of non-ASCII characters. A Unicode string with all ASCII chars works fine.
comment:18 by , 15 years ago
My question here is whether the translation should be completely transparent, i.e.
return HttpResponseRedirect(u'/áíé/'); return HttpResponseRedirect(u'http://áíé/');
should just work (but I've read on Python pages that there are some problems in automatic conversion), or is it responsibility of the developer, so you have
return HttpResponseRedirect(iri_to_uri(u'/áíé/'));
Anyway, the Django methods should just work, so the following should work I think (at least the first case):
return HttpResponseRedirect(request.get_full_path()); return HttpResponseRedirect(request.path);
comment:19 by , 15 years ago
Cc: | added |
---|
comment:20 by , 15 years ago
Component: | Uncategorized → HTTP handling |
---|---|
milestone: | → 1.2 |
The ticket11522_puny_code.patch fixed the problem for me.
Is it possible to apply this to the django trunk ?
Thank you.
comment:22 by , 15 years ago
Got this problem too, any fix would be welcome. I'm going with simple_fix.diff for now.
comment:25 by , 15 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
comment:26 by , 14 years ago
Cc: | added; removed |
---|
Hi,
I'm back from a looong vacation so can only catch this now.
>>> import django >>> print(django.VERSION) (1, 2, 3, 'final', 0) >>> print(django.http.HttpResponseRedirect(u"http://müller.de")) Content-Type: text/html; charset=utf-8 Location: http://m%C3%BCller.de
Others are doing it right…
>>> import werkzeug >>> print(werkzeug.__version__) 0.6.2 >>> werkzeug.redirect(u"http://müller.de").headers Headers([('Content-Type', 'text/html; charset=utf-8'), ('Location', 'http://xn--mller-kva.de')])
too bad.
#11638 closed as a dup.