Opened 15 years ago

Closed 14 years ago

Last modified 12 years ago

#11522 closed (fixed)

UnicodeEncodeError on redirect to non-ASCII Location

Reported by: Ilya Semenov Owned by: nobody
Component: HTTP handling Version: dev
Severity: Keywords:
Cc: yoanblanc, oldium.pro@… Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description

If a view for a URL which contains unicode characters returns a HttpResponseRedirect to a non-absolute URL, django crashes.

Example,

def myview(request):
    print request.path # '/info/Интеграция_CMS/'
    return HttpResponseRedirect('?edit=1')

gives the following error message:

URI:            '/info/\xd0\x98\xd0\xbd\xd1\x82\xd0\xb5\xd0\xb3\xd1\x80\xd0\xb0\xd1\x86\xd0\xb8\xd1\x8f_CMS/'

...

Traceback (most recent call last):

  ...

  File "/usr/lib/python2.5/site-packages/django/http/utils.py", line 20, in fix_location_header
    response['Location'] = request.build_absolute_uri(response['Location'])

  File "/usr/lib/python2.5/site-packages/django/http/__init__.py", line 314, in __setitem__
    header, value = self._convert_to_ascii(header, value)

  File "/usr/lib/python2.5/site-packages/django/http/__init__.py", line 306, in _convert_to_ascii
    yield value.encode('us-ascii')

UnicodeEncodeError: 'ascii' codec can't encode characters in position 28-37: ordinal not in range(128), HTTP response headers must be in US-ASCII format

The cause of the problem is that #10267 was incorrectly fixed by [10539], leaving many other cases open. The call to iri_to_uri should be moved from HttpResponseRedirect?.init (and other places where it had been copy-pasted) deeper to django.utils.http.fix_location_header.

Apart of my problem, it was reported in #10267 that django.views.generic.simple.redirect_to and django.contrib.redirects also suffer from the similar encoding issue, which would also be resolved once fix_location_header is fixed appropriately.

For those Google strangers who are interested in a workaround, you can do as follows:

def myview(request):
    print request.path # '/info/Интеграция_CMS/'
    return HttpResponseRedirect('?edit=1') # crashes
    return HttpResponseRedirect(request.path + '?edit=1') # doesn't crash

Attachments (2)

simple_fix.diff (480 bytes ) - added by Ilya Semenov 15 years ago.
added a special rule in HttpResponse.setitem
ticket11522_puny_code.patch (2.5 KB ) - added by yoan@… 15 years ago.
Patch to support IRI in redirections.

Download all attachments as: .zip

Change History (29)

comment:1 by Karen Tracey, 15 years ago

#11638 closed as a dup.

comment:2 by Alexander, 15 years ago

Ok, 11638 was closed.
It's not necessarily relative path. Imagine what would be if you redirect to a file in russian utf-8:

# -*- coding: utf-8 -*-
path = '/var/www/filez/site_media/upload/alecs/отпуск-2009-ОРС.pdf'.decode('utf8')
return HttpResponseRedirect(path)

UnicodeEncodeError at /file/365d13241be1e1a3ac00523a5deddec4f577e01f813ded539bcbac32

('ascii', u'/var/www/filez/site_media/upload/alecs/\u043e\u0442\u043f\u0443\u0441\u043a-2009-\u041e\u0420\u0421.pdf', 39, 45, 'ordinal not in range(128)')

comment:3 by Ilya Semenov, 15 years ago

BlindHunter, would you mind using the wiki formatting - and, what is more important, the "Preview" button? Your text is hardly readable and if I were a Django developer, I would just disregard it.

comment:4 by Alexander, 15 years ago

It seems to me that you well-formed message is disregarded for more than two weeks ;)
All the peculiarities and moments are important for a real developer. I'll try to use wikiformatting in my future posts.

comment:5 by Ilya Semenov, 15 years ago

Update on this ticket:

1) The proposed workaround does not work anymore in Django 1.1. Use the following:

def myview(request):
    print request.path # '/info/Интеграция/'
    return HttpResponseRedirect('?edit=1') # crashes
    return HttpResponseRedirect(request.path + '?edit=1') # still crashes
    return HttpResponseRedirect(request.build_absolute_uri('?edit=1')) # doesn't crash, but violates DRY principle

2) I disregard my words about "moving the call to iri_to_uri blah-blah-blah..", they don't make sense, as that's what [10539] was actually about.

3) Still, I consider [10539] to be the cause of the problem. I can see two ways to fix the issue:

a) a special case in HttpResponse.setitem to handle 'Location' header properly (attached)
b) practically revert [10539], but instead of calling iri_to_uri in each and every case (as it was before [10539]), call fix_location_header instead. That is worse, as fix_location_header will be called twice then.

by Ilya Semenov, 15 years ago

Attachment: simple_fix.diff added

added a special rule in HttpResponse.setitem

comment:6 by Ilya Semenov, 15 years ago

Perhaps (a) can be improved to apply fix_location_header logic, and then we'll don't need the fix_location_header middleware at all.

comment:7 by Ilya Semenov, 15 years ago

Patch needs improvement: set

comment:8 by Ilya Semenov, 15 years ago

Has patch: set

comment:9 by Alexander, 15 years ago

The problem is in value.encode('us-ascii') - this codec can't encode my url as it contains a file named in russian. A serious bug :(
In order to skip encoding check you can override in HttpResponseRedirect method _convert_to_ascii:

def _convert_to_ascii(self, *values):
        """Converts all values to ascii strings."""
        for value in values:
            if isinstance(value, unicode):
                try:
                    value = value.encode('us-ascii')
                except:
                    pass
            else:
                value = str(value)
            if '\n' in value or '\r' in value:
                raise BadHeaderError("Header values can't contain newlines (got %r)" % (value))
            yield value

comment:10 by Alex Gaynor, 15 years ago

Triage Stage: UnreviewedAccepted

by yoan@…, 15 years ago

Attachment: ticket11522_puny_code.patch added

Patch to support IRI in redirections.

comment:11 by yoan@…, 15 years ago

A small patch of mine that enables doing redirection using unicode IRI like Mr. Snowman

>>> HttpResponseRedirect(u"http://müller.de/")
Location: http://xn--mller-kva.de/

I also think that URLField should accept such IRI, maybe we can create a new one and call it IRIField.

comment:12 by anonymous, 15 years ago

Cc: yoan@… added

comment:13 by Ilya Semenov, 15 years ago

The problem is in value.encode('us-ascii')

The problem is NOT in value.encode. That is absolutely correct to encode the response headers in ASCII as required by RFC1915. The problem is that the URL in "Location" field is not automatically urlencoded, as anyone would expect.

comment:14 by Ian Lewis, 14 years ago

Unicode in the bug title should read non-ascii. Remember, it's possible to encode urls in other encodings besides Unicode.

comment:15 by Ilya Semenov, 14 years ago

IanLewis, you are mistaken. Unicode is a character set, not an encoding. That doesn't make sense to encode urls in ... Unicode. URLs can be encoded in UTF8, CP1251 or any other encoding which are all mappings from a character set (Unicode) to particular byte strings. (Getting into details, URLs are actually encoded twice -- first from Unicode to byte strings, then from byte strings to lower-ASCII strings using the %XX notation).

This ticket title mentions Unicode URLs and I consider that to be perfectly fine.

comment:16 by Karen Tracey, 14 years ago

Summary: Crash on redirect to a relative URL if request.path is unicodeUnicodeEncodeError on redirect to non-ASCII Location

comment:17 by Karen Tracey, 14 years ago

Closing #11921 as a dupe. Removed 'relative' from the summary since that was overly specific. Also, put non-ASCII in the summary since the problem is not use of Unicode but rather the presence of non-ASCII characters. A Unicode string with all ASCII chars works fine.

comment:18 by Oldřich Jedlička, 14 years ago

My question here is whether the translation should be completely transparent, i.e.

return HttpResponseRedirect(u'/áíé/');
return HttpResponseRedirect(u'http://áíé/');

should just work (but I've read on Python pages that there are some problems in automatic conversion), or is it responsibility of the developer, so you have

return HttpResponseRedirect(iri_to_uri(u'/áíé/'));

Anyway, the Django methods should just work, so the following should work I think (at least the first case):

return HttpResponseRedirect(request.get_full_path());
return HttpResponseRedirect(request.path);

comment:19 by Oldřich Jedlička, 14 years ago

Cc: oldium.pro@… added

comment:20 by Rémy Hubscher, 14 years ago

Component: UncategorizedHTTP handling
milestone: 1.2

The ticket11522_puny_code.patch fixed the problem for me.
Is it possible to apply this to the django trunk ?

Thank you.

comment:21 by 235, 14 years ago

simple patch worked in my case. looking forward to fix this

comment:22 by Nikita Nemkin, 14 years ago

Got this problem too, any fix would be welcome. I'm going with simple_fix.diff for now.

comment:23 by jeremb, 14 years ago

Same problem here, it would be welcome to merge the patch with trunk.

comment:24 by anonymous, 14 years ago

have this problem to, pls fix it

comment:25 by Karen Tracey, 14 years ago

Resolution: fixed
Status: newclosed

(In [12660]) [1.1.X] Fixed #11522: Restored ability of http redirect responses to correctly handle redirect locations with non-ASCII chars.

r12659 from trunk.

comment:26 by yoanblanc, 13 years ago

Cc: yoanblanc added; yoan@… removed

Hi,

I'm back from a looong vacation so can only catch this now.

>>> import django
>>> print(django.VERSION)
(1, 2, 3, 'final', 0)
>>> print(django.http.HttpResponseRedirect(u"http://müller.de"))
Content-Type: text/html; charset=utf-8
Location: http://m%C3%BCller.de

Others are doing it right…

>>> import werkzeug
>>> print(werkzeug.__version__)
0.6.2
>>> werkzeug.redirect(u"http://müller.de").headers
Headers([('Content-Type', 'text/html; charset=utf-8'), ('Location', 'http://xn--mller-kva.de')])

too bad.

comment:27 by Jacob, 12 years ago

milestone: 1.2

Milestone 1.2 deleted

Note: See TracTickets for help on using tickets.
Back to Top