Opened 6 years ago

Closed 5 years ago

Last modified 4 years ago

#11522 closed (fixed)

UnicodeEncodeError on redirect to non-ASCII Location

Reported by: semenov Owned by: nobody
Component: HTTP handling Version: master
Severity: Keywords:
Cc: yoanblanc, oldium.pro@… Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: UI/UX:

Description

If a view for a URL which contains unicode characters returns a HttpResponseRedirect to a non-absolute URL, django crashes.

Example,

def myview(request):
    print request.path # '/info/Интеграция_CMS/'
    return HttpResponseRedirect('?edit=1')

gives the following error message:

URI:            '/info/\xd0\x98\xd0\xbd\xd1\x82\xd0\xb5\xd0\xb3\xd1\x80\xd0\xb0\xd1\x86\xd0\xb8\xd1\x8f_CMS/'

...

Traceback (most recent call last):

  ...

  File "/usr/lib/python2.5/site-packages/django/http/utils.py", line 20, in fix_location_header
    response['Location'] = request.build_absolute_uri(response['Location'])

  File "/usr/lib/python2.5/site-packages/django/http/__init__.py", line 314, in __setitem__
    header, value = self._convert_to_ascii(header, value)

  File "/usr/lib/python2.5/site-packages/django/http/__init__.py", line 306, in _convert_to_ascii
    yield value.encode('us-ascii')

UnicodeEncodeError: 'ascii' codec can't encode characters in position 28-37: ordinal not in range(128), HTTP response headers must be in US-ASCII format

The cause of the problem is that #10267 was incorrectly fixed by [10539], leaving many other cases open. The call to iri_to_uri should be moved from HttpResponseRedirect?.init (and other places where it had been copy-pasted) deeper to django.utils.http.fix_location_header.

Apart of my problem, it was reported in #10267 that django.views.generic.simple.redirect_to and django.contrib.redirects also suffer from the similar encoding issue, which would also be resolved once fix_location_header is fixed appropriately.

For those Google strangers who are interested in a workaround, you can do as follows:

def myview(request):
    print request.path # '/info/Интеграция_CMS/'
    return HttpResponseRedirect('?edit=1') # crashes
    return HttpResponseRedirect(request.path + '?edit=1') # doesn't crash

Attachments (2)

simple_fix.diff (480 bytes) - added by semenov 6 years ago.
added a special rule in HttpResponse.setitem
ticket11522_puny_code.patch (2.5 KB) - added by yoan@… 6 years ago.
Patch to support IRI in redirections.

Download all attachments as: .zip

Change History (29)

comment:1 Changed 6 years ago by kmtracey

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset

#11638 closed as a dup.

comment:2 Changed 6 years ago by BlindHunter

Ok, 11638 was closed.
It's not necessarily relative path. Imagine what would be if you redirect to a file in russian utf-8:

# -*- coding: utf-8 -*-
path = '/var/www/filez/site_media/upload/alecs/отпуск-2009-ОРС.pdf'.decode('utf8')
return HttpResponseRedirect(path)

UnicodeEncodeError at /file/365d13241be1e1a3ac00523a5deddec4f577e01f813ded539bcbac32

('ascii', u'/var/www/filez/site_media/upload/alecs/\u043e\u0442\u043f\u0443\u0441\u043a-2009-\u041e\u0420\u0421.pdf', 39, 45, 'ordinal not in range(128)')

comment:3 Changed 6 years ago by semenov

BlindHunter, would you mind using the wiki formatting - and, what is more important, the "Preview" button? Your text is hardly readable and if I were a Django developer, I would just disregard it.

comment:4 Changed 6 years ago by BlindHunter

It seems to me that you well-formed message is disregarded for more than two weeks ;)
All the peculiarities and moments are important for a real developer. I'll try to use wikiformatting in my future posts.

comment:5 Changed 6 years ago by semenov

Update on this ticket:

1) The proposed workaround does not work anymore in Django 1.1. Use the following:

def myview(request):
    print request.path # '/info/Интеграция/'
    return HttpResponseRedirect('?edit=1') # crashes
    return HttpResponseRedirect(request.path + '?edit=1') # still crashes
    return HttpResponseRedirect(request.build_absolute_uri('?edit=1')) # doesn't crash, but violates DRY principle

2) I disregard my words about "moving the call to iri_to_uri blah-blah-blah..", they don't make sense, as that's what [10539] was actually about.

3) Still, I consider [10539] to be the cause of the problem. I can see two ways to fix the issue:

a) a special case in HttpResponse.setitem to handle 'Location' header properly (attached)
b) practically revert [10539], but instead of calling iri_to_uri in each and every case (as it was before [10539]), call fix_location_header instead. That is worse, as fix_location_header will be called twice then.

Changed 6 years ago by semenov

added a special rule in HttpResponse.setitem

comment:6 Changed 6 years ago by semenov

Perhaps (a) can be improved to apply fix_location_header logic, and then we'll don't need the fix_location_header middleware at all.

comment:7 Changed 6 years ago by semenov

  • Patch needs improvement set

comment:8 Changed 6 years ago by semenov

  • Has patch set

comment:9 Changed 6 years ago by BlindHunter

The problem is in value.encode('us-ascii') - this codec can't encode my url as it contains a file named in russian. A serious bug :(
In order to skip encoding check you can override in HttpResponseRedirect method _convert_to_ascii:

def _convert_to_ascii(self, *values):
        """Converts all values to ascii strings."""
        for value in values:
            if isinstance(value, unicode):
                try:
                    value = value.encode('us-ascii')
                except:
                    pass
            else:
                value = str(value)
            if '\n' in value or '\r' in value:
                raise BadHeaderError("Header values can't contain newlines (got %r)" % (value))
            yield value

comment:10 Changed 6 years ago by Alex

  • Triage Stage changed from Unreviewed to Accepted

Changed 6 years ago by yoan@…

Patch to support IRI in redirections.

comment:11 Changed 6 years ago by yoan@…

A small patch of mine that enables doing redirection using unicode IRI like Mr. Snowman

>>> HttpResponseRedirect(u"http://müller.de/")
Location: http://xn--mller-kva.de/

I also think that URLField should accept such IRI, maybe we can create a new one and call it IRIField.

comment:12 Changed 6 years ago by anonymous

  • Cc yoan@… added

comment:13 Changed 6 years ago by semenov

The problem is in value.encode('us-ascii')

The problem is NOT in value.encode. That is absolutely correct to encode the response headers in ASCII as required by RFC1915. The problem is that the URL in "Location" field is not automatically urlencoded, as anyone would expect.

comment:14 Changed 6 years ago by IanLewis

Unicode in the bug title should read non-ascii. Remember, it's possible to encode urls in other encodings besides Unicode.

comment:15 Changed 6 years ago by semenov

IanLewis, you are mistaken. Unicode is a character set, not an encoding. That doesn't make sense to encode urls in ... Unicode. URLs can be encoded in UTF8, CP1251 or any other encoding which are all mappings from a character set (Unicode) to particular byte strings. (Getting into details, URLs are actually encoded twice -- first from Unicode to byte strings, then from byte strings to lower-ASCII strings using the %XX notation).

This ticket title mentions Unicode URLs and I consider that to be perfectly fine.

comment:16 Changed 6 years ago by kmtracey

  • Summary changed from Crash on redirect to a relative URL if request.path is unicode to UnicodeEncodeError on redirect to non-ASCII Location

comment:17 Changed 6 years ago by kmtracey

Closing #11921 as a dupe. Removed 'relative' from the summary since that was overly specific. Also, put non-ASCII in the summary since the problem is not use of Unicode but rather the presence of non-ASCII characters. A Unicode string with all ASCII chars works fine.

comment:18 Changed 6 years ago by oldium

My question here is whether the translation should be completely transparent, i.e.

return HttpResponseRedirect(u'/áíé/');
return HttpResponseRedirect(u'http://áíé/');

should just work (but I've read on Python pages that there are some problems in automatic conversion), or is it responsibility of the developer, so you have

return HttpResponseRedirect(iri_to_uri(u'/áíé/'));

Anyway, the Django methods should just work, so the following should work I think (at least the first case):

return HttpResponseRedirect(request.get_full_path());
return HttpResponseRedirect(request.path);

comment:19 Changed 6 years ago by oldium

  • Cc oldium.pro@… added

comment:20 Changed 6 years ago by Natim

  • Component changed from Uncategorized to HTTP handling
  • milestone set to 1.2

The ticket11522_puny_code.patch fixed the problem for me.
Is it possible to apply this to the django trunk ?

Thank you.

comment:21 Changed 5 years ago by 235

simple patch worked in my case. looking forward to fix this

comment:22 Changed 5 years ago by etzel

Got this problem too, any fix would be welcome. I'm going with simple_fix.diff for now.

comment:23 Changed 5 years ago by jeremb

Same problem here, it would be welcome to merge the patch with trunk.

comment:24 Changed 5 years ago by anonymous

have this problem to, pls fix it

comment:25 Changed 5 years ago by kmtracey

  • Resolution set to fixed
  • Status changed from new to closed

(In [12660]) [1.1.X] Fixed #11522: Restored ability of http redirect responses to correctly handle redirect locations with non-ASCII chars.

r12659 from trunk.

comment:26 Changed 5 years ago by yoanblanc

  • Cc yoanblanc added; yoan@… removed

Hi,

I'm back from a looong vacation so can only catch this now.

>>> import django
>>> print(django.VERSION)
(1, 2, 3, 'final', 0)
>>> print(django.http.HttpResponseRedirect(u"http://müller.de"))
Content-Type: text/html; charset=utf-8
Location: http://m%C3%BCller.de

Others are doing it right…

>>> import werkzeug
>>> print(werkzeug.__version__)
0.6.2
>>> werkzeug.redirect(u"http://müller.de").headers
Headers([('Content-Type', 'text/html; charset=utf-8'), ('Location', 'http://xn--mller-kva.de')])

too bad.

comment:27 Changed 4 years ago by jacob

  • milestone 1.2 deleted

Milestone 1.2 deleted

Note: See TracTickets for help on using tickets.
Back to Top