Opened 16 years ago

Closed 15 years ago

Last modified 13 years ago

#10267 closed (fixed)

HttpResponse.build_absolute_uri does not encode IRIs properly.

Reported by: liangent
Owned by: aljosa
Component: HTTP handling
Version: 1.0
Severity:
Keywords:
Cc:
Triage Stage: Accepted
Has patch: no
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description

It reports:

UnicodeEncodeError: 'ascii' codec can't encode characters in position 49-51: ordinal not in range(128), HTTP response headers must be in US-ASCII format

I think Chinese characters should be urlencoded. Raising an exception is not so good.

Attachments (1)

10267.diff (1.5 KB) - added by aljosa 16 years ago.
added iri_to_uri (for files mentioned in comment #8)


Change History (16)

comment:1 by Karen Tracey, 16 years ago

Resolution: worksforme
Status: new → closed

You have left rather a lot out of this ticket. What did your code do to generate that error message? What is the full traceback, not just the last bit? There's a wealth of useful information in tracebacks that makes it possible to diagnose problems without having to recreate them -- please do include full tracebacks in tickets such as this. For this case, I just tried an HttpResponsePermanentRedirect using non-ASCII chars and it was automatically percent-encoded, so there seems to be some other problem in your case. As you haven't included anything about what you did to generate the error message, I have no idea what that might be. You might want to post on django-users, including information about the code you are using and the full traceback. So far as I can see from looking at the Django code involved and from my own testing, there is no bug in Django here.

comment:2 by liangent, 16 years ago

Resolution: worksforme
Status: closed → reopened

Django version 1.0.2 final

detail here (just a simplified example to recreate the problem):

create a project:
django-admin.py startproject dj30x

go to dj30x directory, edit urls.py

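# -*- coding: utf-8 -*-
# The encoding declaration above is required by Python 2 for non-ASCII literals.
from django.conf.urls.defaults import *

urlpatterns = patterns('dj30x.views',
    # There is no problem if I don't use Chinese characters here.
    (u'^中文/$', 'redir'),
    (u'^中文/done/$', 'done'),
)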

then create views.py in the same directory

from django.http import HttpResponse, HttpResponseRedirect

def redir(request):
    # The problem still exists if I change HttpResponseRedirect to
    # HttpResponsePermanentRedirect and/or change u'done/' to 'done/'.
    return HttpResponseRedirect(u'done/')

def done(request):
    return HttpResponse(u'done')

Save all files, run manage.py runserver, and open http://127.0.0.1:8000/%E4%B8%AD%E6%96%87/ (I typed in http://127.0.0.1:8000/中文/ and Firefox encoded it automatically). It says:

Traceback (most recent call last):

  File "C:\Python25\Lib\site-packages\django\core\servers\basehttp.py", line 278, in run
    self.result = application(self.environ, self.start_response)

  File "C:\Python25\Lib\site-packages\django\core\servers\basehttp.py", line 635, in __call__
    return self.application(environ, start_response)

  File "C:\Python25\Lib\site-packages\django\core\handlers\wsgi.py", line 244, in __call__
    response = self.apply_response_fixes(request, response)

  File "C:\Python25\Lib\site-packages\django\core\handlers\base.py", line 174, in apply_response_fixes
    response = func(request, response)

  File "C:\Python25\Lib\site-packages\django\http\utils.py", line 20, in fix_location_header
    response['Location'] = request.build_absolute_uri(response['Location'])

  File "C:\Python25\Lib\site-packages\django\http\__init__.py", line 314, in __setitem__
    header, value = self._convert_to_ascii(header, value)

  File "C:\Python25\Lib\site-packages\django\http\__init__.py", line 306, in _convert_to_ascii
    yield value.encode('us-ascii')

UnicodeEncodeError: 'ascii' codec can't encode characters in position 22-23: ordinal not in range(128), HTTP response headers must be in US-ASCII format

note: http://127.0.0.1:8000/%E4%B8%AD%E6%96%87/done/ can be viewed correctly.
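
The last frame of the traceback can be reproduced without Django. What follows is a minimal sketch, written for Python 3 for convenience (the ticket itself concerns Python 2.5), and it assumes the relative target 'done/' has already been joined to the request URL. The names location and encode_header_value are illustrative and are not taken from Django's source.

```python
# Hypothetical stand-in for the absolute URL that results from joining
# the request path '/中文/' with the relative redirect target 'done/'.
location = 'http://127.0.0.1:8000/中文/done/'


def encode_header_value(value):
    """Encode a header value as strict US-ASCII, as the traceback shows."""
    return value.encode('us-ascii')


try:
    encode_header_value(location)
except UnicodeEncodeError as exc:
    # exc.start and exc.end delimit the offending characters
    # (exc.end is exclusive).
    failure = (exc.start, exc.end)
else:
    failure = None
```

Here the two Chinese characters sit at indexes 22 and 23 of the string, which agrees with the "position 22-23" reported in the error message above.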

comment:3 by liangent, 16 years ago

I think the problem is not in HttpResponsePermanentRedirect itself.

The title may be a little unsuitable, but I couldn't find a way to change it.

comment:4 by Malcolm Tredinnick, 16 years ago

Summary: HttpResponsePermanentRedirect cannot accept chinese characters → HttpResponse.build_absolute_uri does not encode IRIs properly.

I'd guess that we should be calling django.utils.encoding.iri_to_uri in build_absolute_uri(). It's the URI constructor's job to encode things correctly and, in this case, that would probably be build_absolute_uri, since we use request.path, which is a unicode object.

As noted by liangent, the problem is not specific to HttpResponsePermanentRedirect. I'll change the title of the ticket.
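
To illustrate the kind of conversion being suggested, here is a rough sketch using only the Python 3 standard library. The helper iri_to_uri_sketch is hypothetical: it approximates the idea behind django.utils.encoding.iri_to_uri (percent-encode the UTF-8 bytes of non-ASCII characters, leave URI delimiters alone), and the exact set of safe characters is an assumption rather than a copy of Django's implementation.

```python
from urllib.parse import quote

# Characters left untouched: URI delimiters plus '%', so that existing
# percent-escapes survive. This set is an assumption for illustration.
SAFE_CHARS = "/#%[]=:;$&()+,!?*@'~"


def iri_to_uri_sketch(iri):
    """Percent-encode the UTF-8 bytes of every non-ASCII character in iri."""
    return quote(iri, safe=SAFE_CHARS)


uri = iri_to_uri_sketch('http://127.0.0.1:8000/中文/done/')

# Once converted, the value is pure ASCII and can be used as a header value.
header_value = uri.encode('us-ascii')
```

With the conversion applied where the absolute URI is built, the Location value becomes http://127.0.0.1:8000/%E4%B8%AD%E6%96%87/done/, which is the same form the browser sends for the original request.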

comment:5 by Jacob, 16 years ago

milestone: 1.1
Triage Stage: Unreviewed → Accepted

comment:6 by Malcolm Tredinnick, 16 years ago

Resolution: fixed
Status: reopened → closed

(In [10539]) Fixed #10267 -- Correctly handle IRIs in HttpResponse.build_absolute_uri().

comment:7 by Malcolm Tredinnick, 16 years ago

(In [10540]) [1.0.X] Fixed #10267 -- Correctly handle IRIs in HttpResponse.build_absolute_uri().

Backport of r10539 from trunk.

comment:8 by Chris Beaven, 16 years ago

Resolution: fixed
Status: closed → reopened

I'd argue that perhaps this should be done in the two HTTP response redirect classes. From a quick browse, I can see two other places where this isn't being done:

  • django.views.generic.simple.redirect_to
  • django.contrib.redirects

Since double-iri-to-uri doesn't break anything, wouldn't it just be better to safeguard it at the http class level?

I'll reopen this, as the original title matched what I'm describing. But feel free to tell me to shove off and open a new ticket or two :P
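
The point that a double conversion is harmless can be checked with a small sketch. It assumes a conversion that treats '%' as a safe character, so existing escapes are not encoded again; iri_to_uri_sketch is a hypothetical stdlib-only helper (Python 3), not Django's own function.

```python
from urllib.parse import quote


def iri_to_uri_sketch(iri):
    # '%' is in the safe set, so already-encoded escapes are left alone.
    return quote(iri, safe="/#%[]=:;$&()+,!?*@'~")


once = iri_to_uri_sketch('/中文/done/')
twice = iri_to_uri_sketch(once)

# For contrast: quoting with the default safe set ('/' only) re-encodes
# each '%' as '%25', so a second pass would corrupt the URI.
naive_twice = quote(once)
```

In this sketch once and twice are identical, while naive_twice contains %25 sequences. That difference is why applying the conversion at more than one layer is only safe when '%' is preserved.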

comment:9 by aljosa, 16 years ago

Owner: changed from nobody to aljosa
Status: reopened → new

by aljosa, 16 years ago

Attachment: 10267.diff added

added iri_to_uri (for files mentioned in comment #8)

comment:10 by Alex Gaynor, 16 years ago

Resolution: fixed
Status: new → closed

Please file a new ticket for this. The original issue here was solved.

comment:11 by Ilya Semenov, 15 years ago

Resolution: fixed
Status: closed → reopened

The ticket was fixed incorrectly: the call to iri_to_uri should be moved from HttpResponseRedirect.__init__ (and the other places where it had been copy-pasted) to django.http.utils.fix_location_header.

Otherwise it still crashes with a unicode URI and something innocent like HttpResponseRedirect('?newparam=1').

URI: '/info/\xd0\x98\xd0\xbd\xd1\x82\xd0\xb5\xd0\xb3\xd1\x80\xd0\xb0\xd1\x86\xd0\xb8\xd1\x8f_CMS/'

Traceback (most recent call last):

...

File "/usr/lib/python2.5/site-packages/django/http/utils.py", line 20, in fix_location_header

responseLocation = request.build_absolute_uri(responseLocation)

File "/usr/lib/python2.5/site-packages/django/http/init.py", line 314, in setitem

header, value = self._convert_to_ascii(header, value)

File "/usr/lib/python2.5/site-packages/django/http/init.py", line 306, in _convert_to_ascii

yield value.encode('us-ascii')

UnicodeEncodeError: 'ascii' codec can't encode characters in position 28-37: ordinal not in range(128), HTTP response headers must be in US-ASCII format

comment:12 by Ilya Semenov, 15 years ago

I apologize for not quoting the traceback in my previous comment.

URI:            '/info/\xd0\x98\xd0\xbd\xd1\x82\xd0\xb5\xd0\xb3\xd1\x80\xd0\xb0\xd1\x86\xd0\xb8\xd1\x8f_CMS/'
PathInfo:       '/\xd0\x98\xd0\xbd\xd1\x82\xd0\xb5\xd0\xb3\xd1\x80\xd0\xb0\xd1\x86\xd0\xb8\xd1\x8f_CMS/'

Traceback (most recent call last):

  ...

  File "/usr/lib/python2.5/site-packages/django/http/utils.py", line 20, in fix_location_header
    response['Location'] = request.build_absolute_uri(response['Location'])

  File "/usr/lib/python2.5/site-packages/django/http/__init__.py", line 314, in __setitem__
    header, value = self._convert_to_ascii(header, value)

  File "/usr/lib/python2.5/site-packages/django/http/__init__.py", line 306, in _convert_to_ascii
    yield value.encode('us-ascii')

UnicodeEncodeError: 'ascii' codec can't encode characters in position 28-37: ordinal not in range(128), HTTP response headers must be in US-ASCII format
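
The reported case can be sketched with the Python 3 standard library. The host name is made up, iri_to_uri_sketch is a hypothetical approximation of the conversion rather than Django's function, and urljoin stands in for whatever joining build_absolute_uri performs. The sketch shows that converting only the redirect target is not enough when the request path itself is non-ASCII.

```python
from urllib.parse import quote, urljoin


def iri_to_uri_sketch(iri):
    # Percent-encode non-ASCII characters; leave delimiters and '%' alone.
    return quote(iri, safe="/#%[]=:;$&()+,!?*@'~")


# Hypothetical host; the path mirrors the URI reported above.
request_url = 'http://example.com/info/Интеграция_CMS/'

# Converting the target alone changes nothing: it is already pure ASCII.
target = iri_to_uri_sketch('?newparam=1')

# Joining pulls the non-ASCII request path into the Location value.
joined = urljoin(request_url, target)
try:
    joined.encode('us-ascii')
    joined_is_ascii = True
except UnicodeEncodeError:
    joined_is_ascii = False

# Converting after the join yields a value that is safe for a header.
fixed = iri_to_uri_sketch(joined)
```

In this sketch joined is not ASCII even though target was, whereas fixed encodes cleanly. This matches the argument for doing the conversion at the point where the absolute Location is assembled.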

comment:13 by Alex Gaynor, 15 years ago

Resolution: fixed
Status: reopened → closed

Once again, the original issue here was fixed. If you believe there is a new issue, please file a new ticket.

comment:14 by Ilya Semenov, 15 years ago

Whatever. Reposted to #11522.

comment:15 by Jacob, 13 years ago

milestone: 1.1

Milestone 1.1 deleted
