Opened 10 years ago

Closed 7 years ago

#2078 closed defect (fixed)

[patch] HttpResponseRedirect should percent-encode non-ASCII characters in Location header

Reported by: Andrey <aela@…>
Owned by: Adrian Holovaty
Component: Core (Other)
Version:
Severity: normal
Keywords:
Cc: aela@…
Triage Stage: Unreviewed
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings:
UI/UX:

Description

RFC 3986 does not allow non-ASCII characters in URIs. Instead they should be converted to bytes according to UTF-8 and then replaced by %XX sequences, where XX is the hexadecimal value of the byte.
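
To illustrate the rule described above, here is a minimal sketch in Python 3. It is not the attached urlescape.py, whose contents are not shown on this page, and the function name percent_encode_non_ascii is hypothetical. It leaves ASCII characters untouched and replaces each UTF-8 byte of a non-ASCII character with a %XX escape.

```python
def percent_encode_non_ascii(url: str) -> str:
    """Replace each non-ASCII character with the %XX escapes of its UTF-8 bytes.

    Hypothetical helper for illustration only; ASCII characters pass through as-is.
    """
    parts = []
    for ch in url:
        if ord(ch) < 128:
            # ASCII: keep the character unchanged.
            parts.append(ch)
        else:
            # Non-ASCII: encode to UTF-8, then emit one %XX escape per byte.
            parts.extend("%{:02X}".format(byte) for byte in ch.encode("utf-8"))
    return "".join(parts)


print(percent_encode_non_ascii("http://some.host/wiki/Проверка/"))
```

Note that this sketch only handles non-ASCII characters; it does not escape ASCII characters that are themselves disallowed in URIs, such as spaces.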

Attachments (2)

urlescape.py (447 bytes) - added by Andrey <aela@…> 10 years ago.
A simple code for percent-encoding of strings
urlencode.diff (1.0 KB) - added by Andrey 10 years ago.
Use urllib.quote for percent-encoding URLs in HttpResponseRedirect and HttpResponsePermanentRedirect


Change History (11)

Changed 10 years ago by Andrey <aela@…>

Attachment: urlescape.py added

A simple code for percent-encoding of strings

comment:1 Changed 10 years ago by Adrian Holovaty

Summary: HttpResponseRedirect should percent-encode non-ASCII characters in Location header → [patch] HttpResponseRedirect should percent-encode non-ASCII characters in Location header

Can we not use one of the utility functions in urllib or urllib2 for this?

comment:2 Changed 10 years ago by Andrey

Yes, seems like urllib.quote is almost what we need. We just have to tell it to leave HTTP reserved characters unquoted, like this: urllib.quote(url, safe="!*'();:@&=+$,/?%#[]")
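
A rough sketch of that call, assuming Python 3, where the function lives at urllib.parse.quote and encodes str input as UTF-8 by default. (On Python 2, the urllib.quote mentioned above works on byte strings, so a unicode URL would have to be encoded to UTF-8 first.) The helper name quote_location is hypothetical and this is not the code from urlencode.diff.

```python
from urllib.parse import quote

# Characters to leave unquoted: the set proposed in the comment above.
# Including "%" means escapes already present in the URL are not escaped again.
SAFE_CHARS = "!*'();:@&=+$,/?%#[]"


def quote_location(url: str) -> str:
    """Percent-encode a redirect URL while keeping reserved characters intact.

    Hypothetical helper for illustration only.
    """
    return quote(url, safe=SAFE_CHARS)


print(quote_location("http://some.host/wiki/Проверка/?a=b&c=%20"))
```

One trade-off of keeping "%" in the safe set: a URL that is already percent-encoded passes through unchanged, but a literal "%" that was meant as data is not escaped either.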

Changed 10 years ago by Andrey

Attachment: urlencode.diff added

Use urllib.quote for percent-encoding URLs in HttpResponseRedirect and HttpResponsePermanentRedirect

comment:3 Changed 10 years ago by anonymous

Cc: aela@… added

comment:4 Changed 10 years ago by Alexander Petrov

I am currently writing a Django-based mini-wiki, and of course I want to have non-Latin characters in article titles (I am a native Russian speaker). Unfortunately, Django still chokes on Cyrillic characters in URLs. For example, if a URL with Cyrillic characters and without a trailing slash is entered in the browser (say, http://some.host/wiki/Проверка), it does not get redirected correctly: you get a garbled URL instead of http://some.host/wiki/Проверка/.

The patch above fixes this issue, so it would be really nice to have it merged. :)

comment:5 Changed 10 years ago by Home

Type: defect

comment:6 Changed 10 years ago by anonymous

Type: defect

comment:7 Changed 10 years ago by Adrian Holovaty

Resolution: fixed
Status: new → closed

(In [3166]) Fixed #2078 -- Improved HttpResponseRedirect and HttpResponsePermanentRedirect to percent-encode non-ASCII characters in the Location header. Thanks, Andrey

comment:8 Changed 7 years ago by shivaraj

Resolution: fixed
Status: closed → reopened

Recently I came across http://www.djangosnippets.org/snippets/1048/ and found that
params = {'v':'1.0', 'q': text.encode('utf-8')}
is the encoding that works when calling translation APIs through urlopen with non-ASCII characters.
Sending the escaped version used here did not work for me when calling the translate API from Django.
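
For context, a sketch of how such a query string is typically built, assuming Python 3's urllib.parse.urlencode. The snippet in the comment above appears to target Python 2, where the value was encoded to a UTF-8 byte string by hand. The parameter names v and q come from that comment; the sample text is made up for illustration.

```python
from urllib.parse import urlencode

# Made-up sample text; "v" and "q" are the parameter names from the comment above.
text = "Проверка"
params = {"v": "1.0", "q": text}

# urlencode percent-encodes str values as UTF-8 by default, so no manual
# .encode('utf-8') is required on Python 3 (bytes values are accepted too).
query = urlencode(params)
print(query)
```

This concerns query-string values sent to an external API, which is a separate matter from quoting the Location header of a redirect.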

comment:9 Changed 7 years ago by dc

Resolution: fixed
Status: reopened → closed

The original bug was fixed three years ago. Please ask on django-users, and open a new ticket with detailed information about your problem if you are sure it is a Django bug.
