Opened 19 years ago

Closed 16 years ago

#2078 closed defect (fixed)

[patch] HttpResponseRedirect should percent-encode non-ASCII characters in Location header

Reported by: Andrey <aela@…>
Owned by: Adrian Holovaty
Component: Core (Other)
Version:
Severity: normal
Keywords:
Cc: aela@…
Triage Stage: Unreviewed
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description

RFC 3986 does not allow non-ASCII characters in URIs. Instead they should be converted to bytes according to UTF-8 and then replaced by %XX sequences, where XX is the hexadecimal value of the byte.
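
The rule described above can be sketched in a few lines of Python 3. This is only an illustration of the UTF-8 then %XX step, not the attached patch; the function name `percent_encode_non_ascii` is hypothetical, and the sketch deliberately leaves every ASCII character untouched.

```python
def percent_encode_non_ascii(url: str) -> str:
    """Replace each non-ASCII character with the %XX form of its UTF-8 bytes.

    Hypothetical helper for illustration; ASCII characters pass through as-is.
    """
    out = []
    for ch in url:
        if ord(ch) < 128:
            # ASCII: keep the character unchanged.
            out.append(ch)
        else:
            # Non-ASCII: encode to UTF-8, then emit one %XX per byte.
            out.extend("%{:02X}".format(b) for b in ch.encode("utf-8"))
    return "".join(out)


if __name__ == "__main__":
    print(percent_encode_non_ascii("/caf\u00e9"))
```

For example, "é" is the two UTF-8 bytes C3 A9, so "/café" becomes "/caf%C3%A9".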

Attachments (2)

urlescape.py (447 bytes) - added by Andrey <aela@…> 19 years ago.
Simple code for percent-encoding strings
urlencode.diff (1.0 KB) - added by Andrey 19 years ago.
Use urllib.quote for percent-encoding URLs in HttpResponseRedirect and HttpResponsePermanentRedirect


Change History (11)

by Andrey <aela@…>, 19 years ago

Attachment: urlescape.py added

Simple code for percent-encoding strings

comment:1 by Adrian Holovaty, 19 years ago

Summary: HttpResponseRedirect should percent-encode non-ASCII characters in Location header → [patch] HttpResponseRedirect should percent-encode non-ASCII characters in Location header

Can we not use one of the utility functions in urllib or urllib2 for this?

comment:2 by Andrey, 19 years ago

Yes, seems like urllib.quote is almost what we need. We just have to tell it to leave HTTP reserved characters unquoted, like this: urllib.quote(url, safe="!*'();:@&=+$,/?%#[]")
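
As a rough sketch of the call suggested above: the comment uses the Python 2 spelling `urllib.quote`; in Python 3 the same function lives at `urllib.parse.quote`, which is what this example assumes. The names `RESERVED_SAFE` and `quote_location` are hypothetical, and this is not the contents of urlencode.diff.

```python
from urllib.parse import quote

# Safe set copied from the comment above: reserved characters (and "%")
# stay unquoted so existing URL structure and escapes are preserved.
RESERVED_SAFE = "!*'();:@&=+$,/?%#[]"


def quote_location(url: str) -> str:
    """Percent-encode a URL for a Location header (hypothetical helper).

    Non-ASCII characters are encoded as UTF-8 and written as %XX sequences;
    characters listed in RESERVED_SAFE are left alone.
    """
    return quote(url, safe=RESERVED_SAFE)


if __name__ == "__main__":
    print(quote_location("http://some.host/wiki/Проверка?a=1&b=2#top"))
```

Keeping "%" in the safe set means an already-escaped URL is not escaped a second time, at the cost of letting a stray literal "%" through unchanged.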

by Andrey, 19 years ago

Attachment: urlencode.diff added

Use urllib.quote for percent-encoding URLs in HttpResponseRedirect and HttpResponsePermanentRedirect

comment:3 by anonymous, 19 years ago

Cc: aela@… added

comment:4 by Alexander Petrov, 19 years ago

I am currently writing a Django-based mini-wiki, and of course I want to have non-Latin characters in article titles (I am a native Russian speaker). Unfortunately, Django still chokes on Cyrillic characters in URLs. For example, if a URL with Cyrillic characters and without a trailing slash is entered in the browser (say, http://some.host/wiki/Проверка), it does not get redirected correctly (you get a mangled URL instead of http://some.host/wiki/Проверка/).

The patch above fixes this issue, so it would be really nice to have it merged. :)

comment:5 by Home, 19 years ago

Type: defect

comment:6 by anonymous, 19 years ago

Type: defect

comment:7 by Adrian Holovaty, 19 years ago

Resolution: fixed
Status: new → closed

(In [3166]) Fixed #2078 -- Improved HttpResponseRedirect and HttpResponsePermanentRedirect to percent-encode non-ASCII characters in the Location header. Thanks, Andrey

comment:8 by shivaraj, 16 years ago

Resolution: fixed
Status: closed → reopened

Recently I came across http://www.djangosnippets.org/snippets/1048/ and found that
params = {'v':'1.0', 'q': text.encode('utf-8')}
is the encoding that works when calling translation APIs with non-ASCII characters through urlopen.
I did not succeed in sending the escaped version used here when calling the translate API from Django.
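
For query parameters like the ones in this comment, a minimal Python 3 sketch might look as follows. It only builds the query string; the translate API itself and its endpoint are not described in this ticket, so no request is made. The function name `build_query` is hypothetical, and `urllib.parse.urlencode` is the Python 3 location of the Python 2 `urllib.urlencode`.

```python
from urllib.parse import urlencode


def build_query(text: str) -> str:
    """Build a query string from the params dict quoted in the comment.

    Hypothetical helper: the value is encoded to UTF-8 bytes first, and
    urlencode then percent-encodes those bytes.
    """
    params = {"v": "1.0", "q": text.encode("utf-8")}
    return urlencode(params)


if __name__ == "__main__":
    print(build_query("привет"))
```

Note that this is form-style encoding of a query string (a space becomes "+"), which is a different job from percent-encoding a whole URL for a Location header.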

comment:9 by dc, 16 years ago

Resolution: fixed
Status: reopened → closed

The original bug was fixed three years ago. Please ask on django-users, and open a new ticket with detailed information about your problem if you are sure it is a Django bug.
