﻿id	summary	reporter	owner	description	type	status	component	version	severity	resolution	keywords	cc	stage	has_patch	needs_docs	needs_tests	needs_better_patch	easy	ui_ux
26971	UnicodeDecodeError with non-ASCII string in quoted URL	Oleg Blinov	nobody	"Django raises UnicodeDecodeError if the URL contains percent-encoded bytes that are not valid UTF-8.


[https://github.com/django/django/blob/master/django/core/handlers/wsgi.py#L190]:
{{{
return path_info.decode(UTF_8)
}}}

Decoding fails when a URL parameter is not encoded in UTF-8, for example {{{/tag/%E7%E0%EA%EB%E0%E4%EA%E0/}}}:
{{{
GET /tag/%E7%E0%EA%EB%E0%E4%EA%E0/ => generated 0 bytes in 1 msecs (HTTP/1.1 400) 1 headers in 68 bytes (1 switches on core 0)
Bad Request (UnicodeDecodeError)
Traceback (most recent call last):
  File ""/home/ubuntu/django/lib/python3.4/site-packages/django/core/handlers/wsgi.py"", line 167, in __call__
    request = self.request_class(environ)
  File ""/home/ubuntu/django/lib/python3.4/site-packages/django/core/handlers/wsgi.py"", line 80, in __init__
    path_info = get_path_info(environ)
  File ""/home/ubuntu/django/lib/python3.4/site-packages/django/core/handlers/wsgi.py"", line 197, in get_path_info
    return path_info.decode(UTF_8)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe7 in position 5: invalid continuation byte
}}}
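
The failure can be reproduced outside Django. The following is a minimal sketch, assuming Python 3 and the standard library function {{{urllib.parse.unquote_to_bytes}}}; the variable names are illustrative and do not come from Django's source.

```python
from urllib.parse import unquote_to_bytes

# Percent-decoding yields raw bytes; the URL from the report carries windows-1251 bytes.
raw = unquote_to_bytes('/tag/%E7%E0%EA%EB%E0%E4%EA%E0/')

try:
    raw.decode('utf-8')
    utf8_ok = True
except UnicodeDecodeError:
    # 0xE7 opens a multi-byte UTF-8 sequence, but the next byte (0xE0)
    # is not a valid continuation byte, so strict decoding fails.
    utf8_ok = False

# The same bytes decode cleanly when read as windows-1251.
legacy_path = raw.decode('windows-1251')
```

Here {{{utf8_ok}}} ends up false, which matches the error in the traceback above, while {{{legacy_path}}} holds the readable Cyrillic path.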

With a UTF-8 URL-quoted parameter such as {{{/tag/%D0%B7%D0%B0%D0%BA%D0%BB%D0%B0%D0%B4%D0%BA%D0%B0}}} there are no errors, but the old site used the windows-1251 encoding and I need to support old links, so I use this dirty hack:
{{{#!python
try:
    return path_info.decode(UTF_8)
except UnicodeDecodeError:
    # Fall back to the legacy encoding used by the old site.
    return path_info.decode('windows-1251')
}}}
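
As a standalone illustration of the same fallback idea, here is a hedged sketch. The helper {{{decode_path}}} and its {{{fallback}}} parameter are hypothetical names, not part of Django. It assumes that any path that is not valid UTF-8 comes from the legacy site; a windows-1251 byte sequence that happens to be valid UTF-8 would be misread.

```python
from urllib.parse import unquote_to_bytes


def decode_path(raw_path, fallback='windows-1251'):
    # Hypothetical helper, not part of Django: prefer UTF-8,
    # then fall back to the legacy encoding.
    try:
        return raw_path.decode('utf-8')
    except UnicodeDecodeError:
        # errors='replace' keeps the fallback from raising on bytes
        # that the legacy code page leaves undefined.
        return raw_path.decode(fallback, errors='replace')


# The two URLs from the report: legacy windows-1251 and UTF-8.
old_link = decode_path(unquote_to_bytes('/tag/%E7%E0%EA%EB%E0%E4%EA%E0/'))
new_link = decode_path(unquote_to_bytes('/tag/%D0%B7%D0%B0%D0%BA%D0%BB%D0%B0%D0%B4%D0%BA%D0%B0'))

# Ignoring the trailing slash, both links name the same tag.
same_tag = old_link.rstrip('/') == new_link.rstrip('/')
```

Catching only {{{UnicodeDecodeError}}} instead of using a bare {{{except}}} keeps unrelated errors visible.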

The problem occurs only in the WSGI handler; {{{manage.py runserver}}} handles non-UTF-8 URLs without errors."	Bug	closed	HTTP handling	1.8	Normal	fixed	UnicodeDecodeError UTF-8 windows-1251 URL wsgi	loic84	Ready for checkin	1	0	0	0	0	0
