Opened 10 years ago
Closed 10 years ago
#25623 closed Bug (invalid)
Django always returns a white page along with 400 on latin encoded URLs.
| Reported by: | Christian Peters | Owned by: | nobody | 
|---|---|---|---|
| Component: | HTTP handling | Version: | 1.8 | 
| Severity: | Normal | Keywords: | |
| Cc: | | Triage Stage: | Accepted |
| Has patch: | no | Needs documentation: | no | 
| Needs tests: | no | Patch needs improvement: | no | 
| Easy pickings: | no | UI/UX: | no | 
Description
Given a URL like /Raumh%F6he served with gunicorn or waitress (and possibly other WSGI servers), Django raises a UnicodeDecodeError here:
https://github.com/django/django/blob/master/django/core/handlers/wsgi.py#L200
... which results in a blank page being served with a 400 status (sample: https://www.djangoproject.com/Raumh%F6he)
This does not happen with the built-in runserver command because, as far as I understand, it double-encodes the query:
https://github.com/django/django/blob/master/django/core/servers/basehttp.py#L154-L157
Gunicorn, on the other hand, explicitly decodes the unquoted bytes as latin-1:
https://github.com/benoitc/gunicorn/blob/master/gunicorn/_compat.py#L82
... which leads to the error, as Django explicitly expects UTF-8.
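The mismatch can be reproduced outside any server. The following is a minimal sketch, assuming Python 3 and only the standard library; the variable names are illustrative, and this is not the code that gunicorn or Django actually runs:

```python
from urllib.parse import unquote_to_bytes

# Percent-decode the path from the report into raw bytes.
raw = unquote_to_bytes("/Raumh%F6he")

# A WSGI server exposes those bytes to the application as a latin-1 string.
wsgi_path_info = raw.decode("iso-8859-1")

# A framework that expects UTF-8 recovers the bytes and decodes them again.
recovered = wsgi_path_info.encode("iso-8859-1")
try:
    recovered.decode("utf-8")
    decode_failed = False
except UnicodeDecodeError:
    # 0xF6 is not a valid UTF-8 start byte, so decoding fails here.
    decode_failed = True
```

The latin-1 round trip is lossless, so the failure comes only from the final UTF-8 decode of the byte 0xF6.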
Change History (5)
comment:1 by , 10 years ago
| Description: | modified (diff) |
|---|---|
comment:3 by , 10 years ago
| Component: | Core (URLs) → HTTP handling | 
|---|---|
| Triage Stage: | Unreviewed → Accepted | 
comment:4 by , 10 years ago
In the code the reporter linked to:
def get_path_info(environ):
    """
    Returns the HTTP request's PATH_INFO as a unicode string.
    """
    path_info = get_bytes_from_wsgi(environ, 'PATH_INFO', '/')
    return path_info.decode(UTF_8)
I changed UTF_8 to ISO_8859_1 in my local Django installation, and it then correctly returned a 404 instead of a 400.
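One caveat with this local change: ISO-8859-1 maps every byte to a character, so decoding never fails, but paths that were percent-encoded from UTF-8 then come out garbled. A small sketch of that trade-off, assuming Python 3; decode_path is a hypothetical helper used for illustration, not Django's get_path_info:

```python
from urllib.parse import unquote_to_bytes


def decode_path(raw: bytes, encoding: str) -> str:
    """Decode raw path bytes with the given encoding (hypothetical helper)."""
    return raw.decode(encoding)


# The same path, percent-encoded from UTF-8 and from latin-1 respectively.
utf8_raw = unquote_to_bytes("/Raumh%C3%B6he")
latin1_raw = unquote_to_bytes("/Raumh%F6he")

as_utf8 = decode_path(utf8_raw, "utf-8")            # the intended path
as_latin1 = decode_path(utf8_raw, "iso-8859-1")     # two wrong characters
legacy = decode_path(latin1_raw, "iso-8859-1")      # the latin-1 URL now works
```

With the ISO-8859-1 setting, the latin-1 URL decodes as intended, while the UTF-8 URL silently turns its two-byte sequence into two unrelated characters.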
comment:5 by , 10 years ago
| Resolution: | → invalid | 
|---|---|
| Status: | new → closed | 
Django handles this correctly. URIs that are percent-encoded as latin1 are not standard; they should be UTF-8 encoded (see https://tools.ietf.org/html/rfc3986).
Because the URL is not encoded in the standard way, Django correctly responds with "400 Bad Request".
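For comparison, here is a short sketch of how the two encodings of the same path differ, assuming Python 3 and the standard library's urllib.parse; the variable names are illustrative only:

```python
from urllib.parse import quote, unquote

path = "/Raumhöhe"

# quote() percent-encodes using UTF-8 by default and leaves "/" untouched.
utf8_url = quote(path)

# Passing another encoding reproduces the URL from the report.
latin1_url = quote(path, encoding="iso-8859-1")

# unquote() decodes as UTF-8 by default, so only the first form round-trips.
round_trip = unquote(utf8_url)
```

Here utf8_url is /Raumh%C3%B6he and latin1_url is /Raumh%F6he, so a client that generates links with the default UTF-8 behavior avoids the 400 response.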
For what it's worth:
Pyramid raises a 500 (http://www.pylonsproject.org/Raumh%F6he.htm) with the error "URLDecodeError: 'utf8' codec can't decode byte 0xf6 in position 11: invalid start byte" (urldispatch.py, line 86); see https://github.com/Pylons/pyramid/issues/2047.
Flask handles it correctly.