Opened 7 years ago

Closed 7 years ago

#25623 closed Bug (invalid)

Django always returns a white page along with 400 on latin encoded URLs.

Reported by: Christian Peters
Owned by: nobody
Component: HTTP handling
Version: 1.8
Severity: Normal
Keywords:
Cc:
Triage Stage: Accepted
Has patch: no
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description (last modified by Christian Peters)

Given a URL like /Raumh%F6he served with gunicorn or waitress (and maybe other servers as well), Django raises a UnicodeDecodeError here:

https://github.com/django/django/blob/master/django/core/handlers/wsgi.py#L200

... which results in a blank page being served with a 400 status (sample: https://www.djangoproject.com/Raumh%F6he)

This does not happen with the built-in runserver command because, as far as I understand, it double-encodes the path:

https://github.com/django/django/blob/master/django/core/servers/basehttp.py#L154-L157

Gunicorn, on the other hand, explicitly decodes as latin-1:

https://github.com/benoitc/gunicorn/blob/master/gunicorn/_compat.py#L82

... which leads to the error, because Django explicitly expects UTF-8.
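
A minimal sketch of the round trip described above, using only the standard library. The helper names wsgi_path_info and decode_path_info are hypothetical stand-ins for what a PEP 3333 server and Django's get_path_info roughly do; they are not the actual gunicorn or Django code.

```python
from urllib.parse import unquote_to_bytes


def wsgi_path_info(raw_path: str) -> str:
    """Hypothetical helper: percent-decode the path to bytes, then expose
    those bytes as a latin-1 decoded str, as PEP 3333 servers do."""
    return unquote_to_bytes(raw_path).decode("iso-8859-1")


def decode_path_info(path_info: str) -> str:
    """Hypothetical helper: recover the original bytes from the environ
    value and decode them as UTF-8."""
    return path_info.encode("iso-8859-1").decode("utf-8")


# Latin-1 percent-encoding: the lone byte 0xF6 is not valid UTF-8.
try:
    decode_path_info(wsgi_path_info("/Raumh%F6he"))
    outcome = "decoded"
except UnicodeDecodeError:
    outcome = "UnicodeDecodeError"

# UTF-8 percent-encoding of the same path decodes cleanly.
utf8_path = decode_path_info(wsgi_path_info("/Raumh%C3%B6he"))
```

The latin-1 step on its own never fails; the error only appears once the recovered bytes are interpreted as UTF-8.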

Change History (5)

comment:1 Changed 7 years ago by Christian Peters

Description: modified (diff)

comment:2 Changed 7 years ago by Christian Peters

For what it's worth:

Pyramid raises a 500 (http://www.pylonsproject.org/Raumh%F6he.htm) with the error "URLDecodeError: 'utf8' codec can't decode byte 0xf6 in position 11: invalid start byte" (urldispatch.py, line 86; see https://github.com/Pylons/pyramid/issues/2047).

Flask handles it correctly.

Last edited 7 years ago by Christian Peters (previous) (diff)

comment:3 Changed 7 years ago by Tim Graham

Component: Core (URLs) → HTTP handling
Triage Stage: Unreviewed → Accepted

comment:4 Changed 7 years ago by Dheerendra Rathor

In the part mentioned by OP,

def get_path_info(environ):
    """
    Returns the HTTP request's PATH_INFO as a unicode string.
    """
    path_info = get_bytes_from_wsgi(environ, 'PATH_INFO', '/')

    return path_info.decode(UTF_8)

I changed UTF_8 to ISO_8859_1 in my local Django installation, and it then correctly returned a 404 instead of a 400.
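
One likely reason the swap stops raising is that ISO-8859-1 maps every byte to a code point, so decoding cannot fail. The sketch below illustrates the trade-off with plain byte strings; they are illustrative stand-ins for PATH_INFO, not values captured from Django.

```python
# Illustrative raw path bytes (assumed values, not taken from a real request).
raw_latin1 = b"/Raumh\xf6he"              # client percent-encoded as latin-1
raw_utf8 = "/Raumhöhe".encode("utf-8")   # client percent-encoded as UTF-8

# Decoding as ISO-8859-1 always succeeds, whatever the bytes are.
as_latin1 = raw_latin1.decode("iso-8859-1")

# The same decoding applied to UTF-8 bytes silently yields mojibake
# instead of an error.
mojibake = raw_utf8.decode("iso-8859-1")
```

With this change a UTF-8 encoded path would no longer match URL patterns containing non-ASCII characters, since each multi-byte character is split into separate latin-1 characters.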

comment:5 Changed 7 years ago by Jef Geskens

Resolution: invalid
Status: new → closed

Django handles this correctly. URIs that are percent-encoded as latin1 are not standard; they should be UTF-8 encoded (see https://tools.ietf.org/html/rfc3986).

Because the URL is not encoded in the standard way, Django correctly responds with "400 Bad Request".
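
For reference, a short sketch of how a client could produce the UTF-8 form of the path with Python's standard library. The path is the one from the report; the variable names are only illustrative.

```python
from urllib.parse import quote

path = "/Raumhöhe"

# quote() encodes as UTF-8 by default and leaves "/" unescaped.
utf8_encoded = quote(path)

# Passing encoding="latin-1" reproduces the form from the report.
latin1_encoded = quote(path, encoding="latin-1")
```

Here utf8_encoded is the form Django decodes without error, while latin1_encoded is the form that triggers the 400.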
