Opened 13 years ago
Closed 12 years ago
#20356 closed Bug (fixed)
CommonMiddleware UnicodeDecodeError
| Reported by: | srusskih | Owned by: | nobody |
|---|---|---|---|
| Component: | HTTP handling | Version: | 1.3 |
| Severity: | Normal | Keywords: | middleware unicodedecodeerror |
| Cc: | | Triage Stage: | Ready for checkin |
| Has patch: | yes | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description
Received the following error report by email:
Traceback (most recent call last):
  File "/srv/mydomain/mydomain/django/core/handlers/base.py", line 178, in get_response
    response = middleware_method(request, response)
  File "/srv/mydomain/mydomain/django/middleware/common.py", line 107, in process_response
    % (referer, request.get_full_path(), ua, ip),
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd0 in position 35: ordinal not in range(128)
<WSGIRequest
GET:<QueryDict: {}>,
POST:<QueryDict: {}>,
COOKIES:{},
META:{'CSRF_COOKIE': 'db9ed773e630c8234d67e01c3df53ac5',
'DOCUMENT_ROOT': '/htdocs',
'GATEWAY_INTERFACE': 'CGI/1.1',
'HTTP_ACCEPT': '*/*, text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
'HTTP_ACCEPT_CHARSET': 'windows-1251,utf-8;q=0.7,*;q=0.7',
'HTTP_ACCEPT_ENCODING': 'identity',
'HTTP_ACCEPT_LANGUAGE': 'ru-ru,ru;q=0.8,en-us;q=0.5,en;q=0.3',
'HTTP_CONNECTION': 'close',
'HTTP_HOST': 'mydomain.com',
'HTTP_REFERER': 'http://mydomain.com/c/\xd0\xbb\xd0\xb8',
'HTTP_USER_AGENT': 'Mozilla/5.0 (Windows; U; Windows NT 6.1; rv:11.0) Gecko/20100101 Firefox/11.0',
'HTTP_X_FORWARDED_FOR': '79.148.104.71',
'HTTP_X_FORWARDED_SCHEME': 'http',
'HTTP_X_FORWARDED_SECURE': 'https',
'HTTP_X_REAL_IP': '79.148.104.71',
'PATH': '/usr/local/bin:/usr/bin:/bin',
'PATH_INFO': u'/c/\u043b\u0438/',
'PATH_TRANSLATED': '/srv/mydomain/conf/run.wsgi/c/\xd0\xbb\xd0\xb8/',
'QUERY_STRING': '',
'REMOTE_ADDR': '127.0.0.1',
'REMOTE_PORT': '59661',
'REQUEST_METHOD': 'GET',
'REQUEST_URI': '/c/\xd0\xbb\xd0\xb8/',
'SCRIPT_FILENAME': '/srv/mydomain/conf/run.wsgi',
'SCRIPT_NAME': u'',
'SERVER_ADDR': '127.0.0.1',
'SERVER_ADMIN': '[no address given]',
'SERVER_NAME': 'mydomain.com',
'SERVER_PORT': '80',
'SERVER_PROTOCOL': 'HTTP/1.0',
'SERVER_SIGNATURE': '<address>Apache/2.2.14 (Ubuntu) Server at mydomain.com Port 80</address>\n',
'SERVER_SOFTWARE': 'Apache/2.2.14 (Ubuntu)',
'mod_wsgi.application_group': 'mydomain.com|',
'mod_wsgi.callable_object': 'application',
'mod_wsgi.listener_host': '',
'mod_wsgi.listener_port': '8080',
'mod_wsgi.process_group': 'mydomain',
'mod_wsgi.reload_mechanism': '1',
'mod_wsgi.script_reloading': '1',
'mod_wsgi.version': (2, 8),
'wsgi.errors': <mod_wsgi.Log object at 0xacf04920>,
'wsgi.file_wrapper': <built-in method file_wrapper of mod_wsgi.Adapter object at 0xaf4a31d0>,
'wsgi.input': <mod_wsgi.Input object at 0xba09b9d0>,
'wsgi.multiprocess': False,
'wsgi.multithread': True,
'wsgi.run_once': False,
'wsgi.url_scheme': 'http',
'wsgi.version': (1, 0)}>
Testcase to reproduce:
import mock
from django.test import RequestFactory, TestCase
from django.http import HttpResponse
from django.middleware.common import CommonMiddleware

class TestDjangoMiddlewareUnicodeError(TestCase):
    @mock.patch('django.middleware.common.settings')
    def test_unicodedecode_error_for_unicode_characters_in_path(self, settings):
        """
        https://github.com/django/django/blob/1.3.7/django/middleware/common.py#L107
        """
        settings.DEBUG = False
        settings.SEND_BROKEN_LINK_EMAILS = True
        request = RequestFactory().get(u'/c/\u043b\u0438/')
        request.META['HTTP_REFERER'] = 'http://testserver/c/\xd0\xbb\xd0\xb8/'
        response = HttpResponse(status=404)
        CommonMiddleware().process_response(request, response)
Attachments (1)
Change History (4)
comment:1 by , 12 years ago
| Component: | Uncategorized → HTTP handling |
|---|---|
| Has patch: | set |
| Triage Stage: | Unreviewed → Accepted |
by , 12 years ago
| Attachment: | 20356-1.diff added |
|---|---|
comment:2 by , 12 years ago
| Triage Stage: | Accepted → Ready for checkin |
|---|---|
comment:3 by , 12 years ago
| Resolution: | → fixed |
|---|---|
| Status: | new → closed |
Technically, URLs may use any encoding, even though modern RFCs require UTF-8.
If a site whose URLs are in latin-1 links to a Django site, this problem will occur when attempting to decode the referring URL as UTF-8.
There are several possible workarounds at this point: decoding with errors=replace sounds all right, and displaying the raw value would work too.
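As a rough illustration of the errors=replace workaround, here is a minimal sketch. It is written for Python 3 so it can be run today, and decode_header_for_display is a hypothetical helper name, not the code from the attached patch and not a Django API. It assumes the header value may arrive as raw bytes in an unknown encoding.

```python
def decode_header_for_display(raw, encoding="utf-8"):
    """Return text for a header value without raising on undecodable bytes.

    Hypothetical helper for illustration only; not part of Django.
    """
    if isinstance(raw, bytes):
        # Bytes that are not valid in `encoding` become U+FFFD
        # instead of raising UnicodeDecodeError.
        return raw.decode(encoding, errors="replace")
    # Already text: nothing to decode.
    return raw


# Referer from the report above: valid UTF-8 bytes.
utf8_referer = b"http://mydomain.com/c/\xd0\xbb\xd0\xb8"

# Assumed example of a referer sent by a site that uses latin-1 URLs.
latin1_referer = "http://example.com/caf\u00e9".encode("latin-1")

# ascii() escapes non-ASCII characters so the output is safe on any terminal.
print(ascii(decode_header_for_display(utf8_referer)))
print(ascii(decode_header_for_display(latin1_referer)))

# The other workaround mentioned above: show the raw value untouched.
print(repr(latin1_referer))
```

With this approach a latin-1 referer loses the characters that cannot be decoded, but the broken-link email is still sent. Showing the raw value with repr() keeps every byte at the cost of readability.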