#19101 closed Bug (fixed)
Non ascii chars in form cause Internal Server Error
Reported by: | kristall | Owned by: | Aymeric Augustin |
---|---|---|---|
Component: | Forms | Version: | dev |
Severity: | Release blocker | Keywords: | encoding |
Cc: | kristall | Triage Stage: | Ready for checkin |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
Using non-ASCII characters in a form causes an error under Python 3.2 (the same code works fine under Python 2.7).
I used "python3 manage.py runserver"
Internal Server Error: /formtest/
Traceback (most recent call last):
  File "/usr/local/lib/python3.2/site-packages/django/core/handlers/base.py", line 110, in get_response
    response = middleware_method(request, callback, callback_args, callback_kwargs)
  File "/usr/local/lib/python3.2/site-packages/django/middleware/csrf.py", line 174, in process_view
    request_csrf_token = request.POST.get('csrfmiddlewaretoken', '')
  File "/usr/local/lib/python3.2/site-packages/django/core/handlers/wsgi.py", line 179, in _get_post
    self._load_post_and_files()
  File "/usr/local/lib/python3.2/site-packages/django/http/__init__.py", line 340, in _load_post_and_files
    self._post, self._files = QueryDict(self.body, encoding=self._encoding), MultiValueDict()
  File "/usr/local/lib/python3.2/site-packages/django/http/__init__.py", line 392, in __init__
    encoding=encoding):
  File "/usr/local/lib/python3.2/urllib/parse.py", line 608, in parse_qsl
    value = _coerce_result(value)
  File "/usr/local/lib/python3.2/urllib/parse.py", line 88, in _encode_result
    return obj.encode(encoding, errors)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)
Traceback (most recent call last):
  File "/usr/local/lib/python3.2/site-packages/django/core/handlers/base.py", line 110, in get_response
    response = middleware_method(request, callback, callback_args, callback_kwargs)
  File "/usr/local/lib/python3.2/site-packages/django/middleware/csrf.py", line 174, in process_view
    request_csrf_token = request.POST.get('csrfmiddlewaretoken', '')
  File "/usr/local/lib/python3.2/site-packages/django/core/handlers/wsgi.py", line 179, in _get_post
    self._load_post_and_files()
  File "/usr/local/lib/python3.2/site-packages/django/http/__init__.py", line 340, in _load_post_and_files
    self._post, self._files = QueryDict(self.body, encoding=self._encoding), MultiValueDict()
  File "/usr/local/lib/python3.2/site-packages/django/http/__init__.py", line 392, in __init__
    encoding=encoding):
  File "/usr/local/lib/python3.2/urllib/parse.py", line 608, in parse_qsl
    value = _coerce_result(value)
  File "/usr/local/lib/python3.2/urllib/parse.py", line 88, in _encode_result
    return obj.encode(encoding, errors)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.2/wsgiref/handlers.py", line 137, in run
    self.result = application(self.environ, self.start_response)
  File "/usr/local/lib/python3.2/site-packages/django/contrib/staticfiles/handlers.py", line 71, in __call__
    return self.application(environ, start_response)
  File "/usr/local/lib/python3.2/site-packages/django/core/handlers/wsgi.py", line 236, in __call__
    response = self.get_response(request)
  File "/usr/local/lib/python3.2/site-packages/django/core/handlers/base.py", line 180, in get_response
    response = self.handle_uncaught_exception(request, resolver, sys.exc_info())
  File "/usr/local/lib/python3.2/site-packages/django/core/handlers/base.py", line 222, in handle_uncaught_exception
    return debug.technical_500_response(request, *exc_info)
  File "/usr/local/lib/python3.2/site-packages/django/views/debug.py", line 69, in technical_500_response
    html = reporter.get_traceback_html()
  File "/usr/local/lib/python3.2/site-packages/django/views/debug.py", line 289, in get_traceback_html
    c = Context(self.get_traceback_data())
  File "/usr/local/lib/python3.2/site-packages/django/views/debug.py", line 247, in get_traceback_data
    frames = self.get_traceback_frames()
  File "/usr/local/lib/python3.2/site-packages/django/views/debug.py", line 398, in get_traceback_frames
    'vars': self.filter.get_traceback_frame_variables(self.request, tb.tb_frame),
  File "/usr/local/lib/python3.2/site-packages/django/views/debug.py", line 197, in get_traceback_frame_variables
    value = self.get_request_repr(value)
  File "/usr/local/lib/python3.2/site-packages/django/views/debug.py", line 105, in get_request_repr
    return build_request_repr(request, POST_override=self.get_post_parameters(request))
  File "/usr/local/lib/python3.2/site-packages/django/views/debug.py", line 154, in get_post_parameters
    return request.POST
  File "/usr/local/lib/python3.2/site-packages/django/core/handlers/wsgi.py", line 179, in _get_post
    self._load_post_and_files()
  File "/usr/local/lib/python3.2/site-packages/django/http/__init__.py", line 340, in _load_post_and_files
    self._post, self._files = QueryDict(self.body, encoding=self._encoding), MultiValueDict()
  File "/usr/local/lib/python3.2/site-packages/django/http/__init__.py", line 392, in __init__
    encoding=encoding):
  File "/usr/local/lib/python3.2/urllib/parse.py", line 608, in parse_qsl
    value = _coerce_result(value)
  File "/usr/local/lib/python3.2/urllib/parse.py", line 88, in _encode_result
    return obj.encode(encoding, errors)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)
Attachments (6)
Change History (19)
comment:1 by , 12 years ago
Description: | modified (diff) |
---|---|
Severity: | Normal → Release blocker |
Triage Stage: | Unreviewed → Accepted |
Confirmed. We missed this because QueryDictTests currently always assumes that the input is a real string in Python 3. That is true for GET requests, but not for a POST request, where the first argument passed to QueryDict is the still-encoded self.body.
comment:2 by , 12 years ago
A possible solution would be to call force_str on self.body when passing it to QueryDict, but not before #5611 has been fixed, because we have to be sure that self.body has the x-www-form-urlencoded content type.
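The idea in this comment can be sketched with the standard library alone. This is a minimal illustration, not Django's code: `parse_form_body` is a hypothetical helper, and it assumes the content type has already been verified as x-www-form-urlencoded, so the raw body contains only ASCII.

```python
from urllib.parse import parse_qsl, urlencode


def parse_form_body(body, encoding="utf-8"):
    """Hypothetical helper: turn a raw urlencoded POST body into (name, value) pairs."""
    if isinstance(body, bytes):
        # Assumption: the body is urlencoded, so non-ASCII characters are
        # percent-escaped and this decode step cannot lose information.
        body = body.decode("ascii")
    # The request charset is applied here, when the percent-escapes are decoded.
    return parse_qsl(body, keep_blank_values=True, encoding=encoding)


# Simulate what a browser would send for a form field containing "café".
raw_body = urlencode({"name": "café"}).encode("ascii")
pairs = parse_form_body(raw_body)
```

Converting to text before parsing means the parser never has to re-encode its results back to bytes, which is the step that fails in the traceback above.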
comment:3 by , 12 years ago
Cc: | added |
---|
comment:4 by , 12 years ago
Has patch: | set |
---|
In the above patch, I also fixed #5076, as this was needed for the test with latin-1 encoding. This could be committed separately, though.
comment:6 by , 12 years ago
Owner: | changed from | to
---|
comment:7 by , 12 years ago
I'd prefer to fork the part that fixes #5076 to that ticket. I left a comment over there.
The force_str call when instantiating the QueryDict looks suspect to me. I suppose self.body contains bytes at this point; wouldn't it make sense to decode them with the charset of the request rather than with utf-8?
comment:8 by , 12 years ago
Now that we know for sure that the content of the request is x-www-form-urlencoded, it should be percent-encoded and composed only of ASCII characters at this point, AFAIK, so decoding it with 'utf-8', 'ascii', or any other ASCII-compatible encoding should not make any difference. The charset specified in the request is then used later, during QueryDict initialization, to decode (in the sense of URL decoding) the query string content.
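A small check of this reasoning, using only the standard library and assuming a form submitted with a latin-1 charset (the field name and value are made up for the example):

```python
from urllib.parse import parse_qsl, urlencode

# A form submitted with a latin-1 charset: "é" is escaped as %E9.
raw_body = urlencode({"name": "é"}, encoding="latin-1").encode("ascii")

# The escaped body is pure ASCII, so these decodings all agree.
as_ascii = raw_body.decode("ascii")
as_utf8 = raw_body.decode("utf-8")
as_latin1 = raw_body.decode("latin-1")

# The request charset matters only now, when %E9 is turned back into a character.
pairs = parse_qsl(as_ascii, encoding="latin-1")
```

The byte-to-text step and the percent-decoding step are separate, and only the second one depends on the request charset.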
by , 12 years ago
Attachment: | 19101-3.diff added |
---|
comment:10 by , 12 years ago
We can also fix this in QueryDict.__init__. It's a bit more consistent with the Python 2 code. It makes QueryDict more resilient.
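As a rough sketch of what handling this inside the constructor could look like: `TinyQueryDict` below is a hypothetical stand-in written for this example, not Django's QueryDict and not the attached patch. It assumes that any bytes input is urlencoded data and therefore ASCII-only.

```python
from urllib.parse import parse_qsl


class TinyQueryDict:
    """Hypothetical stand-in for a query-string dictionary; not Django's class."""

    def __init__(self, query_string="", encoding="utf-8"):
        if isinstance(query_string, bytes):
            # Assumption: the bytes are urlencoded data, hence ASCII-only.
            query_string = query_string.decode("ascii")
        self.encoding = encoding
        self._lists = {}
        pairs = parse_qsl(query_string, keep_blank_values=True, encoding=encoding)
        for key, value in pairs:
            self._lists.setdefault(key, []).append(value)

    def getlist(self, key):
        """Return every value submitted for key, in order."""
        return list(self._lists.get(key, []))

    def get(self, key, default=None):
        """Return the last value submitted for key, or default."""
        values = self._lists.get(key)
        return values[-1] if values else default


# A GET-style text query string and a POST-style bytes body give the same result.
from_get = TinyQueryDict("q=%C3%A9t%C3%A9&q=hiver")
from_post = TinyQueryDict(b"q=%C3%A9t%C3%A9&q=hiver")
```

Accepting both types in one place means callers do not each need to remember to convert the request body before building the dictionary.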
comment:11 by , 12 years ago
This alternate approach looks good to me. I really think that we should get rid of the errors='replace' behaviour, which only defers potential errors to later in the stack, but this could be addressed as a whole in #18004.
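The concern about errors='replace' can be illustrated with the standard library's `parse_qsl`, which accepts the same `errors` argument. This is only an analogy for the behaviour discussed here, under the assumption that a latin-1 escaped value is decoded as UTF-8:

```python
from urllib.parse import parse_qsl

# %E9 is "é" in latin-1 but is not valid UTF-8 on its own.
query = "name=%E9"

# errors="replace" (parse_qsl's default) silently substitutes U+FFFD,
# so the charset mismatch surfaces later, if at all.
replaced = parse_qsl(query, encoding="utf-8", errors="replace")

# errors="strict" reports the mismatch at the point where it happens.
try:
    parse_qsl(query, encoding="utf-8", errors="strict")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True
```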
comment:12 by , 12 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
The used form