#19101 closed Bug (fixed)
Non ascii chars in form cause Internal Server Error
| Reported by: | kristall | Owned by: | Aymeric Augustin |
|---|---|---|---|
| Component: | Forms | Version: | dev |
| Severity: | Release blocker | Keywords: | encoding |
| Cc: | kristall | Triage Stage: | Ready for checkin |
| Has patch: | yes | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description (last modified by )
Using non-ASCII characters in a form causes trouble with Python 3.2 (the same code works fine under Python 2.7).
I used "python3 manage.py runserver"
Internal Server Error: /formtest/
Traceback (most recent call last):
File "/usr/local/lib/python3.2/site-packages/django/core/handlers/base.py", line 110, in get_response
response = middleware_method(request, callback, callback_args, callback_kwargs)
File "/usr/local/lib/python3.2/site-packages/django/middleware/csrf.py", line 174, in process_view
request_csrf_token = request.POST.get('csrfmiddlewaretoken', '')
File "/usr/local/lib/python3.2/site-packages/django/core/handlers/wsgi.py", line 179, in _get_post
self._load_post_and_files()
File "/usr/local/lib/python3.2/site-packages/django/http/__init__.py", line 340, in _load_post_and_files
self._post, self._files = QueryDict(self.body, encoding=self._encoding), MultiValueDict()
File "/usr/local/lib/python3.2/site-packages/django/http/__init__.py", line 392, in __init__
encoding=encoding):
File "/usr/local/lib/python3.2/urllib/parse.py", line 608, in parse_qsl
value = _coerce_result(value)
File "/usr/local/lib/python3.2/urllib/parse.py", line 88, in _encode_result
return obj.encode(encoding, errors)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)
Traceback (most recent call last):
File "/usr/local/lib/python3.2/site-packages/django/core/handlers/base.py", line 110, in get_response
response = middleware_method(request, callback, callback_args, callback_kwargs)
File "/usr/local/lib/python3.2/site-packages/django/middleware/csrf.py", line 174, in process_view
request_csrf_token = request.POST.get('csrfmiddlewaretoken', '')
File "/usr/local/lib/python3.2/site-packages/django/core/handlers/wsgi.py", line 179, in _get_post
self._load_post_and_files()
File "/usr/local/lib/python3.2/site-packages/django/http/__init__.py", line 340, in _load_post_and_files
self._post, self._files = QueryDict(self.body, encoding=self._encoding), MultiValueDict()
File "/usr/local/lib/python3.2/site-packages/django/http/__init__.py", line 392, in __init__
encoding=encoding):
File "/usr/local/lib/python3.2/urllib/parse.py", line 608, in parse_qsl
value = _coerce_result(value)
File "/usr/local/lib/python3.2/urllib/parse.py", line 88, in _encode_result
return obj.encode(encoding, errors)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.2/wsgiref/handlers.py", line 137, in run
self.result = application(self.environ, self.start_response)
File "/usr/local/lib/python3.2/site-packages/django/contrib/staticfiles/handlers.py", line 71, in __call__
return self.application(environ, start_response)
File "/usr/local/lib/python3.2/site-packages/django/core/handlers/wsgi.py", line 236, in __call__
response = self.get_response(request)
File "/usr/local/lib/python3.2/site-packages/django/core/handlers/base.py", line 180, in get_response
response = self.handle_uncaught_exception(request, resolver, sys.exc_info())
File "/usr/local/lib/python3.2/site-packages/django/core/handlers/base.py", line 222, in handle_uncaught_exception
return debug.technical_500_response(request, *exc_info)
File "/usr/local/lib/python3.2/site-packages/django/views/debug.py", line 69, in technical_500_response
html = reporter.get_traceback_html()
File "/usr/local/lib/python3.2/site-packages/django/views/debug.py", line 289, in get_traceback_html
c = Context(self.get_traceback_data())
File "/usr/local/lib/python3.2/site-packages/django/views/debug.py", line 247, in get_traceback_data
frames = self.get_traceback_frames()
File "/usr/local/lib/python3.2/site-packages/django/views/debug.py", line 398, in get_traceback_frames
'vars': self.filter.get_traceback_frame_variables(self.request, tb.tb_frame),
File "/usr/local/lib/python3.2/site-packages/django/views/debug.py", line 197, in get_traceback_frame_variables
value = self.get_request_repr(value)
File "/usr/local/lib/python3.2/site-packages/django/views/debug.py", line 105, in get_request_repr
return build_request_repr(request, POST_override=self.get_post_parameters(request))
File "/usr/local/lib/python3.2/site-packages/django/views/debug.py", line 154, in get_post_parameters
return request.POST
File "/usr/local/lib/python3.2/site-packages/django/core/handlers/wsgi.py", line 179, in _get_post
self._load_post_and_files()
File "/usr/local/lib/python3.2/site-packages/django/http/__init__.py", line 340, in _load_post_and_files
self._post, self._files = QueryDict(self.body, encoding=self._encoding), MultiValueDict()
File "/usr/local/lib/python3.2/site-packages/django/http/__init__.py", line 392, in __init__
encoding=encoding):
File "/usr/local/lib/python3.2/urllib/parse.py", line 608, in parse_qsl
value = _coerce_result(value)
File "/usr/local/lib/python3.2/urllib/parse.py", line 88, in _encode_result
return obj.encode(encoding, errors)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)
Attachments (6)
Change History (19)
by , 13 years ago
comment:1 by , 13 years ago
| Description: | modified (diff) |
|---|---|
| Severity: | Normal → Release blocker |
| Triage Stage: | Unreviewed → Accepted |
Confirmed. We missed this because QueryDictTests currently always assumes that the input is a real string (str) in Python 3. That is true for GET requests, but not for POST requests, where the first argument passed to QueryDict is the still-encoded self.body.
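As a rough illustration of the type difference described here (standard library only, not Django's test suite): `urllib.parse.parse_qsl` returns results of the same type as its input. With `bytes` input the results are `bytes` too, and in the Python 3.2 traceback above the failure happens at that re-encoding step (`_encode_result`). The query strings below are made up and kept ASCII-only, since the behaviour for non-ASCII values may differ between Python versions.

```python
from urllib.parse import parse_qsl

# A GET query string reaches the parser as str (hypothetical sample data).
get_pairs = parse_qsl("name=abc&lang=en")

# A POST body is still bytes when it is read from the request.
post_pairs = parse_qsl(b"name=abc&lang=en")

print(get_pairs)   # pairs of str
print(post_pairs)  # pairs of bytes
```

A test suite that only feeds `str` to the parser never exercises the second path, which is the gap this comment points at.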
comment:2 by , 13 years ago
A possible solution would be to call force_str on self.body when passing it to QueryDict, but not before #5611 has been fixed, because we have to be sure that self.body has the x-www-form-urlencoded content type.
comment:3 by , 13 years ago
| Cc: | added |
|---|
comment:4 by , 13 years ago
| Has patch: | set |
|---|
In the above patch, I also fixed #5076, as this was needed for the test with latin-1 encoding. This could be committed separately, though.
comment:6 by , 13 years ago
| Owner: | changed from to |
|---|
comment:7 by , 13 years ago
I'd prefer to fork the part that fixes #5076 to that ticket. I left a comment over there.
The force_str when instantiating the QueryDict looks suspect to me. I suppose self.body contains bytes at this point; wouldn't it make sense to decode them with the charset of the request rather than with utf-8?
comment:8 by , 13 years ago
Now that we know for sure that the content of the request is x-www-form-urlencoded, it should be encoded and composed of only ASCII chars at this point, AFAIK, so decoding it with 'utf-8' or 'ascii' or even any encoding should not make any difference. The charset specified in the request is then used later in QueryDict initialization to decode (in the sense of url decoding) the query string content.
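A minimal sketch of this argument, using only the standard library and a made-up form field: a urlencoded body consists of ASCII bytes, so turning it into text gives the same result with several common codecs, and the request charset only matters in the later percent-decoding step. The latin-1 charset and the field name are assumptions chosen for illustration.

```python
from urllib.parse import parse_qsl, urlencode

# Build a body the way a client submitting a latin-1 form might (hypothetical data).
body = urlencode({"name": "caf\u00e9"}, encoding="latin-1").encode("ascii")
print(body)

# Step 1: bytes -> text. The body is ASCII-only, so these codecs agree.
candidates = {body.decode(codec) for codec in ("ascii", "utf-8", "latin-1")}
text = candidates.pop()

# Step 2: percent-decoding, where the request charset is actually needed.
pairs = parse_qsl(text, encoding="latin-1")
print(pairs)
```

Only codecs that are ASCII-compatible are tried here; the sketch makes no claim about other encodings.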
by , 13 years ago
| Attachment: | 19101-3.diff added |
|---|
comment:10 by , 13 years ago
We can also fix this in QueryDict.__init__. It's a bit more consistent with the Python 2 code. It makes QueryDict more resilient.
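The idea of handling this inside the parser can be sketched roughly as follows. `parse_query` is a hypothetical stand-in written against the standard library, not the code from the attached patch or from QueryDict itself: it accepts either `str` or `bytes` and normalises to `str` before parsing.

```python
from urllib.parse import parse_qsl


def parse_query(query_string, encoding="utf-8"):
    """Hypothetical helper: accept str or bytes, always return str pairs."""
    if isinstance(query_string, bytes):
        # URL-encoded data is a subset of ASCII, so this decode is lossless.
        query_string = query_string.decode("ascii")
    return parse_qsl(query_string, keep_blank_values=True, encoding=encoding)


# The same data arriving as a GET query string (str) and as a POST body (bytes).
from_get = parse_query("name=%C3%A9t%C3%A9")
from_post = parse_query(b"name=%C3%A9t%C3%A9")
print(from_get == from_post)
```

Putting the check in one place means callers do not need to know which kind of input they hold, which is the resilience argument made above.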
comment:11 by , 13 years ago
This alternate approach looks good to me. I really think that we should get rid of the errors='replace' behaviour, which only defers potential errors to later in the stack, but this could be addressed as a whole in #18004.
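To illustrate the concern about errors='replace', here is an analogous case with the standard library's `parse_qsl`, which also defaults to `errors='replace'`. This is not Django's code path, and the query string is made up: the value is latin-1 data that is wrongly decoded as UTF-8.

```python
from urllib.parse import parse_qsl

# %E9 is "é" in latin-1 but is not valid UTF-8 on its own.
query = "name=%E9"

# Lenient decoding succeeds and silently substitutes U+FFFD.
lenient = parse_qsl(query, encoding="utf-8", errors="replace")
print(lenient)

# Strict decoding reports the problem at the point where it occurs.
try:
    parse_qsl(query, encoding="utf-8", errors="strict")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True
print(strict_failed)
```

With the lenient setting the bad value travels further into the application before anyone notices, which is the deferral described in this comment.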
comment:12 by , 13 years ago
| Resolution: | → fixed |
|---|---|
| Status: | new → closed |
The used form