#14235 closed (fixed)
UnicodeDecodeError in CSRF middleware
Reported by: | Jasper Bryant-Greene | Owned by: | Luke Plant |
---|---|---|---|
Component: | Forms | Version: | dev |
Severity: | Keywords: | csrf unicode | |
Cc: | pswartz@… | Triage Stage: | Accepted |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
Just updated to SVN head to fix the CSRF security issue and am now getting:
mod_wsgi (pid=15736): Exception occurred processing WSGI script '/home/.../django.wsgi'. Traceback (most recent call last): File "/usr/lib/python2.5/site-packages/django/core/handlers/wsgi.py", line 245, in __call__ response = middleware_method(request, response) File "/usr/lib/python2.5/site-packages/django/middleware/csrf.py", line 256, in process_response response.content, n = _POST_FORM_RE.subn(add_csrf_field, response.content) UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 953: ordinal not in range(128)
Attachments (3)
Change History (18)
comment:1 by , 14 years ago
Has patch: | set |
---|---|
Keywords: | csrf unicode added |
comment:2 by , 14 years ago
I've also been bitten by this bugs - as well as all non-english language users would be after updating to 1.2.2. A more fine-grained patch that only works around changes from the 1.2.2 security patch:
Index: middleware/csrf.py =================================================================== --- middleware/csrf.py (revision 13700) +++ middleware/csrf.py (working copy) @@ -249,7 +249,7 @@ """Returns the matched <form> tag plus the added <input> element""" return mark_safe(match.group() + "<div style='display:none;'>" + \ "<input type='hidden' " + idattributes.next() + \ - " name='csrfmiddlewaretoken' value='" + escape(csrf_token) + \ + " name='csrfmiddlewaretoken' value='" + escape(csrf_token).encode('utf-8') + \ "' /></div>")
comment:3 by , 14 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
OK, we've been bitten again by Python's unicode handling:
For the typical case, csrf_token is entirely ASCII, but escape(csrf_token) returns a unicode object (that contains entirely ASCII chars, but it doesn't matter). The result is that the entire expression is silently converted to a unicode object. HttpResponse.content is a property that returns an encoded str, not a unicode object, so the regex substitution method fails as above.
As for the fix, we probably need to be aware of the encoding of the response, which might not be UTF-8. We could use response._charset to find out, but people might not be using Django's HttpResponse object. An alternative is to use encode('ascii', 'ignore')
, which should work fine with all the CSRF tokens that Django produces.
Note, I think this is only a problem for people using CsrfResponseMiddleware, which is deprecated. But a fix is still needed urgently.
comment:4 by , 14 years ago
Triage Stage: | Unreviewed → Accepted |
---|
Perhaps try using response._charset
for the encoding and fall back to ascii, ignore only if it does not exist?
comment:5 by , 14 years ago
I thin kit is more robust and more easily tested if we deal with both the XSS issue addressed in [36898] and the unicode issue here if we just sanitise the input cookie to be alphanumeric. This also means that that outgoing cookie and token will match. Patch attached shortly.
comment:6 by , 14 years ago
I thought of an advantage of doing it the way my last patch does it: if there is some client side code that is building a form dynamically, and includes a CSRF hidden token by using the CSRF cookie value, they might still be vulnerable to the XSS attack addressed in [13698]. This way, the cookie value is sanitised and they would not be vulnerable. (This is probably an unlikely attack, though).
comment:7 by , 14 years ago
Cc: | added |
---|
follow-up: 9 comment:8 by , 14 years ago
I assume the escape still in the csrftoken tag implementation (http://code.djangoproject.com/browser/django/trunk/django/template/defaulttags.py#L46) is still needed?
There's now an unneeded import of escape in the csrf tests.py file.
For consistency with American spellings uses elsewhere throughout the code it should spelled _sanitize_token
....
comment:9 by , 14 years ago
Replying to kmtracey:
I assume the escape still in the csrftoken tag implementation (http://code.djangoproject.com/browser/django/trunk/django/template/defaulttags.py#L46) is still needed?
Actually, no, although it doesn't harm having it there. For consistency I guess it should be removed.
There's now an unneeded import of escape in the csrf tests.py file.
For consistency with American spellings uses elsewhere throughout the code it should spelled
_sanitize_token
....
Good catches.
comment:10 by , 14 years ago
I can't see any immediate problem here, but we were discussing this over dinner (Karen, myself, Russ, ...) and the potential problem corner that occurred to me is this: if somebody manages to get a corrupted/changed cookie to the point that this sanitization removes *all* characters, will that cause some kind of unpleasant error? I suspect not, but I'm raising it for thought for a few hours (late at night here).
comment:11 by , 14 years ago
For an empty token, the result would currently be that the {% csrf_token %}
tag would output nothing, and subsequent form submission would fail.
The scenarios I can think of here, are 1) that a subdomain could have set the CSRF cookie, presumably for malicious purposes, or 2) that some one has just messed around with their own cookie. In the first case, we have a known attack that we can't prevent, but that attack might be dealt with by the owner of the subdomain being sorted out. So in either case it would be nicer if the user wasn't stuck with a dodgy cookie that made them unable to submit forms. So, I'll add a condition to cater for that.
comment:12 by , 14 years ago
Looks good to me. James is around at the sprints (in Portland) today, as well as myself and Karen, so we can roll out a fix release today (including the packaging problem fix as well).
comment:13 by , 14 years ago
Component: | Uncategorized → Forms |
---|
comment:14 by , 14 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
I don’t know if this is the correct fix, but it worked around the issue for me: