#24240 closed Bug (fixed)
Setting Content-Encoding on a non-ASCII StreamingHttpResponse raises UnicodeEncodeError
Reported by: | Matthew Somerville | Owned by: | nobody |
---|---|---|---|
Component: | HTTP handling | Version: | 1.7 |
Severity: | Normal | Keywords: | |
Cc: | Triage Stage: | Ready for checkin | |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
HttpResponseBase's make_bytes() assumes that if the Content-Encoding header is set, then everything has already been dealt with bytes-wise, but in a streaming situation this is not necessarily the case as the header has been set before make_bytes() will be called on the original content.
For example, if you provide an iterable including a non-ASCII Unicode string to a StreamingHttpResponse and are using the GZip middleware, the response will die with a UnicodeEncodeError when it arrives at the first non-ASCII Unicode chunk.
I have added two tests at https://github.com/dracos/django/compare/non-ascii-compressed-streaming (let me know if you'd prefer a PR), firstly a working one to show that supplying non-ASCII Unicode strings to a StreamingHttpResponse works fine normally; then a failing one to show that passing such a response through the GZip middleware dies with a UnicodeEncodeError.
This appears to be happening because as we're working with an iterable, make_bytes() is only called when necessary, but by the point that happens and the iterable is consumed, the middleware has added the Content-Encoding header and thus make_bytes() tries to call bytes(value) (and dies). If it had been a normal HttpResponse, make_bytes() would have been called when the content was set, well before the middleware set the Content-Encoding header.
I'm not aware of other intricacies of HttpResponse to suggest an obvious fix, sorry. The start of this short circuit was back in #4969 but lots has changed since then; I don't know if this special case when Content-Encoding is set is still necessary in the same way.
Change History (5)
comment:1 by , 10 years ago
comment:2 by , 10 years ago
Has patch: | set |
---|
Sure, created at https://github.com/django/django/pull/4011 :)
I did remove one test that would clearly fail, because this change will allow you to supply Unicode strings to a Response with a Content-Encoding and it will be encoded as normal. I thought that was acceptable given it fixes the issue with any StreamingHttpResponse. If it isn't, then the change in response.py could check if it was a StreamingHttpResponse, but that would then be different behaviour between HttpResponse and StreamingHttpResponse, in that Streaming would be able to set Content-Encoding and put Unicode strings in the iterator.
comment:3 by , 10 years ago
Triage Stage: | Unreviewed → Ready for checkin |
---|
comment:4 by , 10 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
Please create a PR including your tests and the fix in
make_bytes
, and we'll see what the test suite think about that!