Opened 18 years ago
Closed 12 years ago
#2504 closed Bug (duplicate)
ConditionalGetMiddleware causes Content-Length to be 0 on HttpResponse with fileobject
Reported by: | Owned by: | Aymeric Augustin | |
---|---|---|---|
Component: | Core (Other) | Version: | dev |
Severity: | Normal | Keywords: | |
Cc: | hv@…, kmike84@… | Triage Stage: | Accepted |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | yes |
Easy pickings: | no | UI/UX: | no |
Description
middleware/http.py:
if not response.has_header('Content-Length'): response['Content-Length'] = str(len(response.content))
If response is build with a file object, str(len(response.content)) is 0, and the browser will ignore the page, if the handler does not set content-length.
If response.content is not a string, do not add Content-Length.
Optional we could implement:
If response.content is a file like object, and content.name is a valid filename, we could get the size with os.path.getsize and those.
Attachments (4)
Change History (26)
comment:1 by , 18 years ago
Owner: | changed from | to
---|
comment:2 by , 18 years ago
comment:3 by , 18 years ago
Triage Stage: | Unreviewed → Accepted |
---|
comment:4 by , 17 years ago
Component: | Core framework → Internationalization |
---|---|
Needs documentation: | set |
Triage Stage: | Accepted → Design decision needed |
comment:5 by , 17 years ago
Component: | Internationalization → Core framework |
---|---|
Owner: | changed from | to
Triage Stage: | Design decision needed → Accepted |
Please don't do anonymous triage, or change things without a reason.
comment:6 by , 17 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
by , 17 years ago
Attachment: | get_content_length.diff added |
---|
by , 17 years ago
comment:7 by , 17 years ago
Has patch: | set |
---|---|
Needs documentation: | unset |
The Transfer-Encoding
heading is a hop-by-hop header, which are not permitted to be set by PEP 333 (WSGI):
applications and middleware must not apply any kind of Transfer-Encoding to their output, such as chunking or gzipping; as "hop-by-hop" operations, these encodings are the province of the actual web server/gateway
There doesn't seem to be any way for a WSGI handler to require or even indicate that chunked encoding should be used: it appears that this remains the purview of the web server. I also checked and couldn't find a way in mod_python
either.
The attached patch (including docs and tests) introduces a get_content_length()
method to HttpResponse
which efficiently determines the content length, if possible.
As Malcolm pointed out to me on IRC, it would be sensible to use response.get_content_length()
instead of len(response.content
throughout. I checked and the only instance I could see of this was in GZipMiddleware
, so I've attached a patch to update that too.
comment:8 by , 17 years ago
Better patch now attached which should handle all types of iterators passed in as content, not just files, lists and tuples.
Note the change in _get_content
that stores the joined string. This was done because once an iterator has been iterated through, it is effectively spent and cannot be iterated through again. It may also be slightly optimal as it means it won't need to join as many values in subsequent calls.
comment:9 by , 17 years ago
I create tar files on the fly with os.popen2() and pass stdout, which is a fileobj, as response.content, doing a len(response.content) reads the full tar file into memory, which can be > 500MB in my case. Without checking for the content length everything works fine and I can do a "streaming" download to the browser ...
As a workaroung I have patched my django tree and added the following two lines at the top of process_response() in the ConditionalGetMiddleware:
if(response._is_string == False): return response
This may not be fully HTTP compatible, but I see no other way of sending large files without creating temp-files ...
comment:10 by , 17 years ago
as a side note and as reply @ 09/27/06 06:57:46 changed by mtredinnick:
Both, RFC 1945 HTTP/1.0 and RFC 2626 HTTP/1.1, state that an absent Content-Length Header is allowed. If the header is missing and no special Transfer-Encoding (e.g. chunked) is set, closing the connection on the server side determines the end of the response content.
In RFC 1945 this is stated in part 7.2.2 and 10.4,
In RFC 2616 this ist stated in part 4.4 method 5
comment:11 by , 17 years ago
Owner: | changed from | to
---|---|
Patch needs improvement: | set |
Status: | assigned → new |
by , 17 years ago
Attachment: | 2504_http_response_and_conditional_middleware.patch added |
---|
new patch, fixing the issue with pipes and files
comment:12 by , 17 years ago
Patch needs improvement: | unset |
---|---|
Triage Stage: | Accepted → Ready for checkin |
the new patch doesn't contain http/1.1 and chunked encoding stuff. this should be fixed in a seperate bug
comment:13 by , 17 years ago
Patch needs improvement: | set |
---|---|
Triage Stage: | Ready for checkin → Accepted |
comment:14 by , 17 years ago
Also see #7581, related to streaming and the Content-Length
header with WSGI.
comment:15 by , 14 years ago
Cc: | added |
---|
comment:16 by , 14 years ago
Type: | defect → Bug |
---|
comment:17 by , 14 years ago
Severity: | major → Normal |
---|
comment:20 by , 13 years ago
Cc: | added |
---|
comment:21 by , 12 years ago
Owner: | changed from | to
---|
comment:22 by , 12 years ago
Resolution: | → duplicate |
---|---|
Status: | new → closed |
With the current implementation of HttpResponse
, when it's instantiated with an iterator, the first access to response.content
will return the content, and the second one will return the empty string.
This is the root cause of this issue, and it's tracked in #6527.
Some notes about a solution to this problem, since I haven't started working on it and somebody else may wish to:
If we are using HTTP/1.1, we can send the
Transfer-Encoding: chunked
header and chunk the output into "reasonable size chunks", without needing to worry about a content length ahead of time.With HTTP/1.0, however, we need a content-length header. For file objects, looking at the file size is not bad. For other iterators, we need to read it all into memory and compute the length.