Code

Opened 6 years ago

Closed 5 years ago

Last modified 5 years ago

#7894 closed (duplicate)

Support handler optimisations for file downloads

Reported by: graham.carlyle@… Owned by: nobody
Component: Core (Other) Version: master
Severity: Keywords:
Cc: django-ticket-7894@…, piranha@… Triage Stage: Design decision needed
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: UI/UX:

Description

Sometimes a django app wants to return a file in a response. You can currently do this by returning the data or an iterator for the data in the content of an HttpResponse. However WSGI provides an optional mechanism for higher performance file transmission...

http://www.python.org/dev/peps/pep-0333/#optional-platform-specific-file-handling

...and I imagine mod_python is capable of a similar speed up.

In order to flag to the handler that the response is suitable for this speedup I suggest a new HttpResponseFileWrapper object that wraps a file like object. This response falls back to providing its content via an iterator that reads in block_size (a extra optional parameter) bytes at a time. The speedup mechanism can also use the block_size as a suggestion for reading data from the filelike object.

So for example...

from django.http import HttpResponseFileWrapper

def view(request):
    return HttpResponseFileWrapper(open('foo.pdf'), block_size=8192)

Attachments (3)

file_wrapper_response.diff (2.0 KB) - added by graham.carlyle@… 6 years ago.
Provides the HttpResponseFileWrapper and support for the wsgi file_wrapper speedup as well as a fix to achieve this is the simple_server
file_wrapper_response2.diff (6.0 KB) - added by graham.carlyle@… 6 years ago.
Updated patch with API updated to allows mod_python support and optional length & offset
file_wrapper_response3.diff (11.4 KB) - added by graham.carlyle@… 6 years ago.
Fixed incorrect content length when only supplying an offset, added tests for wsgi handler, patched serve generic function to use

Download all attachments as: .zip

Change History (14)

Changed 6 years ago by graham.carlyle@…

Provides the HttpResponseFileWrapper and support for the wsgi file_wrapper speedup as well as a fix to achieve this is the simple_server

comment:1 Changed 6 years ago by grahamd

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset

The mod_python package cannot do the same thing for an open file like object. The closest thing in mod_python is req.sendfile() which takes a file name.

In Trac they abstract the interface up to the file name level. Thus for mod_python it then just calls req.sendfile(). For WSGI, if wsgi.file_wrapper is available Trac opens file in raw mode 'rb' (you didn't use raw mode and so would not behave right on Windows), and then creates returnable object using hook provided by WSGI adapter. For where wsgi.file_wrapper not available, a FileWrapper class similar to that specified in WSGI spec is used instead. Do note though that Trac was actually forgetting to do the latter for generic WSGI adapter and fix has only just been committed to their subversion repository.

Note that mod_python doesn't set any headers, for content length, so application has to do that as well as set content type headers etc. For WSGI, better off ensuring content length also set in headers, although for mod_wsgi it will set content length header for file like object corresponding to real actual file if it can.

If you want to support range headers, mod_python does allow offset and length arguments. For WSGI, where wsgi.file_wrapper is not available, the FileWrapper example could be modified to support a length, with offset achieved by doing a file seek if actual file before supplying it to FileWrapper. Where wsgi.file_wrapper does exist it gets a bit more complicated. This is because mod_wsgi is only WSGI adapter that will honour Content-Length in response headers for wsgi.file_wrapper and only send that amount of data from file. Other implementations that support wsgi.file_wrapper just send remainder of file thus meaning more data could be returned than should and client would have to be relied upon to ignore any extra.

Question looks to have arisen out of this ticket, but see:

http://groups.google.com/group/modwsgi/browse_frm/thread/4d73a08485cc4030?hl=en

Changed 6 years ago by graham.carlyle@…

Updated patch with API updated to allows mod_python support and optional length & offset

comment:2 Changed 6 years ago by graham.carlyle@…

The response now takes a file path as a required argument, and optional offset and length parameters. Note that some WSGI adaptors can't deal with the length and the file_wrapper speedup and so the wsgi handler defaults to response's iterator if it doesn't know whether the adaptor supports this (currently only mod_wsgi does).

from django.http import HttpResponseFileWrapper

def view(request):
    return HttpResponseFileWrapper('foo.bin', block_size=8192, offset=500, length=120000)

The patch also added modpython support but I haven't had a chance to check this :)

Also the logic in the wsgi handler is getting hairy so tests are needed but didn't see any obvious way of introducing them yet.

comment:3 Changed 6 years ago by grahamd

Haven't looked complete code properly yet, but are you sure:

 	450	class HttpResponseFileWrapper(HttpResponse): 
 	451	    def __init__(self, file_path, block_size=8192, length=None, offset=0, **kwargs): 
 	452	        content_length = length or os.path.getsize(file_path) 
 	453	        HttpResponse.__init__(self, content=FileWrapper(file_path, block_size, content_length, offset), **kwargs) 
 	454	        self.file_path = file_path 
 	455	        self.block_size = block_size 
 	456	        self.length = length 
 	457	        self.offset = offset 
 	458	        self['Content-Length'] = content_length 

is correct.

Thing I am worried about is setting of Content-Length response header. You set it to actual size of file and don't take into consideration the offset or length from offset that user code may have specified.

comment:4 Changed 6 years ago by grahamd

BTW, I don't know what is reality as don't actually use Django, but occasionally see comments to effect that 'django.views.static.serve' only works for development server. Eg, in thread:

http://groups.google.com/group/django-users/browse_frm/thread/4d615cab984be5c4/c5402097a629e09b?hl=en&lnk=st&q=mod_python#c5402097a629e09b

Thus, one is pushed to serve static files with Apache when using mod_python or mod_wsgi.

Could 'django.views.static.serve' be made to use the extension being added per this ticket so that it works with mod_python and mod_wsgi, with where possible the speed ups offered by mod_python and mod_wsgi being used.

Changed 6 years ago by graham.carlyle@…

Fixed incorrect content length when only supplying an offset, added tests for wsgi handler, patched serve generic function to use

comment:5 Changed 6 years ago by graham.carlyle@…

  • Has patch set

The file_wrapper_response3.diff patch fixes the content length header if only an offset has been provided (as pointed out by grahamd above in comment 3).

This patch also changes the 'django.views.static.serve' generic function (as suggested by grahamd in comment 4) to use the HttpResponseFileWrapper, however this is just a speedup in case the development server is using a wsgi adaptor provides the wsgi.file_wrapper. The HttpResponseFileWrapper is only meant to be used in cases where some view logic is needed in combination with serving a file, serving static files is obviously best down directly by the webserver.

comment:6 Changed 6 years ago by ericholscher

  • milestone set to post-1.0
  • Triage Stage changed from Unreviewed to Design decision needed

comment:7 Changed 6 years ago by Miz

  • Cc django-ticket-7894@… added

comment:8 Changed 6 years ago by piranha

  • Cc piranha@… added

comment:9 Changed 5 years ago by guettli

Looks like a duplicate: #2131.

comment:10 Changed 5 years ago by jacob

  • Resolution set to duplicate
  • Status changed from new to closed

Yeah, the HttpResponseSendFile option from #2131 is a better approach.

comment:11 Changed 5 years ago by anonymous

  • milestone post-1.0 deleted

Milestone post-1.0 deleted

Add Comment

Modify Ticket

Change Properties
<Author field>
Action
as closed
as The resolution will be set. Next status will be 'closed'
The resolution will be deleted. Next status will be 'new'
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.