Django

Code

Ticket #2131 (new)

Opened 4 years ago

Last modified 5 days ago

HttpResponseSendFile for serving static files handler-specific sendfile mechanism

Reported by: ymasuda[at]ethercube.com Assigned to: ccahoon
Milestone: 1.2 Component: Core framework
Version: SVN Keywords:
Cc: Graham.Dumpleton@gmail.com, jdunck@gmail.com, django-ticket-2131@codef0x.org, andreas@pelme.se, andrehcampos@gmail.com, gregoire@audacy.fr, gabor@nekomancer.net, ville@unessa.net, hackeron@gmail.com, chris.cahoon@gmail.com, exelib@gmail.com, redalastor@gmail.com, dane.springmeyer@gmail.com, jezdez, dan@danros.org.uk, danny.adair@unfold.co.nz, apollo13 Triage Stage: Fixed on a branch
Has patch: 1 Needs documentation: 0
Needs tests: 0 Patch needs improvement: 0

Description

This patch modifies modpython handler and WSGI (basehttp) handler to add HttpResponseSendFile?. The HttpResponseSendFile? allows static files to be served using handler-specific sendfile mechanism. It uses response.sendfile() on mod_python, FileWrapper? on wsgi.

Attachments

http-response-send-file.diff (3.0 kB) - added by ymasuda[at]ethercube.com on 06/11/06 08:47:04.
diff implements HttpResponseSendFile?
http-response-send-file.2.diff (2.9 kB) - added by ymasuda[at]ethercube.com on 06/11/06 20:58:48.
fixed typos above, sorry
django-ticket-2131-http-response-send-file.2-django1.0.2final.diff (3.4 kB) - added by mizatservercave on 02/10/09 21:12:22.
updated for Django 1.0.2
django-ticket-2131-http-response-send-file.2-django1.0.2final-v1.1.diff (3.4 kB) - added by mizatservercave on 02/18/09 12:07:13.
Supersedes previous patches for Django 1.0.2 final.
django-ticket-2131-http-response-send-file.2-django1.0.2final-v1.2_test.diff (1.9 kB) - added by mizatservercave on 02/26/09 09:51:44.
PARTIAL patch (FOR TESTING ONLY) to verify HttpResponseSendFile? works without specific handler support
django-ticket-2131-http-response-send-file.2-django1.0.2final-v1.2.diff (3.6 kB) - added by mizatservercave on 02/26/09 09:53:16.
Updated patch. Adds support for 'response.content'
httpresponsesendfile-no-default-content.diff (3.7 kB) - added by mrts on 03/14/09 09:47:52.
Disallow default HttpResponse behaviour (content etc)
httpresponsesendfile-no-default-content_bypass-middleware.diff (7.0 kB) - added by mrts on 03/14/09 16:54:11.
Bypass middleware
httpresponsesendfile-no-default-content_bypass-middleware.1.diff (7.0 kB) - added by mrts on 03/15/09 09:04:32.
Fix method naming typo
httpresponsesendfile-no-default-content_bypass-middleware.2.diff (7.0 kB) - added by mrts on 03/15/09 09:07:35.
Doh, previous diff didn't contain the fix, second try.
httpresponsesendfile-no-default-content_bypass-middleware.3.diff (7.0 kB) - added by mrts on 03/16/09 05:44:04.
Fix yet another copy-paste omission.
httpresponsesendfile-no-default-content_bypass-middleware_with-header_with-docs-and-tests.diff (12.3 kB) - added by mrts on 03/21/09 17:05:58.
Added the requested header, as well as docs and tests.

Change History

06/11/06 08:47:04 changed by ymasuda[at]ethercube.com

  • attachment http-response-send-file.diff added.

diff implements HttpResponseSendFile?

06/11/06 08:53:57 changed by anonymous

  • summary changed from HttpResponseSendFile for serving static files handler-specific sendfile mechanism to [patch] HttpResponseSendFile for serving static files handler-specific sendfile mechanism.

06/11/06 20:58:48 changed by ymasuda[at]ethercube.com

  • attachment http-response-send-file.2.diff added.

fixed typos above, sorry

(follow-up: ↓ 5 ) 06/14/06 10:10:16 changed by adrian

  • status changed from new to closed.
  • resolution set to wontfix.

Django isn't meant to serve static files, so I'm marking this as a wontfix.

04/05/07 11:12:58 changed by alex@halogen-dg.com

This patch could be very useful in a situations with a password protected files to a specific users. Typically I am talking about e-commerce site which sells PDF books. I am currently returning data via standard HttpResponse. It works well for small files, but it would probably be a problem for bigger files.

07/19/07 10:20:37 changed by Stefan (stelund on gmail)

I've implemented a similar workaround to restrict photo images accesses. I think its ridiculous not to implement this feature to aid performance for views.static.serve() as well. HttpResponse is implemented as an iterator so it fits perfectly to read the file chunk by chunk. The performance increase i've seen is great and i'm happy not to need another tcp-server running on my box.

(in reply to: ↑ 2 ; follow-up: ↓ 6 ) 08/12/07 11:59:39 changed by johann@browsershots.org

Replying to adrian:

Django isn't meant to serve static files, so I'm marking this as a wontfix.

But this feature would be useful to reduce memory usage with large dynamically generated content.

In my application, I generate large (30MB) ZIP files in Django and send them through response.write(). The ZIP files are just collections of PNG files, so it doesn't make sense to store them on disk permanently.

Unfortunately, using response.write() means that the whole ZIP file must be loaded into memory. It would be so cool if I could create the file on disk and then use sendfile. The same use-case applies to generated PDF files.

As a workaround, I could write the file to disk and then redirect the browser, but that's tedious, requires another TCP round-trip, and makes automated downloads more difficult.

(in reply to: ↑ 5 ) 08/12/07 18:28:39 changed by johann@browsershots.org

Here's a solution that doesn't require patching Django. It's not the real sendfile(), but it loads and sends the file in 8KB chunks, rather than loading the whole file into memory.

http://www.djangosnippets.org/snippets/365/

It uses the FileWrapper object from source:django/trunk/django/core/servers/basehttp.py.

08/13/08 17:21:46 changed by TP

Even using a file-wrapper iterator does not speed up things as much as directly using sendfile. Using mod_wsgi's exposed sendfile uses 30x less CPU vs Django-with-iterator on my linux box testing downloading hundreds of images concurrently.

08/13/08 18:40:05 changed by grahamd

Trac also isn't meant to serve static media files either and preferred setup is to use Apache to do it directly. In Trac however you have the concept of attachments associated with tickets. For such a thing as attachments, where it is still stored in file system, but where Apache cant get to it, you have no choice but to serve it through the application. Any similar sort of scenario in Django is currently not very efficient in terms of performance, or in terms of memory usage if people don't use techniques to limit memory impact. Thus, Django should really provide a default way of handling this efficiently and saying 'wontfix' means that you haven't looked very widely at how Django can be used. :-)

08/13/08 19:07:00 changed by jdunck

  • cc set to jdunck@gmail.com.

08/13/08 19:21:16 changed by mtredinnick

  • status changed from closed to reopened.
  • stage changed from Unreviewed to Design decision needed.
  • resolution deleted.
  • milestone set to post-1.0.

Claiming we "haven't looked very widely" is ridiculous, given what Django can do. More possibly "some cases may have been overlooked" (there are alternatives to the Trac setup). So let's try to avoid hyperbole. :-)

Reopening for now and we can look at it after 1.0.

08/13/08 23:57:06 changed by mrts

+1 on including this. My use case: instrument Java classfiles, generate a JAR and return it to the user on the fly.

08/14/08 02:57:27 changed by graham.carlyle@maplecroft.com

Another ticket that covers this issue is #7984, which probably means that its a duplicate. Sorry I didn't spot this ticket when I filed it. The patch on #7984 is very similar to this one :) but has some extra logic related to content-length for mod_wsgi because of some amibguities in the WSGI spec. (see discussion on mod_wsgi group)

08/14/08 02:58:38 changed by graham.carlyle@maplecroft.com

gah i'm an idiot I meant #7894

08/23/08 11:01:29 changed by Miz

  • cc changed from jdunck@gmail.com to jdunck@gmail.com, django-ticket-2131@codef0x.org.

09/23/08 00:01:09 changed by anonymous

  • cc changed from jdunck@gmail.com, django-ticket-2131@codef0x.org to jdunck@gmail.com, django-ticket-2131@codef0x.org, andreas@pelme.se.

10/12/08 18:56:54 changed by alsleme

  • owner changed from adrian to alsleme.
  • status changed from reopened to new.

11/03/08 03:03:32 changed by ahlongxp

  • version set to SVN.

I'd really like to see this merged in django.

01/07/09 05:27:12 changed by anonymous

  • cc changed from jdunck@gmail.com, django-ticket-2131@codef0x.org, andreas@pelme.se to jdunck@gmail.com, django-ticket-2131@codef0x.org, andreas@pelme.se, andrehcampos@gmail.com.

02/10/09 21:12:22 changed by mizatservercave

  • attachment django-ticket-2131-http-response-send-file.2-django1.0.2final.diff added.

updated for Django 1.0.2

02/10/09 21:21:22 changed by mizatservercave

  • needs_tests set to 1.

I have added an updated patch that SEEMS to work on Django 1.0.2. I had an earlier revision working on Django 1.0.0, but have only updated my local copy to 1.0.2 tonight.

Extensive testing should be done. Right now I have only tested WSGI (via Django's manage.py runserver).

The exclusion/removal of the 'Content-Length' header has been added to force WSGI to recalculate the content length from the file length, where prior to that point in the code something previously tends to add 'Content-Length' = 0, which is probably not wise to mess with "at the source".

The additional content_type == None check was added for Windows compatibility -- Google Chrome would not download a file that ultimately resulted in "Content-Type: None", so I chose a generic 'reasonable' default. See also http://bugs.python.org/issue4969 -- but Python 2.5 without that fix is too commonly deployed, at least in my environment. It's easier to do a 2-line fix where it's needed, as-needed.

Other than that, the patch is essentially the same as the original poster's.

02/18/09 12:07:13 changed by mizatservercave

  • attachment django-ticket-2131-http-response-send-file.2-django1.0.2final-v1.1.diff added.

Supersedes previous patches for Django 1.0.2 final.

02/18/09 12:13:54 changed by mizatservercave

New patch uploaded. Fixes mod_python support.

This patch is now used on a production website and shows phenomenal performance.

The production use-case for this patch is a "members-only" upload/download area which requires extensive permissions checking and custom auditing. This patch makes it possible to do all the logic needed while remaining completely within the Django framework, and without sacrificing performance while either uploading or downloading.

02/25/09 13:23:59 changed by jacob

This looks like a nice patch to me, and hearing that it works in production is good to know. Still needs a few things, though:

  • The current setup requires cooperation between the HttpResponseSendFile object and the handler. This means it'll break if HttpResponseSendFile is used with a custom handler that doesn't know about it; that's a potential backwards-incompatibility, and we need to avoid it. If possible, HttpResponseSendFile needs to fake being a regular HttpResponse, probably by providing a content property which just reads the file.
  • Documentation.
  • Tests.

02/25/09 13:51:44 changed by

  • milestone deleted.

Milestone post-1.0 deleted

02/26/09 03:41:24 changed by graham.carlyle@maplecroft.com

Looked in to this a while back (I created the duplicate ticket #7894 with similar patch), but found that in production using the WSGI file_wrapper approach can be problematic. I found that serving PDFs to an adobe reader client failed for documents above around 25MB in size because reader required the web server to satisfy byte range requests which didn't work with WSGI. I ended up using the X-SendFile? header and mod_xsendfile (on apache) which didn't have these problems. I'm not certain whether serving bytes range is inherently impossible with WSGI but requests to apache with mod_wsgi didn't satisfy them for the WSGI content.

Also you might like to make the same change that the patch in #7894 made to the 'django.views.static.serve' function so the django development server can use this speedup.

02/26/09 09:50:40 changed by mizatservercave

Implementing additional HTTP features (HTTP 206 PARTIAL CONTENT) directly in the core is something I suspect the core maintainers want to avoid, on grounds that Django is an application framework and *not* a webserver.

I think you bring up a good point, though. This patch serves a niche interest instead of a general purpose one. That is, instead of this patch being able to fully support all HTTP access methods (such as 206 PARTIAL CONTENT), it is useful in only very certain circumstances:

1. When using a handler with an 'efficient mechanism' for 'sending files' AND which knows about HttpResponseSendFile? as implemented by this patch, AND

2. HttpResponseSendFile?, as-implemented, is only suitable for sending whole files.

All the fallback methods revert to 'pure python iterators' in some manner or form. I spent some time bouncing between mod_python, the WSGI spec, and mod_python, trying to decide whether or not trying to wedge a generator into the works (for efficiency) was worthwhile overall. I don't think it is.

It appears that HttpResponseSendFile? is only efficient when both the above preconditions are met. Anything which would cause any fallback method (iterators) to be used will DECREASE application performance, but probably not more than doing a raw "HttpResponse(file.read())". Any more 'advanced functionality' or efficiency requirements and you would be well advised to defer the functionality to the webserver, such as those servers which support X-Sendfile natively or by extension.

The goal of promoting this patch is to provide basic functionality just beyond the realm of python iterators for file-reading/file-sending (when preconditions for efficiency are met), while remaining entirely contained within Django -- without *additional* external dependencies, beyond normal deployment against mod_wsgi or mod_python. I do think the implementation useful, however. But it is certainly not "feature rich".

In regards to WSGI's support of all this, it appears that WSGI defers partial content delivery responsibility to the WSGI application [eg, Django or applications written with Django]. In other words, the WSGI spec does not appear support partial content directly. Reading a file and/or chopping it up to feed to the WSGI interface would require pure-python with (presumably) iterators/generators (or even tempfiles with the 'cut-out' content), which would entirely defeat the purpose of this patch.

Mod_python appears at first glance to expose sendfile 'directly' which would seem to support partial content easily, but there seems to be no straightforward way to be efficient across anything BUT Mod_python in this case, and is probably the subject of an additional patch to implement, say, HttpResponseResumableFile?().

In short: This patch (unfortunately) serves only a small audience. The larger audience can probably be served with X-Sendfile.

[Kudo's for the reference. I didn't know X-Sendfile even existed 'til you posted, Graham. :)]

Advantages:
* Provides amazingly good results, when used to send whole files across a mutually-supported interface

Disadvantages:
* Introduces response-specific code into the handlers, which would ultimately require maintenance
* All fallback methods are "not efficient" simply by virtue that "everything is done in pure python without a speedy extension, like mod_python or mod_wsgi provides". Large files consume a lot of time and a lot of RAM -- although attempting to serve those types of files through Django before this patch suffers from the same issues of slowness and lots of RAM consumed.
* response.content is one of these inefficient methods

Additional Considerations:
* As this is a 'send whole file' mechanism, should a 'Content-Disposition: attachment; filename=' header be added? Should the filename come from the HttpResponseSendFile? constructor so the application can name it what it likes? [Personally, I think so, on both counts.]

02/26/09 09:51:44 changed by mizatservercave

  • attachment django-ticket-2131-http-response-send-file.2-django1.0.2final-v1.2_test.diff added.

PARTIAL patch (FOR TESTING ONLY) to verify HttpResponseSendFile? works without specific handler support

02/26/09 09:53:16 changed by mizatservercave

  • attachment django-ticket-2131-http-response-send-file.2-django1.0.2final-v1.2.diff added.

Updated patch. Adds support for 'response.content'

(follow-up: ↓ 34 ) 02/26/09 09:59:56 changed by mizatservercave

Regarding the 'v1.2' version of the patch I just uploaded, note that for large files when accessed through response.content loads the whole thing up into the webserver's RAM before serving. A good test example is OpenOffice?.org 3's 143MB installer. Bad for small (eg, 256MB RAM) webservers.

RAM also seems to 'hang around' after each request, so I suspect that .close() isn't being called properly on self._container, or _get_content needs to add a signal-hook to signals.request_finished to expire the content from RAM by deleting the variable or setting it to or somesuch.

That being said, accessing request.content is a compatibility property, and so shouldn't be used if at all possible.

The below test-case works for a raw, manual test, however, in addition to working with the "_test" patch above:

from django.http import HttpResponseSendFile? r = HttpResponseSendFile?('./bigfile') t = open('/tmp/test', 'wb') t.write(r.content) t.close()

# md5 and filesize of ./bigfile and /tmp/test match.

02/26/09 10:01:35 changed by mizatservercave

Reformatting the example since formatting got munged in the previous update.

from django.http import HttpResponseSendFile
r = HttpResponseSendFile('./bigfile')
t = open('/tmp/test', 'wb')
t.write(r.content)
t.close()

# md5 and filesize of ./bigfile and /tmp/test match.

02/26/09 10:18:46 changed by jacob

Looks good. I'd like to see support for X-SendFile baked in. This should be too hard to do: instead of storing the file name in self.sendfile_filename, store it in an X-SendFile header. Then we just need a small piece of middleware -- or, perhaps, something added to the common middleware? -- that responds to a setting (SERVER_SUPPORTS_X_SENDFILE or somesuch) and returns the response with the header and an empty body.

02/28/09 13:46:25 changed by jacob

  • stage changed from Design decision needed to Accepted.
  • milestone set to 1.1 beta.

03/07/09 03:41:46 changed by grahamd

X-Sendfile, although understood by some servers/proxies isn't the only header used for this purpose. The perbal package has something called X-Reproxy-File which I believe acts the same way. The perbal package also has X-Reproxy-URL, which instead of serving up a file, reinjects the URL as a request into the request handling system. This means the request gets pushed back to a backend system to serve. Thus it might get bumped off to a static media server which then serves up the matching file. The request may also be handled by a dynamic web application. In nginx, there is a similar thing to X-Reproxy-URL called X-Accel-Redirect. Finally both these are in some respects similar to the CGI internal redirection using Location header and 200 response which Apache CGI supports and which other servers provding CGI should also support. This CGI mechanism of using Location header will also supported for mod_wsgi daemon mode in mod_wsgi 3.0.

Have been looking at X-Sendfile support for mod_wsgi daemon mode in mod_wsgi 3.0, but the problem with it is that it side steps standard Apache permissions checks on files. Yes, an application could just read the file itself and return it anyway, but the difference is that in daemon mode the user can be different to that Apache runs as, thus X-Sendfile would be handled as the Apache user and not the daemon process user. Thus access may not be available to the file, or access may be available to files that the daemon process shouldn't otherwise have access to. This is why mod_wsgi 3.0 at this point will only support internal URL redirection via Location header as per CGI specification. Namely, the internal redirection has all the Apache permission checks applied.

The only issue with things like X-Accel-Redirect and CGI Location redirection is that the URL has to be mapped to a file and the presence of a mapping would normally mean an outside party could access the resource directly. In nginx they get around this through flagging the URL as private and so only a candidate for use by internal redirects. In the case of Apache, you would need to use mod_rewrite to enforce a condition that URL is forbidden if not a sub request made within the server itself.

Anyway, it may be felt that this is all irrelevant, but want to at least point out that X-Sendfile isn't the only header name used and so you may want to make it configurable. The URL redirection mechanisms available might still also be employed if header can be overridden, but may be a bit tricky here if code can force the file to be loaded inside Django application before response gets sent, with it turning into a byte stream instead. The problem here is that for X-Reprox-URL, it may actually refer to something on a different host if they allow a 'http://host' component in the URL, thus can't sensibly open it as a file. For CGI Location header, no 'http://host', but it is still a logical URL and can't be opened as a file. But then, if X-Sendfile or X-Reproxy-File are used and the front end is on a different system, the Django application may not even be able to open the path anyway as file system view could be different.

So, in respect of the latter, with the X-Sendfile approach, is it always the case that it always makes it out as a X-Sendfile response, or can the Django application, because of some sort of output filter or similar, force the path to be resolved in the application with data read at that point.

Oh and finally, for X-Sendfile, does the response sent always have a 200 HTTP status code? I know that is what CGI Location wants, but not sure about all the others. If the response codes aren't consistent, then even if can override header name, may not work for all.

03/07/09 03:52:34 changed by grahamd

BTW, mod_wsgi 2.0 through 2.3 has an issue that can occur under certain conditions, with a file returned using wsgi.file_wrapper being truncated when size is > 255 and < ~8000 bytes. This is due to an obscure Apache optimisation. The issue will be fixed in mod_wsgi 2.4. Details, with a description of fix, can be found at:

http://code.google.com/p/modwsgi/issues/detail?id=132

One also should not use mod_wsgi 2.0 through 2.1, which can also truncate a file returned using wsgi.file_wrapper if it contains an embedded null character.

03/09/09 06:50:12 changed by gregoire

  • cc changed from jdunck@gmail.com, django-ticket-2131@codef0x.org, andreas@pelme.se, andrehcampos@gmail.com to jdunck@gmail.com, django-ticket-2131@codef0x.org, andreas@pelme.se, andrehcampos@gmail.com, gregoire@audacy.fr.

The patch does not work with GZIP middleware.

Using mod_wsgi, I get this error: "IOError: client connection closed"

03/14/09 05:49:47 changed by mrts

  • needs_better_patch set to 1.
  • needs_docs set to 1.

Questions:

  1. Why to you first do an expensive call to getsize in self['Content-Length'] = os.path.getsize(path_to_file) and later discard it in if hasattr(response, 'sendfile_filename') and response.has_header('Content-Length'): del response['Content-Length'] in WSGI handler? I didn't find anything in PEP 333 that would suggest this is needed. Please correct me or remove that bit.
  2. Although this is mostly a conceptual/taste matter, wouldn't it be more clear to use isinstance(response, HttpResponseSendFile) instead of hasattr(response, 'sendfile_filename') to discern HttpResponseSendFile objects?

Nitpicking HttpResponseSendFile.__init__:

  1. Comparisons to singletons like None should always be done with 'is' or 'is not', never the equality operators. -- PEP 8, thus content_type == None should be content_type is None instead.
  2. Opening the file needs to be consistent. You open it in the constructor and later reopen again in handlers, so the file is opened twice.
  3. The logic of parent constructor should be reused, I suggest the following instead:
    class HttpResponseSendFile(HttpResponse):
        def __init__(self, path_to_file, content_type=None, blocksize=8192):
            if not content_type: # or even if content_type is None
                from mimetypes import guess_type
                content_type = guess_type(path_to_file)[0]
                if content_type is None: # probably windows w/o proper mimetype lookup
                    content_type = "application/octet-stream"
            container = open(path_to_file) # be consistent, either open() here and don't reopen in handlers or don't open() here
            super(HttpResponseSendFile, self).__init__(content=container, content_type=content_type)
            self.sendfile_filename = path_to_file
            self.block_size = blocksize
            # compatibility; fake being a regular HttpResponse if needed
            self._headers['content-length'] = ('Content-Length', os.path.getsize(path_to_file))
    

Looks like more thought has to be given to the general context (basehttp.py etc). Docs and tests are missing as well.

Otherwise, nice work!

03/14/09 09:47:21 changed by mrts

As far as I can see, HttpResponseSendFile should not mimic HttpResponse in the sense that it should always be specially handled. Ordinary HttpResponse can take a file object in the content parameter already and serve the file from within Python, so that behaviour should be explicitly disallowed in HttpResponseSendFile.

Attaching a patch that does exactly this, i.e. doesn't open() the file and passes None as content to parent constructor to make it explicit that the content is neither a string nor can be handled with the default behaviour. I'm not sure about the Content-Length header though, I've currently added it to be compatible with ServerHandler.cleanup_headers(), but it may need more thought.

Btw, how mod_python handles req.sendfile can be seen here: https://svn.apache.org/repos/asf/quetzalcoatl/mod_python/trunk/src/requestobject.c .

03/14/09 09:47:52 changed by mrts

  • attachment httpresponsesendfile-no-default-content.diff added.

Disallow default HttpResponse behaviour (content etc)

(in reply to: ↑ 25 ) 03/14/09 09:52:13 changed by mrts

RAM also seems to 'hang around' after each request, so I suspect that .close() isn't being called properly on self._container, or _get_content needs to add a signal-hook to signals.request_finished to expire the content from RAM by deleting the variable or setting it to or somesuch.

There's a slight possibility that the cause may be return iter(lambda: filelike.read(block_size), '') that was used in the previous patch if. There's no close() in the resulting iterator. I used the FileWrapper from basehttp.py instead, please try if this changes things.

03/14/09 10:02:14 changed by mrts

As for middleware, they can explicitly bypass HttpResponseSendFile responses as follows:

    def process_response(self, request, response):
        if isinstance(response, HttpResponseSendFile):
            return response

or the handlers can bypass HttpResponseSendFile implicitly in __call__:

            if not isinstance(response, HttpResponseSendFile):
                for middleware_method in self._response_middleware:
                    response = middleware_method(request, response)

03/14/09 10:14:35 changed by mrts

03/14/09 16:54:11 changed by mrts

  • attachment httpresponsesendfile-no-default-content_bypass-middleware.diff added.

Bypass middleware

03/15/09 09:04:32 changed by mrts

  • attachment httpresponsesendfile-no-default-content_bypass-middleware.1.diff added.

Fix method naming typo

03/15/09 09:07:35 changed by mrts

  • attachment httpresponsesendfile-no-default-content_bypass-middleware.2.diff added.

Doh, previous diff didn't contain the fix, second try.

03/16/09 05:44:04 changed by mrts

  • attachment httpresponsesendfile-no-default-content_bypass-middleware.3.diff added.

Fix yet another copy-paste omission.

03/17/09 23:28:28 changed by grahamd

  • cc changed from jdunck@gmail.com, django-ticket-2131@codef0x.org, andreas@pelme.se, andrehcampos@gmail.com, gregoire@audacy.fr to Graham.Dumpleton@gmail.com, jdunck@gmail.com, django-ticket-2131@codef0x.org, andreas@pelme.se, andrehcampos@gmail.com, gregoire@audacy.fr.

Can someone indicate whether the intention is, or have these changes supported X-Sendfile as mentioned? I looked through some of the multitude of diffs, but couldn't see it mentioned.

(follow-up: ↓ 41 ) 03/17/09 23:42:33 changed by jacob

Graham: Yes, I'd expect to see x-sendfile support before I check this in.

mrts: I explicitly asked asked that HttpResponseSendFile mimic standard HttpResponse semantics. This is a backwards-compatibility issue: writing custom handlers is an advanced but support feature.

03/17/09 23:57:05 changed by grahamd

Jacob: Okay. Only asked as wanted to make sure that the name of header used was going to be somehow configurable to allow for those header names used by other things like Apache/nginx and perlbal and not be X-Sendfile specific. Thanks.

03/17/09 23:57:53 changed by jacob

That should be pretty easy to pull off, I think.

(in reply to: ↑ 38 ) 03/18/09 04:38:20 changed by anonymous

Replying to jacob:

mrts: I explicitly asked asked that HttpResponseSendFile mimic standard HttpResponse semantics. This is a backwards-compatibility issue: writing custom handlers is an advanced but support feature.

OK, I'll try to look into it in the evening and provide something for X-Sendfile as well.

03/18/09 04:55:44 changed by mrts

(It was me posting the last comment)

Note that I don't like the HttpResponse compatible solution.

Concerns:

  • open() in __init__() will remain messy, if only because of mod_python that doesn't accept a file object in sendfile(); bluntly consuming two file descriptors for one response even though one isn't needed does not look sane to me,
  • we can open() lazily, but then we have to introduce state tracking (file_is_open), resulting in more complex and less robust code.

Is it wise to go to such lengths to cater for minor backwards-compatibility concerns? Jacob, which approach do you prefer?

(follow-up: ↓ 44 ) 03/18/09 08:28:47 changed by jacob

Neither of those is necessary. You just need to make content into a property which lazily opens and reads the file.

(in reply to: ↑ 43 ) 03/18/09 11:37:35 changed by mrts

Replying to jacob:

Neither of those is necessary. You just need to make content into a property which lazily opens and reads the file.

That's what I meant by "lazily". However, state tracking is still required as the file has to be explicitly closed in HttpResponseSendFile?.close(). This can not be done in content due to __iter__. And the file may not be open() at all if content is never touched. Thus, state tracking seems unavoidable.

03/20/09 10:00:56 changed by gabor

  • cc changed from Graham.Dumpleton@gmail.com, jdunck@gmail.com, django-ticket-2131@codef0x.org, andreas@pelme.se, andrehcampos@gmail.com, gregoire@audacy.fr to Graham.Dumpleton@gmail.com, jdunck@gmail.com, django-ticket-2131@codef0x.org, andreas@pelme.se, andrehcampos@gmail.com, gregoire@audacy.fr, gabor@nekomancer.net.

03/21/09 17:05:58 changed by mrts

  • attachment httpresponsesendfile-no-default-content_bypass-middleware_with-header_with-docs-and-tests.diff added.

Added the requested header, as well as docs and tests.

03/21/09 17:22:15 changed by mrts

Added the configurable X-Sendfile header and preliminary docs and tests -- mainly to ease work for others who like to drive this further.

However, as of now the patch is still incomplete. I personally don't like providing any content for the reasons elaborated in http://groups.google.com/group/django-developers/browse_thread/thread/82367cf18eedfe89. I'm not the only one, see e.g. http://john.guen.in/past/2007/4/17/send_files_faster_with_xsendfile/.

Sadly, it looks like this will probably not make it into 1.1 as I likely will not have time for this in the weekend.

03/21/09 17:41:18 changed by mrts

Another issue: the disposition header should perhaps be configurable with a constructor argument (inline is needed to display files directly in browser).

(follow-up: ↓ 49 ) 03/22/09 21:39:03 changed by mtredinnick

  • milestone changed from 1.1 beta to 1.2.

Still too many edge-cases not being accounted for here (basically handling all the stuff Graham mentioned for a start). It's not ready for 1.1.

mrts: The final solution will have to fall-back gracefully if the user's webserver doesn't have any direct sendfile support, since otherwise you can't use this feature in a redistributable application (the application writer doesn't know how it will be installed). The client browser shouldn't ever receive an empty file. That's what Jacob meant by backwards-compatible, he just used an ambiguous phrase.

(in reply to: ↑ 48 ) 03/23/09 14:43:40 changed by mrts

Replying to mtredinnick:

mrts: The final solution will have to fall-back gracefully if the user's webserver doesn't have any direct sendfile support, since otherwise you can't use this feature in a redistributable application (the application writer doesn't know how it will be installed). The client browser shouldn't ever receive an empty file. That's what Jacob meant by backwards-compatible, he just used an ambiguous phrase.

The point is valid. However, looking at it from deployer's viewpoint I would always want to know if the efficient method is used or not (finding it out via the OS utilities is really cumbersome). As of now I'll add another default setting, SENDFILE_FALLBACK_ALLOWED = True and define content with:

def _get_content(self):
    if not settings.SENDFILE_FALLBACK_ALLOWED:
        raise RuntimeError("Fall-back to serving files via Python not allowed by configuration in HttpResponseSendFile.")
    else:
        # open the file and return its content

This will be just a proof of concept and I abhor adding yet another setting as much as you probably do, but I can't find a better way. Feel free to suggest one.

As this is not urgent, being targeted to 1.2, I can't yet tell when I update the patch.

03/29/09 12:49:36 changed by Uninen

  • cc changed from Graham.Dumpleton@gmail.com, jdunck@gmail.com, django-ticket-2131@codef0x.org, andreas@pelme.se, andrehcampos@gmail.com, gregoire@audacy.fr, gabor@nekomancer.net to Graham.Dumpleton@gmail.com, jdunck@gmail.com, django-ticket-2131@codef0x.org, andreas@pelme.se, andrehcampos@gmail.com, gregoire@audacy.fr, gabor@nekomancer.net, ville@unessa.net.

05/20/09 12:33:03 changed by hackeron

  • cc changed from Graham.Dumpleton@gmail.com, jdunck@gmail.com, django-ticket-2131@codef0x.org, andreas@pelme.se, andrehcampos@gmail.com, gregoire@audacy.fr, gabor@nekomancer.net, ville@unessa.net to Graham.Dumpleton@gmail.com, jdunck@gmail.com, django-ticket-2131@codef0x.org, andreas@pelme.se, andrehcampos@gmail.com, gregoire@audacy.fr, gabor@nekomancer.net, ville@unessa.net, hackeron@gmail.com.

06/03/09 22:02:19 changed by ccahoon

  • cc changed from Graham.Dumpleton@gmail.com, jdunck@gmail.com, django-ticket-2131@codef0x.org, andreas@pelme.se, andrehcampos@gmail.com, gregoire@audacy.fr, gabor@nekomancer.net, ville@unessa.net, hackeron@gmail.com to Graham.Dumpleton@gmail.com, jdunck@gmail.com, django-ticket-2131@codef0x.org, andreas@pelme.se, andrehcampos@gmail.com, gregoire@audacy.fr, gabor@nekomancer.net, ville@unessa.net, hackeron@gmail.com, chris.cahoon@gmail.com.
  • owner changed from alsleme to ccahoon.

06/30/09 21:51:49 changed by ccahoon

(In [11131]) [soc2009/http-wsgi-improvements] Initial HttpResponseSendFile? support, changes pulled from 03/21/09 patch on refs #2131.

This does not pass the included regression tests. However, since this feature will be entirely based on these changes, which have already gone through a great number of iterations, I thought it would be sensible to start here.

All of the work here is ymasuda, mizatservercave, and mrts (apologies if I missed anyone). I hope to take their work down the final stretch.

(follow-up: ↓ 55 ) 07/17/09 21:43:13 changed by ccahoon

I have made changes for HttpResponseSendFile? support, in r11266 and earlier commits. I am still getting used to putting the refs in my commit messages, but when the changes get merged with trunk it will be there. If you are interested, let me know what you think.

(in reply to: ↑ 54 ) 07/17/09 21:46:12 changed by ccahoon

Replying to ccahoon:

I have made changes for HttpResponseSendFile? support, in r11266 and earlier commits. I am still getting used to putting the refs in my commit messages, but when the changes get merged with trunk it will be there. If you are interested, let me know what you think.

Correction: r11268. Also, r11212 and r11210. Sorry about that.

07/23/09 15:17:31 changed by ccahoon

(In [11319]) [soc2009/http-wsgi-improvements] Changes for HttpResponseSendFile? support in FastCGI. Refs #2131.

07/23/09 20:57:44 changed by ccahoon

(In [11320]) [soc2009/http-wsgi-improvements] Fix early settings use in HttpResponseSendFile?. Refs #2131.

07/24/09 12:52:57 changed by ccahoon

(In [11325]) [soc2009/http-wsgi-improvements] Add a how-to for using HttpResponseSendFile? with various server arrangements. Refs #2131.

07/24/09 14:31:15 changed by ccahoon

(In [11326]) [soc2009/http-wsgi-improvements] Remove setting the Content-Length header for HttpResponseSendFile? from the handler, for compatibility, and add a content attribute. Refs #2131.

Also adds a _charset class attribute to HttpResponse so the children all have it.

07/24/09 17:18:41 changed by ccahoon

(In [11327]) [soc2009/http-wsgi-improvements] Change the setting name to SENDFILE_HEADER and update docs. Refs #2131.

07/30/09 17:41:22 changed by ccahoon

  • needs_better_patch deleted.
  • stage changed from Accepted to Fixed on a branch.
  • needs_tests deleted.
  • needs_docs deleted.

07/30/09 17:43:01 changed by ccahoon

Fixed on branches/soc2009/http-wsgi-improvements.

(follow-up: ↓ 64 ) 09/01/09 10:16:11 changed by andreaskundig

  • summary changed from [patch] HttpResponseSendFile for serving static files handler-specific sendfile mechanism to it breaks i18n on django 1.1.

I'm not sure this is the right place to signal a bug, but HttpResponseSendFile? breaks internationalization.

I have a page where the user can change his language as described in the documentation: http://docs.djangoproject.com/en/dev/topics/i18n/#the-set-language-redirect-view

On my page there is a link to a view that sends a pdf with HttpResponseSendFile?. If I click on this link I receive the pdf, but afterwards I no longer can change the language. If I was on an english page and then submit a language change to french, the /i18n/setlang/ view recognizes the correct language code 'fr' and puts it into the session, but once the request arrives at my view request.LANGUAGE_CODE is 'en'.

It works again after I restart Django.

(I am using Django 1.1 with python 2.5.4 on windows)

(in reply to: ↑ 63 ) 09/01/09 10:56:23 changed by andreaskundig

Bypassing the middleware in base.py that breaks i18n. When I comment that out it works again.

09/01/09 11:04:24 changed by kmtracey

  • summary changed from it breaks i18n on django 1.1 to HttpResponseSendFile for serving static files handler-specific sendfile mechanism.

Reverting confusing change to ticket summary.

(follow-up: ↓ 67 ) 09/01/09 19:35:35 changed by ccahoon

Which patch is it that breaks i18n? Are you using the one from the SOC branch?

(in reply to: ↑ 66 ) 09/02/09 01:34:40 changed by andreaskundig

Replying to ccahoon:

Which patch is it that breaks i18n? Are you using the one from the SOC branch?

I'm using the latest patch (added by mrts on 03/21/09): http://code.djangoproject.com/attachment/ticket/2131/httpresponsesendfile-no-default-content_bypass-middleware_with-header_with-docs-and-tests.diff

I don't know what the SOC branch is.

10/09/09 17:56:12 changed by AntonBessonov

  • cc changed from Graham.Dumpleton@gmail.com, jdunck@gmail.com, django-ticket-2131@codef0x.org, andreas@pelme.se, andrehcampos@gmail.com, gregoire@audacy.fr, gabor@nekomancer.net, ville@unessa.net, hackeron@gmail.com, chris.cahoon@gmail.com to Graham.Dumpleton@gmail.com, jdunck@gmail.com, django-ticket-2131@codef0x.org, andreas@pelme.se, andrehcampos@gmail.com, gregoire@audacy.fr, gabor@nekomancer.net, ville@unessa.net, hackeron@gmail.com, chris.cahoon@gmail.com, exelib@gmail.com.

11/19/09 22:42:53 changed by redalastor

  • cc changed from Graham.Dumpleton@gmail.com, jdunck@gmail.com, django-ticket-2131@codef0x.org, andreas@pelme.se, andrehcampos@gmail.com, gregoire@audacy.fr, gabor@nekomancer.net, ville@unessa.net, hackeron@gmail.com, chris.cahoon@gmail.com, exelib@gmail.com to Graham.Dumpleton@gmail.com, jdunck@gmail.com, django-ticket-2131@codef0x.org, andreas@pelme.se, andrehcampos@gmail.com, gregoire@audacy.fr, gabor@nekomancer.net, ville@unessa.net, hackeron@gmail.com, chris.cahoon@gmail.com, exelib@gmail.com, redalastor@gmail.com.

12/03/09 21:31:10 changed by springmeyer

  • cc changed from Graham.Dumpleton@gmail.com, jdunck@gmail.com, django-ticket-2131@codef0x.org, andreas@pelme.se, andrehcampos@gmail.com, gregoire@audacy.fr, gabor@nekomancer.net, ville@unessa.net, hackeron@gmail.com, chris.cahoon@gmail.com, exelib@gmail.com, redalastor@gmail.com to Graham.Dumpleton@gmail.com, jdunck@gmail.com, django-ticket-2131@codef0x.org, andreas@pelme.se, andrehcampos@gmail.com, gregoire@audacy.fr, gabor@nekomancer.net, ville@unessa.net, hackeron@gmail.com, chris.cahoon@gmail.com, exelib@gmail.com, redalastor@gmail.com, dane.springmeyer@gmail.com.

01/06/10 16:16:07 changed by jezdez

  • cc changed from Graham.Dumpleton@gmail.com, jdunck@gmail.com, django-ticket-2131@codef0x.org, andreas@pelme.se, andrehcampos@gmail.com, gregoire@audacy.fr, gabor@nekomancer.net, ville@unessa.net, hackeron@gmail.com, chris.cahoon@gmail.com, exelib@gmail.com, redalastor@gmail.com, dane.springmeyer@gmail.com to Graham.Dumpleton@gmail.com, jdunck@gmail.com, django-ticket-2131@codef0x.org, andreas@pelme.se, andrehcampos@gmail.com, gregoire@audacy.fr, gabor@nekomancer.net, ville@unessa.net, hackeron@gmail.com, chris.cahoon@gmail.com, exelib@gmail.com, redalastor@gmail.com, dane.springmeyer@gmail.com, jezdez.

01/15/10 11:11:18 changed by danros

  • cc changed from Graham.Dumpleton@gmail.com, jdunck@gmail.com, django-ticket-2131@codef0x.org, andreas@pelme.se, andrehcampos@gmail.com, gregoire@audacy.fr, gabor@nekomancer.net, ville@unessa.net, hackeron@gmail.com, chris.cahoon@gmail.com, exelib@gmail.com, redalastor@gmail.com, dane.springmeyer@gmail.com, jezdez to Graham.Dumpleton@gmail.com, jdunck@gmail.com, django-ticket-2131@codef0x.org, andreas@pelme.se, andrehcampos@gmail.com, gregoire@audacy.fr, gabor@nekomancer.net, ville@unessa.net, hackeron@gmail.com, chris.cahoon@gmail.com, exelib@gmail.com, redalastor@gmail.com, dane.springmeyer@gmail.com, jezdez, danros.

01/15/10 11:12:09 changed by danros

  • cc changed from Graham.Dumpleton@gmail.com, jdunck@gmail.com, django-ticket-2131@codef0x.org, andreas@pelme.se, andrehcampos@gmail.com, gregoire@audacy.fr, gabor@nekomancer.net, ville@unessa.net, hackeron@gmail.com, chris.cahoon@gmail.com, exelib@gmail.com, redalastor@gmail.com, dane.springmeyer@gmail.com, jezdez, danros to Graham.Dumpleton@gmail.com, jdunck@gmail.com, django-ticket-2131@codef0x.org, andreas@pelme.se, andrehcampos@gmail.com, gregoire@audacy.fr, gabor@nekomancer.net, ville@unessa.net, hackeron@gmail.com, chris.cahoon@gmail.com, exelib@gmail.com, redalastor@gmail.com, dane.springmeyer@gmail.com, jezdez, dan@danros.org.uk.

01/31/10 06:08:22 changed by danny_adair

  • cc changed from Graham.Dumpleton@gmail.com, jdunck@gmail.com, django-ticket-2131@codef0x.org, andreas@pelme.se, andrehcampos@gmail.com, gregoire@audacy.fr, gabor@nekomancer.net, ville@unessa.net, hackeron@gmail.com, chris.cahoon@gmail.com, exelib@gmail.com, redalastor@gmail.com, dane.springmeyer@gmail.com, jezdez, dan@danros.org.uk to Graham.Dumpleton@gmail.com, jdunck@gmail.com, django-ticket-2131@codef0x.org, andreas@pelme.se, andrehcampos@gmail.com, gregoire@audacy.fr, gabor@nekomancer.net, ville@unessa.net, hackeron@gmail.com, chris.cahoon@gmail.com, exelib@gmail.com, redalastor@gmail.com, dane.springmeyer@gmail.com, jezdez, dan@danros.org.uk, danny.adair@unfold.co.nz.

02/04/10 16:21:29 changed by anonymous

  • cc changed from Graham.Dumpleton@gmail.com, jdunck@gmail.com, django-ticket-2131@codef0x.org, andreas@pelme.se, andrehcampos@gmail.com, gregoire@audacy.fr, gabor@nekomancer.net, ville@unessa.net, hackeron@gmail.com, chris.cahoon@gmail.com, exelib@gmail.com, redalastor@gmail.com, dane.springmeyer@gmail.com, jezdez, dan@danros.org.uk, danny.adair@unfold.co.nz to Graham.Dumpleton@gmail.com, jdunck@gmail.com, django-ticket-2131@codef0x.org, andreas@pelme.se, andrehcampos@gmail.com, gregoire@audacy.fr, gabor@nekomancer.net, ville@unessa.net, hackeron@gmail.com, chris.cahoon@gmail.com, exelib@gmail.com, redalastor@gmail.com, dane.springmeyer@gmail.com, jezdez, dan@danros.org.uk, danny.adair@unfold.co.nz, apollo13.

Add/Change #2131 (HttpResponseSendFile for serving static files handler-specific sendfile mechanism)




Change Properties
Action