Opened 19 years ago
Closed 11 years ago
#2131 closed New feature (wontfix)
HttpResponseSendFile for serving static files handler-specific sendfile mechanism
Reported by: | ymasuda[at]ethercube.com | Owned by: | nobody |
---|---|---|---|
Component: | HTTP handling | Version: | dev |
Severity: | Normal | Keywords: | |
Cc: | Graham.Dumpleton@…, jdunck@…, django-ticket-2131@…, andreas@…, andrehcampos@…, gregoire@…, gabor@…, ville@…, hackeron@…, exelib@…, redalastor@…, dane.springmeyer@…, Jannis Leidel, dan@…, danny.adair@…, Florian Apolloner, Jari Pennanen, hv@…, bronger@…, David Larlet, mmitar@…, jivan@…, darkrho@…, onecreativenerd@…, milos.urbanek@…, andre.cruz@…, sfllaw@… | Triage Stage: | Accepted |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | yes |
Easy pickings: | no | UI/UX: | no |
Description
This patch modifies modpython handler and WSGI (basehttp) handler to add HttpResponseSendFile.
The HttpResponseSendFile allows static files to be served using handler-specific sendfile mechanism.
It uses response.sendfile() on mod_python, FileWrapper on wsgi.
Attachments (13)
Change History (113)
by , 19 years ago
Attachment: | http-response-send-file.diff added |
---|
comment:1 by , 19 years ago
Summary: | HttpResponseSendFile for serving static files handler-specific sendfile mechanism → [patch] HttpResponseSendFile for serving static files handler-specific sendfile mechanism |
---|
follow-up: 5 comment:2 by , 19 years ago
Resolution: | → wontfix |
---|---|
Status: | new → closed |
Django isn't meant to serve static files, so I'm marking this as a wontfix.
comment:3 by , 18 years ago
This patch could be very useful in a situations with a password protected files to a specific users. Typically I am talking about e-commerce site which sells PDF books. I am currently returning data via standard HttpResponse. It works well for small files,
but it would probably be a problem for bigger files.
comment:4 by , 18 years ago
I've implemented a similar workaround to restrict photo images accesses. I think its ridiculous not to implement this feature to aid performance for views.static.serve() as well. HttpResponse is implemented as an iterator so it fits perfectly to read the file chunk by chunk. The performance increase i've seen is great and i'm happy not to need another tcp-server running on my box.
follow-up: 6 comment:5 by , 17 years ago
Replying to adrian:
Django isn't meant to serve static files, so I'm marking this as a wontfix.
But this feature would be useful to reduce memory usage with large dynamically generated content.
In my application, I generate large (30MB) ZIP files in Django and send them through response.write(). The ZIP files are just collections of PNG files, so it doesn't make sense to store them on disk permanently.
Unfortunately, using response.write() means that the whole ZIP file must be loaded into memory. It would be so cool if I could create the file on disk and then use sendfile. The same use-case applies to generated PDF files.
As a workaround, I could write the file to disk and then redirect the browser, but that's tedious, requires another TCP round-trip, and makes automated downloads more difficult.
comment:6 by , 17 years ago
Here's a solution that doesn't require patching Django. It's not the real sendfile(), but it loads and sends the file in 8KB chunks, rather than loading the whole file into memory.
http://www.djangosnippets.org/snippets/365/
It uses the FileWrapper object from source:django/trunk/django/core/servers/basehttp.py.
comment:7 by , 16 years ago
Even using a file-wrapper iterator does not speed up things as much as directly using sendfile. Using mod_wsgi's exposed sendfile uses 30x less CPU vs Django-with-iterator on my linux box testing downloading hundreds of images concurrently.
comment:8 by , 16 years ago
Trac also isn't meant to serve static media files either and preferred setup is to use Apache to do it directly. In Trac however you have the concept of attachments associated with tickets. For such a thing as attachments, where it is still stored in file system, but where Apache cant get to it, you have no choice but to serve it through the application. Any similar sort of scenario in Django is currently not very efficient in terms of performance, or in terms of memory usage if people don't use techniques to limit memory impact. Thus, Django should really provide a default way of handling this efficiently and saying 'wontfix' means that you haven't looked very widely at how Django can be used. :-)
comment:9 by , 16 years ago
Cc: | added |
---|
comment:10 by , 16 years ago
milestone: | → post-1.0 |
---|---|
Resolution: | wontfix |
Status: | closed → reopened |
Triage Stage: | Unreviewed → Design decision needed |
Claiming we "haven't looked very widely" is ridiculous, given what Django can do. More possibly "some cases may have been overlooked" (there are alternatives to the Trac setup). So let's try to avoid hyperbole. :-)
Reopening for now and we can look at it after 1.0.
comment:11 by , 16 years ago
+1 on including this. My use case: instrument Java classfiles, generate a JAR and return it to the user on the fly.
comment:12 by , 16 years ago
Another ticket that covers this issue is #7984, which probably means that its a duplicate. Sorry I didn't spot this ticket when I filed it. The patch on #7984 is very similar to this one :) but has some extra logic related to content-length for mod_wsgi because of some amibguities in the WSGI spec. (see discussion on mod_wsgi group)
comment:14 by , 16 years ago
Cc: | added |
---|
comment:15 by , 16 years ago
Cc: | added |
---|
comment:18 by , 16 years ago
Cc: | added |
---|
by , 16 years ago
Attachment: | django-ticket-2131-http-response-send-file.2-django1.0.2final.diff added |
---|
updated for Django 1.0.2
comment:19 by , 16 years ago
Needs tests: | set |
---|
I have added an updated patch that SEEMS to work on Django 1.0.2. I had an earlier revision working on Django 1.0.0, but have only updated my local copy to 1.0.2 tonight.
Extensive testing should be done. Right now I have only tested WSGI (via Django's manage.py runserver).
The exclusion/removal of the 'Content-Length' header has been added to force WSGI to recalculate the content length from the file length, where prior to that point in the code something previously tends to add 'Content-Length' = 0, which is probably not wise to mess with "at the source".
The additional content_type == None check was added for Windows compatibility -- Google Chrome would not download a file that ultimately resulted in "Content-Type: None", so I chose a generic 'reasonable' default. See also http://bugs.python.org/issue4969 -- but Python 2.5 without that fix is too commonly deployed, at least in my environment. It's easier to do a 2-line fix where it's needed, as-needed.
Other than that, the patch is essentially the same as the original poster's.
by , 16 years ago
Attachment: | django-ticket-2131-http-response-send-file.2-django1.0.2final-v1.1.diff added |
---|
Supersedes previous patches for Django 1.0.2 final.
comment:20 by , 16 years ago
New patch uploaded. Fixes mod_python support.
This patch is now used on a production website and shows phenomenal performance.
The production use-case for this patch is a "members-only" upload/download area which requires extensive permissions checking and custom auditing. This patch makes it possible to do all the logic needed while remaining completely within the Django framework, and without sacrificing performance while either uploading or downloading.
comment:21 by , 16 years ago
This looks like a nice patch to me, and hearing that it works in production is good to know. Still needs a few things, though:
- The current setup requires cooperation between the
HttpResponseSendFile
object and the handler. This means it'll break ifHttpResponseSendFile
is used with a custom handler that doesn't know about it; that's a potential backwards-incompatibility, and we need to avoid it. If possible,HttpResponseSendFile
needs to fake being a regularHttpResponse
, probably by providing acontent
property which just reads the file. - Documentation.
- Tests.
comment:23 by , 16 years ago
Looked in to this a while back (I created the duplicate ticket #7894 with similar patch), but found that in production using the WSGI file_wrapper approach can be problematic. I found that serving PDFs to an adobe reader client failed for documents above around 25MB in size because reader required the web server to satisfy byte range requests which didn't work with WSGI. I ended up using the X-SendFile header and mod_xsendfile (on apache) which didn't have these problems. I'm not certain whether serving bytes range is inherently impossible with WSGI but requests to apache with mod_wsgi didn't satisfy them for the WSGI content.
Also you might like to make the same change that the patch in #7894 made to the 'django.views.static.serve' function so the django development server can use this speedup.
comment:24 by , 16 years ago
Implementing additional HTTP features (HTTP 206 PARTIAL CONTENT) directly in the core is something I suspect the core maintainers want to avoid, on grounds that Django is an application framework and *not* a webserver.
I think you bring up a good point, though. This patch serves a niche interest instead of a general purpose one. That is, instead of this patch being able to fully support all HTTP access methods (such as 206 PARTIAL CONTENT), it is useful in only very certain circumstances:
- When using a handler with an 'efficient mechanism' for 'sending files' AND which knows about HttpResponseSendFile as implemented by this patch, AND
- HttpResponseSendFile, as-implemented, is only suitable for sending whole files.
All the fallback methods revert to 'pure python iterators' in some manner or form. I spent some time bouncing between mod_python, the WSGI spec, and mod_python, trying to decide whether or not trying to wedge a generator into the works (for efficiency) was worthwhile overall. I don't think it is.
It appears that HttpResponseSendFile is only efficient when both the above preconditions are met. Anything which would cause any fallback method (iterators) to be used will DECREASE application performance, but probably not more than doing a raw "HttpResponse(file.read())". Any more 'advanced functionality' or efficiency requirements and you would be well advised to defer the functionality to the webserver, such as those servers which support X-Sendfile natively or by extension.
The goal of promoting this patch is to provide basic functionality just beyond the realm of python iterators for file-reading/file-sending (when preconditions for efficiency are met), while remaining entirely contained within Django -- without *additional* external dependencies, beyond normal deployment against mod_wsgi or mod_python. I do think the implementation useful, however. But it is certainly not "feature rich".
In regards to WSGI's support of all this, it appears that WSGI defers partial content delivery responsibility to the WSGI application [eg, Django or applications written with Django]. In other words, the WSGI spec does not appear support partial content directly. Reading a file and/or chopping it up to feed to the WSGI interface would require pure-python with (presumably) iterators/generators (or even tempfiles with the 'cut-out' content), which would entirely defeat the purpose of this patch.
Mod_python appears at first glance to expose sendfile 'directly' which would seem to support partial content easily, but there seems to be no straightforward way to be efficient across anything BUT Mod_python in this case, and is probably the subject of an additional patch to implement, say, HttpResponseResumableFile().
In short: This patch (unfortunately) serves only a small audience. The larger audience can probably be served with X-Sendfile.
[Kudo's for the reference. I didn't know X-Sendfile even existed 'til you posted, Graham. :)]
Advantages:
- Provides amazingly good results, when used to send whole files across a mutually-supported interface
Disadvantages:
- Introduces response-specific code into the handlers, which would ultimately require maintenance
- All fallback methods are "not efficient" simply by virtue that "everything is done in pure python without a speedy extension, like mod_python or mod_wsgi provides". Large files consume a lot of time and a lot of RAM -- although attempting to serve those types of files through Django before this patch suffers from the same issues of slowness and lots of RAM consumed.
- response.content is one of these inefficient methods
Additional Considerations:
- As this is a 'send whole file' mechanism, should a 'Content-Disposition: attachment; filename=' header be added? Should the filename come from the HttpResponseSendFile constructor so the application can name it what it likes? [Personally, I think so, on both counts.]
by , 16 years ago
PARTIAL patch (FOR TESTING ONLY) to verify HttpResponseSendFile works without specific handler support
by , 16 years ago
Attachment: | django-ticket-2131-http-response-send-file.2-django1.0.2final-v1.2.diff added |
---|
Updated patch. Adds support for 'response.content'
follow-up: 34 comment:25 by , 16 years ago
Regarding the 'v1.2' version of the patch I just uploaded, note that for large files when accessed through response.content loads the whole thing up into the webserver's RAM before serving. A good test example is OpenOffice.org 3's 143MB installer. Bad for small (eg, 256MB RAM) webservers.
RAM also seems to 'hang around' after each request, so I suspect that .close() isn't being called properly on self._container, or _get_content needs to add a signal-hook to signals.request_finished to expire the content from RAM by deleting the variable or setting it to or somesuch.
That being said, accessing request.content is a compatibility property, and so shouldn't be used if at all possible.
The below test-case works for a raw, manual test, however, in addition to working with the "_test" patch above:
from django.http import HttpResponseSendFile
r = HttpResponseSendFile('./bigfile')
t = open('/tmp/test', 'wb')
t.write(r.content)
t.close()
# md5 and filesize of ./bigfile and /tmp/test match.
comment:26 by , 16 years ago
Reformatting the example since formatting got munged in the previous update.
from django.http import HttpResponseSendFile r = HttpResponseSendFile('./bigfile') t = open('/tmp/test', 'wb') t.write(r.content) t.close() # md5 and filesize of ./bigfile and /tmp/test match.
follow-up: 87 comment:27 by , 16 years ago
Looks good. I'd like to see support for X-SendFile
baked in. This should be too hard to do: instead of storing the file name in self.sendfile_filename
, store it in an X-SendFile
header. Then we just need a small piece of middleware -- or, perhaps, something added to the common middleware? -- that responds to a setting (SERVER_SUPPORTS_X_SENDFILE
or somesuch) and returns the response with the header and an empty body.
comment:28 by , 16 years ago
milestone: | → 1.1 beta |
---|---|
Triage Stage: | Design decision needed → Accepted |
comment:29 by , 16 years ago
X-Sendfile, although understood by some servers/proxies isn't the only header used for this purpose. The perbal package has something called X-Reproxy-File which I believe acts the same way. The perbal package also has X-Reproxy-URL, which instead of serving up a file, reinjects the URL as a request into the request handling system. This means the request gets pushed back to a backend system to serve. Thus it might get bumped off to a static media server which then serves up the matching file. The request may also be handled by a dynamic web application. In nginx, there is a similar thing to X-Reproxy-URL called X-Accel-Redirect. Finally both these are in some respects similar to the CGI internal redirection using Location header and 200 response which Apache CGI supports and which other servers provding CGI should also support. This CGI mechanism of using Location header will also supported for mod_wsgi daemon mode in mod_wsgi 3.0.
Have been looking at X-Sendfile support for mod_wsgi daemon mode in mod_wsgi 3.0, but the problem with it is that it side steps standard Apache permissions checks on files. Yes, an application could just read the file itself and return it anyway, but the difference is that in daemon mode the user can be different to that Apache runs as, thus X-Sendfile would be handled as the Apache user and not the daemon process user. Thus access may not be available to the file, or access may be available to files that the daemon process shouldn't otherwise have access to. This is why mod_wsgi 3.0 at this point will only support internal URL redirection via Location header as per CGI specification. Namely, the internal redirection has all the Apache permission checks applied.
The only issue with things like X-Accel-Redirect and CGI Location redirection is that the URL has to be mapped to a file and the presence of a mapping would normally mean an outside party could access the resource directly. In nginx they get around this through flagging the URL as private and so only a candidate for use by internal redirects. In the case of Apache, you would need to use mod_rewrite to enforce a condition that URL is forbidden if not a sub request made within the server itself.
Anyway, it may be felt that this is all irrelevant, but want to at least point out that X-Sendfile isn't the only header name used and so you may want to make it configurable. The URL redirection mechanisms available might still also be employed if header can be overridden, but may be a bit tricky here if code can force the file to be loaded inside Django application before response gets sent, with it turning into a byte stream instead. The problem here is that for X-Reprox-URL, it may actually refer to something on a different host if they allow a 'http://host' component in the URL, thus can't sensibly open it as a file. For CGI Location header, no 'http://host', but it is still a logical URL and can't be opened as a file. But then, if X-Sendfile or X-Reproxy-File are used and the front end is on a different system, the Django application may not even be able to open the path anyway as file system view could be different.
So, in respect of the latter, with the X-Sendfile approach, is it always the case that it always makes it out as a X-Sendfile response, or can the Django application, because of some sort of output filter or similar, force the path to be resolved in the application with data read at that point.
Oh and finally, for X-Sendfile, does the response sent always have a 200 HTTP status code? I know that is what CGI Location wants, but not sure about all the others. If the response codes aren't consistent, then even if can override header name, may not work for all.
comment:30 by , 16 years ago
BTW, mod_wsgi 2.0 through 2.3 has an issue that can occur under certain conditions, with a file returned using wsgi.file_wrapper being truncated when size is > 255 and < ~8000 bytes. This is due to an obscure Apache optimisation. The issue will be fixed in mod_wsgi 2.4. Details, with a description of fix, can be found at:
One also should not use mod_wsgi 2.0 through 2.1, which can also truncate a file returned using wsgi.file_wrapper if it contains an embedded null character.
comment:31 by , 16 years ago
Cc: | added |
---|
The patch does not work with GZIP middleware.
Using mod_wsgi, I get this error: "IOError: client connection closed"
comment:32 by , 16 years ago
Needs documentation: | set |
---|---|
Patch needs improvement: | set |
Questions:
- Why to you first do an expensive call to
getsize
inself['Content-Length'] = os.path.getsize(path_to_file)
and later discard it inif hasattr(response, 'sendfile_filename') and response.has_header('Content-Length'): del response['Content-Length']
in WSGI handler? I didn't find anything in PEP 333 that would suggest this is needed. Please correct me or remove that bit. - Although this is mostly a conceptual/taste matter, wouldn't it be more clear to use
isinstance(response, HttpResponseSendFile)
instead ofhasattr(response, 'sendfile_filename')
to discernHttpResponseSendFile
objects?
Nitpicking HttpResponseSendFile.__init__
:
- Comparisons to singletons like None should always be done with 'is' or 'is not', never the equality operators. -- PEP 8, thus
content_type == None
should becontent_type is None
instead. - Opening the file needs to be consistent. You open it in the constructor and later reopen again in handlers, so the file is opened twice.
- The logic of parent constructor should be reused, I suggest the following instead:
class HttpResponseSendFile(HttpResponse): def __init__(self, path_to_file, content_type=None, blocksize=8192): if not content_type: # or even if content_type is None from mimetypes import guess_type content_type = guess_type(path_to_file)[0] if content_type is None: # probably windows w/o proper mimetype lookup content_type = "application/octet-stream" container = open(path_to_file) # be consistent, either open() here and don't reopen in handlers or don't open() here super(HttpResponseSendFile, self).__init__(content=container, content_type=content_type) self.sendfile_filename = path_to_file self.block_size = blocksize # compatibility; fake being a regular HttpResponse if needed self._headers['content-length'] = ('Content-Length', os.path.getsize(path_to_file))
Looks like more thought has to be given to the general context (basehttp.py etc). Docs and tests are missing as well.
Otherwise, nice work!
comment:33 by , 16 years ago
As far as I can see, HttpResponseSendFile
should not mimic HttpResponse
in the sense that it should always be specially handled. Ordinary HttpResponse
can take a file object in the content
parameter already and serve the file from within Python, so that behaviour should be explicitly disallowed in HttpResponseSendFile
.
Attaching a patch that does exactly this, i.e. doesn't open()
the file and passes None
as content
to parent constructor to make it explicit that the content is neither a string nor can be handled with the default behaviour. I'm not sure about the Content-Length header though, I've currently added it to be compatible with ServerHandler.cleanup_headers()
, but it may need more thought.
Btw, how mod_python
handles req.sendfile
can be seen here: https://svn.apache.org/repos/asf/quetzalcoatl/mod_python/trunk/src/requestobject.c .
by , 16 years ago
Attachment: | httpresponsesendfile-no-default-content.diff added |
---|
Disallow default HttpResponse behaviour (content etc)
comment:34 by , 16 years ago
RAM also seems to 'hang around' after each request, so I suspect that .close() isn't being called properly on self._container, or _get_content needs to add a signal-hook to signals.request_finished to expire the content from RAM by deleting the variable or setting it to or somesuch.
There's a slight possibility that the cause may be return iter(lambda: filelike.read(block_size), '')
that was used in the previous patch if. There's no close()
in the resulting iterator. I used the FileWrapper
from basehttp.py instead, please try if this changes things.
comment:35 by , 16 years ago
As for middleware, they can explicitly bypass HttpResponseSendFile
responses as follows:
def process_response(self, request, response): if isinstance(response, HttpResponseSendFile): return response
or the handlers can bypass HttpResponseSendFile
implicitly in __call__
:
if not isinstance(response, HttpResponseSendFile): for middleware_method in self._response_middleware: response = middleware_method(request, response)
comment:36 by , 16 years ago
mod_wsgi
file_wrapper
handling can be seen here: http://code.google.com/p/modwsgi/source/browse/trunk/mod_wsgi/mod_wsgi.c#3146
by , 16 years ago
Attachment: | httpresponsesendfile-no-default-content_bypass-middleware.diff added |
---|
Bypass middleware
by , 16 years ago
Attachment: | httpresponsesendfile-no-default-content_bypass-middleware.1.diff added |
---|
Fix method naming typo
by , 16 years ago
Attachment: | httpresponsesendfile-no-default-content_bypass-middleware.2.diff added |
---|
Doh, previous diff didn't contain the fix, second try.
by , 16 years ago
Attachment: | httpresponsesendfile-no-default-content_bypass-middleware.3.diff added |
---|
Fix yet another copy-paste omission.
comment:37 by , 16 years ago
Cc: | added |
---|
Can someone indicate whether the intention is, or have these changes supported X-Sendfile as mentioned? I looked through some of the multitude of diffs, but couldn't see it mentioned.
follow-up: 41 comment:38 by , 16 years ago
Graham: Yes, I'd expect to see x-sendfile support before I check this in.
mrts: I explicitly asked asked that HttpResponseSendFile
mimic standard HttpResponse
semantics. This is a backwards-compatibility issue: writing custom handlers is an advanced but support feature.
comment:39 by , 16 years ago
Jacob: Okay. Only asked as wanted to make sure that the name of header used was going to be somehow configurable to allow for those header names used by other things like Apache/nginx and perlbal and not be X-Sendfile specific. Thanks.
comment:41 by , 16 years ago
Replying to jacob:
mrts: I explicitly asked asked that
HttpResponseSendFile
mimic standardHttpResponse
semantics. This is a backwards-compatibility issue: writing custom handlers is an advanced but support feature.
OK, I'll try to look into it in the evening and provide something for X-Sendfile
as well.
comment:42 by , 16 years ago
(It was me posting the last comment)
Note that I don't like the HttpResponse
compatible solution.
Concerns:
open()
in__init__()
will remain messy, if only because ofmod_python
that doesn't accept a file object insendfile()
; bluntly consuming two file descriptors for one response even though one isn't needed does not look sane to me,- we can
open()
lazily, but then we have to introduce state tracking (file_is_open
), resulting in more complex and less robust code.
Is it wise to go to such lengths to cater for minor backwards-compatibility concerns? Jacob, which approach do you prefer?
follow-up: 44 comment:43 by , 16 years ago
Neither of those is necessary. You just need to make content
into a property which lazily opens and reads the file.
comment:44 by , 16 years ago
Replying to jacob:
Neither of those is necessary. You just need to make
content
into a property which lazily opens and reads the file.
That's what I meant by "lazily". However, state tracking is still required as the file has to be explicitly closed in HttpResponseSendFile.close(). This can not be done in content
due to __iter__
. And the file may not be open()
at all if content
is never touched. Thus, state tracking seems unavoidable.
comment:45 by , 16 years ago
Cc: | added |
---|
by , 16 years ago
Attachment: | httpresponsesendfile-no-default-content_bypass-middleware_with-header_with-docs-and-tests.diff added |
---|
Added the requested header, as well as docs and tests.
comment:46 by , 16 years ago
Added the configurable X-Sendfile
header and preliminary docs and tests -- mainly to ease work for others who like to drive this further.
However, as of now the patch is still incomplete. I personally don't like providing any content for the reasons elaborated in http://groups.google.com/group/django-developers/browse_thread/thread/82367cf18eedfe89. I'm not the only one, see e.g. http://john.guen.in/past/2007/4/17/send_files_faster_with_xsendfile/.
Sadly, it looks like this will probably not make it into 1.1 as I likely will not have time for this in the weekend.
comment:47 by , 16 years ago
Another issue: the disposition header should perhaps be configurable with a constructor argument (inline
is needed to display files directly in browser).
follow-up: 49 comment:48 by , 16 years ago
milestone: | 1.1 beta → 1.2 |
---|
Still too many edge-cases not being accounted for here (basically handling all the stuff Graham mentioned for a start). It's not ready for 1.1.
mrts: The final solution will have to fall-back gracefully if the user's webserver doesn't have any direct sendfile support, since otherwise you can't use this feature in a redistributable application (the application writer doesn't know how it will be installed). The client browser shouldn't ever receive an empty file. That's what Jacob meant by backwards-compatible, he just used an ambiguous phrase.
comment:49 by , 16 years ago
Replying to mtredinnick:
mrts: The final solution will have to fall-back gracefully if the user's webserver doesn't have any direct sendfile support, since otherwise you can't use this feature in a redistributable application (the application writer doesn't know how it will be installed). The client browser shouldn't ever receive an empty file. That's what Jacob meant by backwards-compatible, he just used an ambiguous phrase.
The point is valid. However, looking at it from deployer's viewpoint I would always want to know if the efficient method is used or not (finding it out via the OS utilities is really cumbersome). As of now I'll add another default setting, SENDFILE_FALLBACK_ALLOWED = True
and define content with:
def _get_content(self): if not settings.SENDFILE_FALLBACK_ALLOWED: raise RuntimeError("Fall-back to serving files via Python not allowed by configuration in HttpResponseSendFile.") else: # open the file and return its content
This will be just a proof of concept and I abhor adding yet another setting as much as you probably do, but I can't find a better way. Feel free to suggest one.
As this is not urgent, being targeted to 1.2, I can't yet tell when I update the patch.
comment:50 by , 16 years ago
Cc: | added |
---|
comment:51 by , 16 years ago
Cc: | added |
---|
comment:52 by , 16 years ago
Cc: | added |
---|---|
Owner: | changed from | to
comment:53 by , 16 years ago
(In [11131]) [soc2009/http-wsgi-improvements] Initial HttpResponseSendFile support, changes pulled from 03/21/09 patch on refs #2131.
This does not pass the included regression tests. However, since this feature
will be entirely based on these changes, which have already gone through a
great number of iterations, I thought it would be sensible to start here.
All of the work here is ymasuda, mizatservercave, and mrts (apologies if I
missed anyone). I hope to take their work
down the final stretch.
follow-up: 55 comment:54 by , 16 years ago
I have made changes for HttpResponseSendFile support, in r11266 and earlier commits. I am still getting used to putting the refs in my commit messages, but when the changes get merged with trunk it will be there. If you are interested, let me know what you think.
comment:55 by , 16 years ago
Replying to ccahoon:
I have made changes for HttpResponseSendFile support, in r11266 and earlier commits. I am still getting used to putting the refs in my commit messages, but when the changes get merged with trunk it will be there. If you are interested, let me know what you think.
Correction: r11268. Also, r11212 and r11210. Sorry about that.
comment:56 by , 16 years ago
comment:57 by , 16 years ago
comment:58 by , 16 years ago
comment:59 by , 16 years ago
comment:60 by , 16 years ago
comment:61 by , 15 years ago
Needs documentation: | unset |
---|---|
Needs tests: | unset |
Patch needs improvement: | unset |
Triage Stage: | Accepted → Fixed on a branch |
follow-up: 64 comment:63 by , 15 years ago
Summary: | [patch] HttpResponseSendFile for serving static files handler-specific sendfile mechanism → it breaks i18n on django 1.1 |
---|
I'm not sure this is the right place to signal a bug, but HttpResponseSendFile breaks internationalization.
I have a page where the user can change his language as described in the documentation:
http://docs.djangoproject.com/en/dev/topics/i18n/#the-set-language-redirect-view
On my page there is a link to a view that sends a pdf with HttpResponseSendFile. If I click on this link I receive the pdf, but afterwards I no longer can change the language. If I was on an english page and then submit a language change to french, the /i18n/setlang/ view recognizes the correct language code 'fr' and puts it into the session, but once the request arrives at my view request.LANGUAGE_CODE is 'en'.
It works again after I restart Django.
(I am using Django 1.1 with python 2.5.4 on windows)
comment:64 by , 15 years ago
Bypassing the middleware in base.py that breaks i18n. When I comment that out it works again.
comment:65 by , 15 years ago
Summary: | it breaks i18n on django 1.1 → HttpResponseSendFile for serving static files handler-specific sendfile mechanism |
---|
Reverting confusing change to ticket summary.
follow-up: 67 comment:66 by , 15 years ago
Which patch is it that breaks i18n? Are you using the one from the SOC branch?
comment:67 by , 15 years ago
Replying to ccahoon:
Which patch is it that breaks i18n? Are you using the one from the SOC branch?
I'm using the latest patch (added by mrts on 03/21/09):
http://code.djangoproject.com/attachment/ticket/2131/httpresponsesendfile-no-default-content_bypass-middleware_with-header_with-docs-and-tests.diff
I don't know what the SOC branch is.
comment:68 by , 15 years ago
Cc: | added |
---|
comment:69 by , 15 years ago
Cc: | added |
---|
comment:70 by , 15 years ago
Cc: | added |
---|
comment:71 by , 15 years ago
Cc: | added |
---|
comment:72 by , 15 years ago
Cc: | added |
---|
comment:73 by , 15 years ago
Cc: | added; removed |
---|
comment:74 by , 15 years ago
Cc: | added |
---|
comment:75 by , 15 years ago
Cc: | added |
---|
comment:76 by , 15 years ago
Cc: | added |
---|
comment:77 by , 15 years ago
Note that you don't need this fix to use efficient file serving from files accessible from the filesystem.
Just use an empty response and set the X-Sendfile
header manually if using Apache. If not, check your server documentation for the right header name.
E.g.:
... response = HttpResponse() response['X-Sendfile'] = '/full/path/to/file' response['Content-Type'] = 'some/content_type' response['Content-Length'] = os.stat('/full/path/to/file').st_size response['Content-Disposition'] = 'attachment; filename="foo"' return response
comment:78 by , 15 years ago
Cc: | added |
---|
comment:79 by , 15 years ago
milestone: | 1.2 |
---|
1.2 is feature-frozen; bumping this feature request off the milestone.
comment:80 by , 15 years ago
Cc: | added |
---|
comment:81 by , 15 years ago
Cc: | added |
---|
comment:82 by , 15 years ago
Cc: | added |
---|
comment:83 by , 15 years ago
Cc: | added |
---|
comment:84 by , 14 years ago
Cc: | added |
---|
comment:85 by , 14 years ago
Patch needs improvement: | set |
---|---|
Triage Stage: | Fixed on a branch → Accepted |
I'm going to put the stage back to Accepted, since the branch this is fixed on is rather stale. It'd be good to see a stand-alone patch here updated with any changes made in the branch if someone wants a task.
comment:86 by , 14 years ago
Cc: | added |
---|
comment:87 by , 14 years ago
Replying to jacob:
Looks good. I'd like to see support for
X-SendFile
baked in. This should be too hard to do: instead of storing the file name inself.sendfile_filename
, store it in anX-SendFile
header. Then we just need a small piece of middleware -- or, perhaps, something added to the common middleware? -- that responds to a setting (SERVER_SUPPORTS_X_SENDFILE
or somesuch) and returns the response with the header and an empty body.
Testing the current patch, the "baked in" x-sendfile support breaks the feature. I see both wsgi and x-sendfile messages in the log, for the same file. I suspect something like the following: due to the header, apache discards the returned content (the file wsgi is supposed to be serving), and tries to serve the file itself via mod_xsendfile. In my config the wsgi daemon has permission to read the file, but the "apache" user does not, so xsendfile fails. If your user "apache" has permission to read the file, you probably won't see the bug, but behind the scenes the file is being read twice (discarded the first time).
Disabling mod_xsendfile allows it to work. It's worth reiterating that x-sendfile does not work if the main apache user does not have permission to read the file. A solution is sorely needed for serving files in the context of wsgi.
comment:88 by , 14 years ago
Severity: | normal → Normal |
---|---|
Type: | enhancement → New feature |
comment:89 by , 13 years ago
Easy pickings: | unset |
---|---|
UI/UX: | unset |
SendFile support
in django would be nice. But I am happy this now: https://github.com/johnsensible/django-sendfile
This is a wrapper around web-server specific methods for sending files to web clients. This is useful when Django needs to check permissions associated files, but does not want to serve the actual bytes of the file itself. i.e. as serving large files is not what Django is made for.
comment:90 by , 13 years ago
Resolution: | → wontfix |
---|---|
Status: | reopened → closed |
I asked on django-dev about this ticket, there was only one reply. But the core developers don't care.
I close this ticket. You have got to use the external django-sendfile project.
comment:91 by , 13 years ago
Component: | Core (Other) → HTTP handling |
---|---|
Needs documentation: | set |
Resolution: | wontfix |
Status: | closed → reopened |
A core developer clearly has marked this ticket as accepted above and non of the core devs have argued against it on the mailing list. So in other words, this just needs someone to spear head the porting of the changes made to the soc2009/http-wsgi-improvements branch.
by , 13 years ago
Attachment: | ticket2131.diff added |
---|
updated patch compatible with Django-1.4b1 including test case
comment:92 by , 13 years ago
Cc: | added |
---|---|
Needs documentation: | unset |
comment:93 by , 13 years ago
Cc: | added |
---|
So, now that there is an updated patch and some core developers interested, can we expect this patch to be included in the next Django version?
comment:94 by , 12 years ago
Owner: | changed from | to
---|---|
Status: | reopened → new |
comment:95 by , 12 years ago
I see people struggling with private static files all the time, and I've implemented several variants of this myself. This feature is needed.
However, I'm not sure that merging django-sendfile
as a contrib app is the best option. There've been several discussions about endorsing third-party applications covering common needs. If we started doing that, we could just point to https://github.com/johnsensible/django-sendfile in the documentation.
EDIT: Alex noticed some dubious code. We can't endorse that.
NB: since #7581 was fixed, serving files through django.views.static.serve
or with a StreamingHttpResponse
is less inefficient (although still not recommended): the entire file content isn't loaded in memory.
NB: the latest patch won't work with nginx because X-Accel-Redirect
requires a relative path. Also it still contains changes for mod_python
which was removed from master.
comment:96 by , 12 years ago
Owner: | changed from | to
---|
comment:97 by , 12 years ago
Just released http://pypi.python.org/pypi/django-downloadview, which is a third-party tool that provides class-based download views. Support of nginx's X-Accel is implemented with a middleware (global) or with decorators (per-view configuration).
I didn't know django-sendfile before I discovered this ticket... I'm to look at it ;)
comment:98 by , 12 years ago
Cc: | added |
---|
comment:99 by , 12 years ago
Cc: | removed |
---|
comment:100 by , 11 years ago
Just released an update of django-downloadview, and came back here to check the status of the ticket...
One year ago, aaugustin said:
I'm not sure that merging django-sendfile as a contrib app is the best option. There've been several discussions about endorsing third-party applications covering common needs. If we started doing that, we could just point to https://github.com/johnsensible/django-sendfile in the documentation.
But this ticket is still tagged as "accepted"... is it still valid? I mean, would such a feature be included in Django's core one day?
In the meantime, we keep on developing third-party projects (by the way, I discovered a new one called "django-transfer" recently).
If such a feature is to be added to Django's core, may you consider django-dowloadview's mechanisms?
- Generic views cover commons patterns, depending on the file (FileField in Model, Storage, path, URL, text, StringIO, generator...)
- Generic views and mixins are easy to override to implement specific use cases
- Views return some "FileStreamingResponse" (StreamingHttpResponse subclass)
- FileStreamingResponse carries a file wrapper
- File wrappers describe files. FieldFile, File, ContentFile are valid file wrappers. Some other wrappers may be added to support storages, URLs or generators.
- Response middlewares (or decorators) are given the opportunity to capture the FileStreamingResponse and convert them to some optimized response (Sendfile, XAccel, ...) before file content is read from disk/network or loaded into memory.
- Test utilities help developers check responses or views.
See http://django-downloadview.readthedocs.org/en/1.3/overview.html for details.
comment:101 by , 11 years ago
Resolution: | → wontfix |
---|---|
Status: | new → closed |
Benoît, thanks for this update. The scope covered by django-downloadview is far from trivial. Thinking about this again one year later, I believe it's better left to third party apps.
This ticket has grown long over the years and it's very hard to tell what it's proposing exactly after all these discussions. I'm going to close it.
If some simple changes to Django would make the implementation of django-downloadview easier, let us know, either on the django-developers mailing list or in a new ticket.
If someone wants to propose a simple and straightforward API to handle the common case in Django, please send a concrete proposal to django-developers. (This could stifle third-party apps.)
diff implements HttpResponseSendFile