Opened 11 months ago

Last modified 6 weeks ago

#29069 new Bug

Static file serving does not call request_finished signal

Reported by: André Cruz Owned by: nobody
Component: HTTP handling Version: 1.11
Severity: Normal Keywords: streamingresponse request_finished
Cc: Tom Forbes, Matt Hoskins, Herbert Fortes, Collin Anderson Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

I'm using the method described here for serving uploaded files in development:

https://docs.djangoproject.com/en/2.0/howto/static-files/#serving-files-uploaded-by-a-user-during-development

However I've found that requests that hit these URLs emit the request_started signal but not the request_finished signal. Supposedly this signal would be sent after the content was sent but it is never sent. I've tried both with runserver and the latest uWSGI.

This is a problem when we are using a connection pool since Django uses this signal to cleanup db connections (and thus return them to the pool), so we have a connection leak.

Change History (12)

comment:1 Changed 11 months ago by Tim Graham

Triage Stage: UnreviewedAccepted

I can reproduce this but I'm not sure if it's a problem in Django or elsewhere. I guess the WSGI server doesn't call close() on the response (which sends the request_finished signal). If it's some sort of limitation rather than a bug, we might document it.

comment:2 Changed 11 months ago by Tom Forbes

Hmm, interesting. It seems the issue is we convert our Django FileResponse to a wsgiref.util.FileWrapper object here: https://github.com/django/django/blob/d7b2aa24f75434c2ce50100cfef3586071e0747a/django/core/handlers/wsgi.py#L151

So when the wsgi server calls close() on it it doesn't do anything Django specific like send a signal.

comment:3 Changed 10 months ago by Tom Forbes

I started to see if this was possible to fix, and I think it's pretty easy: We just need to set the wsgi_file_wrapper attribute to be the FileResponse class. It's compatible with the wsgiref default one.

Possible PR: https://github.com/django/django/pull/9659

It seems suspiciously easy to fix, there could be some unforeseen consequences here. If Django has never sent a request_finished signal for static files, do we want to now? We could just document it, I mean in production these signals would not be sent anyway.

comment:4 Changed 10 months ago by Tom Forbes

Cc: Tom Forbes added

comment:5 Changed 9 months ago by Matt Hoskins

I just hit this as well - on my dev server, with quite a few test instances running under runserver, I hit a postgres server connection limit. I noticed that if I did multiple static file requests in a row (without accessing a normal django view) I would get a new idle connection left after each static connection. If I then did a normal django view access it would seem to clear out most of those idle connections.

In trying to figure out what was going on I also discovered (like Tom) that the issue was that request_finished wasn't being triggered for static file serving because of the reasons set out by Tom (wsgiref.util.FileWrapper being returned) and thus django.db.close_old_connections isn't being triggered at the end of handling a static file request.

(I'm not sure why the normal django view access clears up the idle connections - perhaps it's just as a result of some python garbage collection going off which isn't otherwise happening just when static file requests are being processed).

The suggested PR only helps when using the django provided http server. If you use a third party web server that provides wsgi.file_wrapper then you'll still have the issue that for FileResponse responses they'll get converted to file_wrapper from that third party web server and thus the standard django response close handling won't be called.

One solution is of course to not do that conversion to wsgi.file_wrapper at all - however the downside of that is of course that you're not taking advantage of the web server's offered performance option of wsgi.file_wrapper. Would it be a workable option to wrap response.file_to_stream to pass into wsgi.file_wrapper in a way that means the standard django response close handling can be hooked (the PEP on wsgi.file_wrapper does talk about the requirement for close methods on the passed object and its iterable)? If it's not viable, then it may just need to be the case that when WSGIHandler makes use of wsgi.file_wrapper it at least then calls django.db.close_old_connections after doing the wrapping to try clear up connections.

comment:6 Changed 9 months ago by Matt Hoskins

Cc: Matt Hoskins added

comment:7 Changed 9 months ago by Tom Forbes

Has patch: set

comment:8 Changed 8 months ago by Tom Forbes

Has patch: unset

comment:9 Changed 4 months ago by Tom Forbes

I had another experiment today. As far as I can see we can pick only one:

  1. Support sendfile and other efficient means of sending static files
  2. Have the response ended signal fire

The WSGI spec [has a section on this](https://www.python.org/dev/peps/pep-0333/#optional-platform-specific-file-handling). I'll try and summarize what I have found:

We must give an iterable to wsgi.file_wrapper, which can optionally be a file like object. We currently pass the underlying file-like object given to the FileResponse, which is fine, but when it's close() method is called (or it finishes iterating) then the FileResponse.close() is not called. The wsgi.file_wrapper can try and detect if the file is a real file and if so use more efficient streaming methods.

If we pass the entire FileResponse to wsgi.file_wrapper then we do not get fast streaming because the response is not an actual file, but the close() method is called

If we give the underlying user-controlled value the FileResponse is given, we may get streaming but will not get the FileResponse.close() called.

I can see two ways to hack around this:

  1. Return a mocked instance that wraps the user-supplied value, with a close() method that calls FileResponse.close(). This is tricky because it *cannot* have attributes or methods like filenum() if the *underlying* value does not, because hasattr() will be True and things will break. The spec gives an example of such a class (grep for class FileWrapper:), but... there are numerous issues with that.
  1. Somehow inject our own logic into the values close() method and pass that to wsgi.wrapper...
old_close = value.close
function fake_close(*args, **kwargs):
    old_close(*args, **kwargs)
    our_response.close()   # send signals
value.close = fake_close 

I don't think either is nice.

comment:10 Changed 4 months ago by Tom Forbes

On the PR Tim suggested:

Maybe it's possible to makes sense to omit the request_started signal for static files (if possible)?

I do not believe it is, not without breaking a lot of our lovely abstractions. We could maybe fix it hackily for *just* for staticfiles, but the problem persists for *any* arbitrary FileResponse

One potential thing I thought of is to wrap the 'wsgi.file_wrapper' in a custom wrapper, which then calls close() on the request. But the problems would be similar to solution #1 above, and it would break any isinstance() checks that wsgi servers may perform.

All in all I'm a bit stumped. This seems to be a deficiency with the wsgi spec more than anything

comment:11 Changed 4 months ago by Herbert Fortes

Cc: Herbert Fortes added

comment:12 Changed 6 weeks ago by Collin Anderson

Cc: Collin Anderson added

Maybe we could just call request_finished when wrapping in wsgi.file_wrapper. It's weird to special case it, and it's not really "finished", but at least it gets called.

Note: See TracTickets for help on using tickets.
Back to Top