Opened 5 years ago

Closed 5 years ago

Last modified 5 years ago

#30565 closed Bug (fixed)

HttpResponseBase.close not called when using FileResponse / WSGI "file wrapper" object

Reported by: Chris Jerdonek Owned by: Chris Jerdonek
Component: HTTP handling Version: dev
Severity: Normal Keywords: HttpResponse, streaming, StreamingHttpResponse
Cc: Johannes Maron Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by Chris Jerdonek)

See comment #6 further down for the correct, updated description for this ticket.

This ticket is to suggest doing for StreamingHttpResponse what #25725 did for HttpReponse, namely to close the underlying content iterator after it has been iterated over.

Currently, if creating a StreamingHttpResponse from a file-like object, it doesn't seem like there's an obvious way to close the underlying file after the file has been streamed. And as one of the comments in #25725 pointed out, trying to do this in StreamingHttpResponse.close() isn't a good solution because WSGI servers can't be relied upon to call close().

I believe an alternative, more reliable solution may be to call close() immediately after the iterator has been exhausted (if hasattr(value, 'close') is true, as #25725 does). This is essentially what #25725 did for non-streaming HttpResponse objects. Here is that code:

def content(self, value):
    # Consume iterators upon assignment to allow repeated iteration.
    if hasattr(value, '__iter__') and not isinstance(value, (bytes, str)):
        content = b''.join(self.make_bytes(chunk) for chunk in value)
        if hasattr(value, 'close'):
            try:
                value.close()
            except Exception:
                pass
    else:
        content = self.make_bytes(value)

(from
https://github.com/django/django/blob/1564e42ad397021093585147875a21dae1a3b3fc/django/http/response.py#L310-L319 )

In the streaming case, the content value argument could be wrapped something like so (inside StreamingHttpResponse._set_streaming_content(value)):

def iter_content():
    yield from value
    if hasattr(value, 'close'):
        try:
            value.close()
        except Exception:
            pass

new_value = iter_content()

Here is the current code for StreamingHttpResponse._set_streaming_content():

def _set_streaming_content(self, value):
    # Ensure we can never iterate on "value" more than once.
    self._iterator = iter(value)
    if hasattr(value, 'close'):
        self._closable_objects.append(value)

(from: https://github.com/django/django/blob/1564e42ad397021093585147875a21dae1a3b3fc/django/http/response.py#L376-L380)

Change History (26)

comment:1 by Chris Jerdonek, 5 years ago

Description: modified (diff)
Type: UncategorizedCleanup/optimization
Version: 2.2master

comment:2 by Carlton Gibson, 5 years ago

Hmmm.

There's a comment by HttpResponseBase.close():

    # The WSGI server must call this method upon completion of the request.
    # See http://blog.dscpl.com.au/2012/10/obligations-for-calling-close-on.html

We're saying that's not the case?

I'm thinking this probably needs to go via the DevelopersMailingList to get more input.

comment:3 by Carlton Gibson, 5 years ago

Cc: Johannes Maron added

cc-ing Joe on this, who was the author on #25725. (What are your thoughts Joe? 🙂)

comment:4 by Chris Jerdonek, 5 years ago

We're saying that's not the case?

A few comments on this. My first idea to make sure my content stream was closed was to subclass StreamingHttpResponse and close my content stream inside a new StreamingHttpResponse.close() implementation, precisely because of that code comment you found. But then I found that HttpResponseBase.close() wasn't getting called for some reason. Indeed, HttpResponseBase.close() already calls close() on everything inside self._closable_objects, so it shouldn't have been necessary for me to subclass StreamingHttpResponse if HttpResponseBase.close() was already getting called.

And then I saw the comment in #25725 saying:

If i remember Jacob's talk correctly not even all WSGI server call close.

So I thought that was the coming into play.

But now I'm seeing that when I use the approach I suggested of wrapping the content stream with an iterator that closes the underlying content stream immediately after it has been exhausted, HttpResponseBase.close() does get closed. In other words, not closing the underlying content stream was causing HttpResponseBase.close() not to be called.

So there's an argument for what I'm suggesting independent of whether HttpResponseBase.close() can be relied upon to close. Put another way, implementing what I'm suggesting helps that code comment be true, at least in the case I ran into.

comment:5 by Chris Jerdonek, 5 years ago

Some updates / more info: I'm now focusing more on the question of why HttpReponse.close() isn't being called in my case, since that could signal an underlying bug.

I believe I'm seeing this behavior only when I use / subclass from FileResponse and not from StreamingHttpResponse. Also, I'm using 1.11.20 so haven't confirmed that the issue occurs in master.

The FileResponse class has special logic when the value argument has a read() method:

def _set_streaming_content(self, value):
    if hasattr(value, 'read'):
        self.file_to_stream = value
        filelike = value
        if hasattr(filelike, 'close'):
            self._closable_objects.append(filelike)
        value = iter(lambda: filelike.read(self.block_size), b'')
    else:
        self.file_to_stream = None
    super(FileResponse, self)._set_streaming_content(value)

(from: https://github.com/django/django/blob/1c9cb948d7b0c264d244763b6682ab790a6b90a0/django/http/response.py#L411-L420 )

So maybe the reason things start working when I wrap the file object in an iterator is that it causes value not to have a read() method anymore, and so the special logic isn't get executed. So maybe that logic is the problem.

By the way, is it okay that the value passed to the parent class's (StreamingHttpResponse's) _set_streaming_content() method is different from the value set to self.file_to_stream?

comment:6 by Chris Jerdonek, 5 years ago

Okay, I think I understand what's happening now.

As was remarked above, Django's code has this code comment above HttpResponseBase.close():

# The WSGI server must call this method upon completion of the request.
# See http://blog.dscpl.com.au/2012/10/obligations-for-calling-close-on.html
def close(self):

However, this comment doesn't seem to be true currently in the case that FileResponse is used with a content stream with a read() method. This is because in this case, self.file_to_stream is set to the content stream, and then Django constructs a "file wrapper" from the file (per the WSGI spec) instead of using the HttpResponse object:

if getattr(response, 'file_to_stream', None) is not None and environ.get('wsgi.file_wrapper'):
    response = environ['wsgi.file_wrapper'](response.file_to_stream)
return response

(from: https://github.com/django/django/blob/87f5d07eededc86f8ce1797fdfca7d4903ee0edc/django/core/handlers/wsgi.py#L151-L153 )

The implication of this is that when the WSGI server calls close() on the "file wrapper" object, it only causes close() to be called on the underlying file-like object and not on the HttpResponseBase object.

So when FileResponse is used as I've described above, it doesn't seem like HttpResponseBase.close() can be used as a hook for the end of the request. For this to happen, before Django constructs the "file wrapper" object, I think Django would need to modify the file-like object's close() method so that it calls FileResponse.close().

comment:7 by Chris Jerdonek, 5 years ago

Description: modified (diff)
Summary: Close StreamingHttpResponse content immediately after iterating itHttpResponseBase.close not called when using FileResponse / WSGI "file wrapper" object

comment:8 by Chris Jerdonek, 5 years ago

This might not be so elegant, but I confirmed that something like the following would work to have a close() on the "file wrapper" object as I described above. (You need to be careful to prevent infinite loops.) This could be done inside FileResponse._set_streaming_content().

response = FileResponse(stream)
stream_close = stream.close

class State:
    closing = False

def file_wrapper_close():
    if State.closing:
        return

    State.closing = True

    stream_close()
    response.close()

stream.close = file_wrapper_close

comment:9 by Johannes Maron, 5 years ago

Type: Cleanup/optimizationUncategorized

Hi Chirs,

thanks for reaching out. I think you need to slow down a bit, an elaborate a bit more on the problem that you are facing.
Let me try to paraphrase what I understood so far:

You implemented some kind of custom FileResponse and for some reason the file object you pass as the streaming_content argument is not being closed.

Is this correct so far?

Can you please provide a bit more code, so I can see if I am able to help.

As for the feature itself. All WSGI server should call close – as defined in PEP333 – and if they don't, that is something that needs to be fixed in the WSGI server, not the application.

You are also correct, that should the WSGI setup provide an OS specific file_wrapper, it will ignore the HttpResponse object and only call close on the file. I don't see how this is a problem in most scenarios tho.

comment:10 by Chris Jerdonek, 5 years ago

You implemented some kind of custom FileResponse and for some reason the file object you pass as the streaming_content argument is not being closed. Is this correct so far?

No, that was in my original post before I had developed a correct understanding of the issue. After diagnosing the issue, I updated the ticket's title with the new understanding ("HttpResponseBase.close not called when using FileResponse / WSGI "file wrapper" object") and changed the first sentence of my original post to read, "See comment #6 further down for the correct, updated description for this ticket."

What I think should be addressed is that either the code comment above HttpResponseBase.close() should be clarified to say that it's not called if "file_wrapper" is used, or else FileResponse._set_streaming_content() should wrap the file-like object's close() method so that when the WSGI server calls close() on the file-like object, it also calls close() on the corresponding HttpResponseBase object. (I provided a proof-of-concept for how that can be done in last comment above.)

Here are a few reasons I think the latter should be done. First, HttpResponseBase.close() raises the request_finished signal, which Django's documentation says is "sent when Django finishes delivering an HTTP response to the client." There are a number of valid reasons someone might want to act on the end of the request. Currently, people won't be getting that signal simply because "file_wrapper" happens to be used, and they might not know they aren't. "file_wrapper" is just an optimization, so it doesn't seem like it should be affecting Django's behavior otherwise.

The second reason is that it's not obvious for the developer to know in advance if "file_wrapper" will be used because a number of conditions need to met and those conditions even depend on the deployment setting (e.g. whether the environment contains wsgi.file_wrapper). So it's hard for the developer to know whether they can rely on the request_finished signal for a given request.

Third, Django's code even seems to expect that HttpResponseBase.close() will be called in the "file_wrapper" case because in its "file_wrapper" code path FileResponse._set_streaming_content() adds the file-like object to self._closable_objects, which is only used / needed in the HttpResponseBase.close() case.

An alternative idea for a code change would be not to wrap the file-like object's close() so that it calls HttpResponseBase.close(), but instead to (1) update the code comment to clarify that it's not called when "file_wrapper" is used, and (2) wrap the file-like object's close() so that it sends the request_finished signal. That way developers will be able to rely upload the request_finished signal in both situations.

comment:11 by Johannes Maron, 5 years ago

Triage Stage: UnreviewedAccepted
Type: UncategorizedBug

I see, thanks for the clarification. I agree that either the documentation is wrong or the behavior is inconsistent.

I suggest fixing the behavior as well. So please go ahead and wrap the file object. That seems to be in full agreement with PEP333.

If you don't care to implement a patch, let me know. I am happy to write one myself.

Oh, and fantastic catch. It always amazes me, what bugs people dig up.

comment:12 by Chris Jerdonek, 5 years ago

Thanks, Johannes! I'll try to put together a patch as time permits.

comment:13 by Chris Jerdonek, 5 years ago

FYI, I posted a PR for this here: https://github.com/django/django/pull/11484

comment:14 by Mariusz Felisiak, 5 years ago

Has patch: set
Owner: changed from nobody to Chris Jerdonek
Status: newassigned

comment:15 by Mariusz Felisiak, 5 years ago

Triage Stage: AcceptedReady for checkin

comment:16 by Mariusz Felisiak, 5 years ago

Triage Stage: Ready for checkinAccepted

comment:17 by Mariusz Felisiak, 5 years ago

Triage Stage: AcceptedReady for checkin

comment:18 by Johannes Maron, 5 years ago

Patch needs improvement: set
Triage Stage: Ready for checkinAccepted

comment:19 by Johannes Maron, 5 years ago

Needs documentation: set
Patch needs improvement: unset

comment:20 by Carlton Gibson, 5 years ago

Needs documentation: unset

OK, patch looks good. Thanks for the input everybody!

comment:21 by Carlton Gibson, 5 years ago

Triage Stage: AcceptedReady for checkin

comment:22 by Carlton Gibson <carlton@…>, 5 years ago

In 53331178:

Refs #30565 -- Doc'd HttpResponse.close() method.

comment:23 by Carlton Gibson <carlton@…>, 5 years ago

Resolution: fixed
Status: assignedclosed

In cce47ff6:

Fixed #30565 -- Closed HttpResponse when wsgi.file_wrapper closes file-like object.

comment:24 by Carlton Gibson <carlton.gibson@…>, 5 years ago

In d200069b:

[2.2.x] Refs #30565 -- Doc'd HttpResponse.close() method.

Backport of 533311782fd0c974208490ec9d11da3bbe179dea from master

comment:25 by Mariusz Felisiak <felisiak.mariusz@…>, 5 years ago

In 5494455:

Reverted "Fixed #30565 -- Closed HttpResponse when wsgi.file_wrapper closes file-like object."

This reverts commit cce47ff65a4dd3786c049ec14ee889e128ca7de9.

comment:26 by Mariusz Felisiak <felisiak.mariusz@…>, 5 years ago

In 22c25be:

[3.0.x] Reverted "Fixed #30565 -- Closed HttpResponse when wsgi.file_wrapper closes file-like object."

This reverts commit cce47ff65a4dd3786c049ec14ee889e128ca7de9.

Backport of 549445519ce90cc5c1e3f981853cc0c67725f3ed from master

Note: See TracTickets for help on using tickets.
Back to Top