Opened 7 years ago

Closed 4 years ago

#12214 closed Bug (fixed)

never_cache decorator breaks HttpResponse with iterator content

Reported by: Ben Davis Owned by: nobody
Component: Core (Cache system) Version: master
Severity: Normal Keywords:
Cc: Forest Bond, net147 Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

This generally isn't a problem, as normal views don't typically use never_cache. However, admin views use never_cache by default, and if you're streaming content from the result of an admin action (for example, a very large csv file), you'll get an empty response.

Code example:

    def export_assembly_list(self, request, batches):
        import csv
        from StringIO import StringIO
      
        #define output columns
        cols = get_csv_cols()

        stream = StringIO()
        writer = csv.writer(stream)
        def content():
            writer.writerow([k for k,v in cols])
            yield stream.getvalue()
            stream.truncate(0)
      
            for batch in batches:
                for invitation in batch.invitations.values(*[v for k,v in cols]):
                    writer.writerow([invitation[v] for k,v in cols])
                    yield stream.getvalue()
                    stream.truncate(0)

            return
      
        response = HttpResponse(content(), mimetype='text/csv')
        response['Content-Disposition'] = 'attachment; filename=batch_assembly_list.csv'
        return response

What happens is this: never_cache() calls django.utils.cache.add_never_cache_headers(), which calls patch_response_headers(). This function adds the ETag header to the response, and to get the ETag, it does this:

if not response.has_header('ETag'):
    response['ETag'] = '"%s"' % md5_constructor(response.content).hexdigest()

response.content, in turn, does "".join(self._container), which causes the iterator to "complete". The problem with this is that the next time the iterator is called, the cursor is already at the end of the iterator, thus the empty response.

The workaround is easy but not at all obvious: just define the Etag on your response.

The overall fix should be just as easy: In the patch_response_headers() function, just detect if the response content is an iterator, and if so use a different method for generating the ETag. Although, I have no idea what that should be.

Change History (8)

comment:1 Changed 7 years ago by Ben Davis

One possibility is to just use a universally unique id for streamed content, since we can't ever predict the unique nature of streamed content.

import uuid
response['ETag'] = '"%s"' % uuid.uuid4().get_hex()

comment:2 Changed 7 years ago by Forest Bond

Why not simply not include an ETag header in the response if the content is an iterator? It's not required by the HTTP spec, right?

comment:3 Changed 7 years ago by Forest Bond

Cc: Forest Bond added

comment:4 Changed 7 years ago by Russell Keith-Magee

Triage Stage: UnreviewedAccepted

#11788 is related here.

comment:5 Changed 7 years ago by Eloff

Why is never_cache adding an etag header using an md5 anyway? Presumably the purpose of the decorator is to prevent caching, not give the browser something it might possibly cache with. If there is a point to adding an etag header, make it a large random value, not an md5 hash! Why burn CPU just for the hell of it?

comment:6 Changed 6 years ago by Matt McClanahan

Severity: Normal
Type: Bug

comment:7 Changed 5 years ago by net147

Cc: net147 added
Easy pickings: unset
UI/UX: unset

comment:8 Changed 4 years ago by Aymeric Augustin

Resolution: fixed
Status: newclosed

You should now use StreamingHttpResponse for such responses. never_cache won't attempt to compute an Etag for them:
https://github.com/django/django/blob/4b27813198ae31892f1159d437e492f7745761a0/django/utils/cache.py#L97

You can provide an Etag by yourself if you have a meaningful way to compute one.

Note: See TracTickets for help on using tickets.
Back to Top