Opened 6 years ago

Closed 3 years ago

#12214 closed Bug (fixed)

never_cache decorator breaks HttpResponse with iterator content

Reported by: bendavis78 Owned by: nobody
Component: Core (Cache system) Version: master
Severity: Normal Keywords:
Cc: forest, net147 Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no


This generally isn't a problem, as normal views don't typically use never_cache. However, admin views use never_cache by default, and if you're streaming content from the result of an admin action (for example, a very large csv file), you'll get an empty response.

Code example:

    def export_assembly_list(self, request, batches):
        import csv
        from StringIO import StringIO
        #define output columns
        cols = get_csv_cols()

        stream = StringIO()
        writer = csv.writer(stream)
        def content():
            writer.writerow([k for k,v in cols])
            yield stream.getvalue()
            for batch in batches:
                for invitation in batch.invitations.values(*[v for k,v in cols]):
                    writer.writerow([invitation[v] for k,v in cols])
                    yield stream.getvalue()

        response = HttpResponse(content(), mimetype='text/csv')
        response['Content-Disposition'] = 'attachment; filename=batch_assembly_list.csv'
        return response

What happens is this: never_cache() calls django.utils.cache.add_never_cache_headers(), which calls patch_response_headers(). This function adds the ETag header to the response, and to get the ETag, it does this:

if not response.has_header('ETag'):
    response['ETag'] = '"%s"' % md5_constructor(response.content).hexdigest()

response.content, in turn, does "".join(self._container), which causes the iterator to "complete". The problem with this is that the next time the iterator is called, the cursor is already at the end of the iterator, thus the empty response.

The workaround is easy but not at all obvious: just define the Etag on your response.

The overall fix should be just as easy: In the patch_response_headers() function, just detect if the response content is an iterator, and if so use a different method for generating the ETag. Although, I have no idea what that should be.

Change History (8)

comment:1 Changed 6 years ago by bendavis78

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset

One possibility is to just use a universally unique id for streamed content, since we can't ever predict the unique nature of streamed content.

import uuid
response['ETag'] = '"%s"' % uuid.uuid4().get_hex()

comment:2 Changed 6 years ago by forest

Why not simply not include an ETag header in the response if the content is an iterator? It's not required by the HTTP spec, right?

comment:3 Changed 6 years ago by forest

  • Cc forest added

comment:4 Changed 6 years ago by russellm

  • Triage Stage changed from Unreviewed to Accepted

#11788 is related here.

comment:5 Changed 6 years ago by Eloff

Why is never_cache adding an etag header using an md5 anyway? Presumably the purpose of the decorator is to prevent caching, not give the browser something it might possibly cache with. If there is a point to adding an etag header, make it a large random value, not an md5 hash! Why burn CPU just for the hell of it?

comment:6 Changed 5 years ago by mattmcc

  • Severity set to Normal
  • Type set to Bug

comment:7 Changed 4 years ago by net147

  • Cc net147 added
  • Easy pickings unset
  • UI/UX unset

comment:8 Changed 3 years ago by aaugustin

  • Resolution set to fixed
  • Status changed from new to closed

You should now use StreamingHttpResponse for such responses. never_cache won't attempt to compute an Etag for them:

You can provide an Etag by yourself if you have a meaningful way to compute one.

Note: See TracTickets for help on using tickets.
Back to Top