Opened 3 years ago

Closed 3 years ago

#21179 closed Cleanup/optimization (fixed)

How-to output CSV from Django should suggest using `StreamingHttpResponse`

Reported by: Simon Charette
Owned by: Rigel Di Scala
Component: Documentation
Version: master
Severity: Normal
Keywords: afraid-to-commit
Cc:
Triage Stage: Accepted
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: yes
UI/UX: no


The Outputting CSV with Django how-to does not mention StreamingHttpResponse, even though it is useful for generating large CSV files.

I suggest we replace the example with something along the following:

import csv
from StringIO import StringIO

from django.http import StreamingHttpResponse

def some_view(request):
    rows = (
        ['First row', 'Foo', 'Bar', 'Baz'],
        ['Second row', 'A', 'B', 'C', '"Testing"', "Here's a quote"]
    )

    # Define a generator to stream data directly to the client
    def stream():
        buffer_ = StringIO()
        writer = csv.writer(buffer_)
        for row in rows:
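            writer.writerow(row)
            data = buffer_.getvalue()
            # Reset the buffer so the next row starts from empty
            buffer_.seek(0)
            buffer_.truncate()
            yield data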

    # Create the streaming response object with the appropriate CSV header.
    response = StreamingHttpResponse(stream(), content_type='text/csv')
    response['Content-Disposition'] = 'attachment; filename="somefilename.csv"'

    return response

Change History (17)

comment:1 Changed 3 years ago by Daniele Procida

Triage Stage: Unreviewed → Accepted

Yes, and the StreamingHttpResponse documentation should also link to this how-to, from the text "For instance, it’s useful for generating large CSV files".

comment:2 Changed 3 years ago by Daniele Procida

Keywords: afraid-to-commit added

comment:3 Changed 3 years ago by Marc Tamlyn

I'm not convinced. I've output many a CSV file and never needed the streaming response to get performance. Whilst this is a useful addition to mention at this point in the docs, I don't think we should be recommending the more complex option.

comment:4 Changed 3 years ago by Aymeric Augustin

The code example looks like C, not like Python... I don't want to see that in our docs.

Streaming responses don't change much when you pull all the data in RAM, and if the data comes from a queryset, Django currently does that even if you use .iterator(). It seems much more interesting to me to optimize the database side than the HTTP response side.

comment:5 Changed 3 years ago by Simon Charette

Thinking about it, I must agree that without server-side cursor support (#16614) the tradeoff is not worth turning the simple example into an overly complex one.

I just thought it was odd that StreamingHttpResponse's documentation mentions that it’s useful for generating large CSV files but our provided tutorial doesn't even mention it.

What do you guys think of adding an admonition with no specific example to the how-to explaining StreamingHttpResponse might be useful in this case?

comment:6 Changed 3 years ago by Daniele Procida

StreamingHttpResponse could still do with some example code in the docs, even if it doesn't replace the existing example.

comment:7 Changed 3 years ago by ANUBHAV JOSHI

Any ideas regarding what type of example should be given in the docs for StreamingHttpResponse?

comment:8 Changed 3 years ago by Rigel Di Scala

Owner: changed from nobody to Rigel Di Scala
Status: new → assigned

comment:9 Changed 3 years ago by Rigel Di Scala

Owner: Rigel Di Scala deleted
Status: assigned → new

Hello, I would like to work on this ticket.

I think that some information on how to test a view that returns a StreamingHttpResponse() would be useful. The Django test Client actually returns an iterable response, and the .streaming_content property is an instance of <itertools.imap>. You would then need to concatenate it into a string in order to test it, as you would do with the standard HttpResponse.
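The concatenation step described above can be sketched with the standard library alone. This is only an illustration under stated assumptions: `chunks` is a hypothetical stand-in for what a test would read from `response.streaming_content` (assumed here to yield byte strings), and `collect_csv_rows` is an illustrative helper name, not a Django API.

```python
import csv
import io


def collect_csv_rows(chunks):
    """Join streamed byte chunks and parse the result as CSV rows."""
    # Chunk boundaries need not line up with row boundaries, so join first.
    body = b"".join(chunks).decode("utf-8")
    # newline="" leaves line endings untouched, as the csv module expects.
    return list(csv.reader(io.StringIO(body, newline="")))


# Hypothetical stand-in for response.streaming_content in a real test.
chunks = iter([b"Row 0,0\r\n", b"Row 1,", b"1\r\n"])
rows = collect_csv_rows(chunks)
```

In an actual test case the iterator would come from the response returned by the test client, and the assertions would then run against the parsed rows rather than against raw bytes.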

comment:10 Changed 3 years ago by Rigel Di Scala

I was thinking of something along these lines:

import csv

from django.http import StreamingHttpResponse

class Echo(object):
    def write(self, value):
        return value

def some_streaming_view(request):
    rows = (["Row {0}".format(idx), str(idx)] for idx in xrange(100))
    buffer_ = Echo()
    writer = csv.writer(buffer_)
    response = StreamingHttpResponse((writer.writerow(row) for row in rows),
                                     content_type="text/csv")
    response['Content-Disposition'] = 'attachment; filename="somefilename.csv"'
    return response

I have tested it with curl, a simple test case with the Django test client, and a regular browser.
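The pseudo-buffer trick in the example above can be checked without Django. The following is a minimal Python 3 sketch, assuming only the standard library; `iter_csv_lines` is a hypothetical helper name introduced for illustration. It relies on `csv.writer(...).writerow()` returning whatever the underlying object's `write()` returns, so each call hands back one formatted CSV line instead of storing it.

```python
import csv


class Echo:
    """Pseudo-buffer: write() returns the value instead of storing it."""

    def write(self, value):
        return value


def iter_csv_lines(rows):
    """Lazily yield one formatted CSV line per input row."""
    writer = csv.writer(Echo())
    for row in rows:
        # writerow() passes the formatted line to Echo.write() and
        # returns its result, so nothing accumulates in memory.
        yield writer.writerow(row)


lines = list(iter_csv_lines(["Row {0}".format(i), str(i)] for i in range(3)))
```

A generator like this is the kind of iterable that would be handed to the streaming response in the view; here it is consumed with `list()` only to inspect the output.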

comment:11 Changed 3 years ago by Rigel Di Scala

Owner: set to Rigel Di Scala
Status: new → assigned

comment:12 Changed 3 years ago by Rigel Di Scala

You can also test this with an infinite series, such as the classic Fibonacci function, if you replace the range generator with something like:

def fib():
    a, b = 0, 1
    while 1:
        yield a
        a, b = b, a + b

I tested this and the memory use did not increase significantly even after streaming over a gigabyte of data for a single request.
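As a rough illustration of why the infinite series is safe to stream, the sketch below feeds `fib()` through the same pseudo-buffer and takes only the first few lines with `itertools.islice`. It assumes Python 3 and the standard library only; `stream_fib_rows` is a hypothetical name, and this is not the code that was actually measured in the comment above.

```python
import csv
from itertools import islice


class Echo:
    """Pseudo-buffer: write() returns the value instead of storing it."""

    def write(self, value):
        return value


def fib():
    """Infinite Fibonacci generator, as in the comment above."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b


def stream_fib_rows():
    """Yield CSV lines forever; only one row exists in memory at a time."""
    writer = csv.writer(Echo())
    for idx, value in enumerate(fib()):
        yield writer.writerow(["Row {0}".format(idx), str(value)])


# Consuming a finite slice terminates even though the source is infinite.
first_five = list(islice(stream_fib_rows(), 5))
```

Because every stage is a generator, rows are produced only as the consumer asks for them, which is consistent with the flat memory use reported above.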


comment:13 in reply to:  11 Changed 3 years ago by Daniele Procida

The example above looks good to me. Please do submit a pull request - thanks.

comment:14 Changed 3 years ago by Rigel Di Scala

Has patch: set

I have opened a pull request here:

I am using a slight variation of the above example, using Python 3 friendly code and some additional comments, as suggested by bmispelon.

comment:15 Changed 3 years ago by Rigel Di Scala

Resubmitted a new pull request:

comment:16 Changed 3 years ago by Tim Graham

Needs documentation: unset

comment:17 Changed 3 years ago by Tim Graham <timograham@…>

Resolution: fixed
Status: assigned → closed

In fad47367bf622635b4cf931db72310cce41cebb4:

Fixed #21179 -- Added a StreamingHttpResponse example for CSV files.

Thanks charettes for the suggestion.
