#21179 closed Cleanup/optimization (fixed)

How-to output CSV from Django should suggest using `StreamingHttpResponse`

Reported by: charettes
Owned by: zr
Component: Documentation
Version: master
Severity: Normal
Keywords: afraid-to-commit
Cc:
Triage Stage: Accepted
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: yes
UI/UX: no

Description

The "Outputting CSV with Django" how-to doesn't mention StreamingHttpResponse at all, even though it's useful for generating large CSV files.

I suggest we replace the example with something along the following lines:

import csv
from StringIO import StringIO

from django.http import StreamingHttpResponse


def some_view(request):
    rows = (
        ['First row', 'Foo', 'Bar', 'Baz'],
        ['Second row', 'A', 'B', 'C', '"Testing"', "Here's a quote"]
    )

    # Define a generator to stream data directly to the client
    def stream():
        buffer_ = StringIO()
        writer = csv.writer(buffer_)
        for row in rows:
            writer.writerow(row)
            buffer_.seek(0)
            data = buffer_.read()
            buffer_.seek(0)
            buffer_.truncate()
            yield data

    # Create the streaming response object with the appropriate CSV header.
    response = StreamingHttpResponse(stream(), content_type='text/csv')
    response['Content-Disposition'] = 'attachment; filename="somefilename.csv"'

    return response
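
For readers on Python 3, where the `StringIO` module no longer exists, the generator above could be sketched roughly as follows. This is only an adaptation of the buffer-reuse idea using `io.StringIO`, not code from the Django docs; the function name `stream_csv` is hypothetical, and the Django response is left out so the snippet runs with the standard library alone.

```python
import csv
import io


def stream_csv(rows):
    """Yield one CSV-formatted line per row, reusing a single buffer.

    Hypothetical helper: a Python 3 take on the stream() generator above.
    """
    buffer_ = io.StringIO()
    writer = csv.writer(buffer_)
    for row in rows:
        writer.writerow(row)
        data = buffer_.getvalue()
        # Empty the buffer so it never holds more than one row at a time.
        buffer_.seek(0)
        buffer_.truncate()
        yield data


rows = (
    ['First row', 'Foo', 'Bar', 'Baz'],
    ['Second row', 'A', 'B', 'C', '"Testing"', "Here's a quote"],
)
chunks = list(stream_csv(rows))
```

In a view, the generator would be handed to the response in place of `stream()`, for example `StreamingHttpResponse(stream_csv(rows), content_type='text/csv')`.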

Change History (17)

comment:1 Changed 19 months ago by EvilDMP

  • Triage Stage changed from Unreviewed to Accepted

Yes, and also https://docs.djangoproject.com/en/dev/ref/request-response/#django.http.StreamingHttpResponse should link to this, in the text "For instance, it’s useful for generating large CSV files"

comment:2 Changed 19 months ago by EvilDMP

  • Keywords afraid-to-commit added

comment:3 Changed 19 months ago by mjtamlyn

I'm not convinced. I've output many a CSV file and never needed the streaming response to get performance. Whilst this is a useful addition to mention at this point in the docs, I don't think we should be recommending the more complex option.

comment:4 Changed 19 months ago by aaugustin

The code example looks like C, not like Python... I don't want to see buffer_.seek(0) in our docs.

Streaming responses don't change much when you pull all the data in RAM, and if the data comes from a queryset, Django currently does that even if you use .iterator(). It seems much more interesting to me to optimize the database side than the HTTP response side.

comment:5 Changed 19 months ago by charettes

Thinking about it, I must agree that without server-side cursor support (#16614) the tradeoff is not worth turning the simple example into an overly complex one.

I just thought it was odd that StreamingHttpResponse's documentation mentions that it’s useful for generating large CSV files but our provided tutorial doesn't even mention it.

What do you guys think of adding an admonition, with no specific example, to the how-to explaining that StreamingHttpResponse might be useful in this case?

comment:6 Changed 19 months ago by EvilDMP

StreamingHttpResponse could still do with some example code in the docs, even if it doesn't replace the existing example.

comment:7 Changed 16 months ago by anubhav9042

Any ideas regarding what type of example should be given in the docs for StreamingHttpResponse?

comment:8 Changed 14 months ago by zr

  • Owner changed from nobody to zr
  • Status changed from new to assigned

comment:9 Changed 14 months ago by zr

  • Owner zr deleted
  • Status changed from assigned to new

Hello, I would like to work on this ticket.

I think that some information on how to test a view that returns a StreamingHttpResponse() would be useful. The Django test Client actually returns an iterable response, and the .streaming_content property is an instance of <itertools.imap>. You would then need to concatenate it into a string in order to test it, as you would do with the standard HttpResponse.
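
A minimal sketch of that concatenation step, using only the standard library. The helper names `join_streaming_content` and `parse_csv_bytes` are hypothetical, and the byte chunks below stand in for what `response.streaming_content` would yield in a real test (on Python 3 that attribute is a `map` object rather than `itertools.imap`, and its items are bytes).

```python
import csv
import io


def join_streaming_content(chunks):
    """Concatenate an iterable of byte chunks into one bytes object."""
    return b"".join(chunks)


def parse_csv_bytes(payload, encoding="utf-8"):
    """Decode a CSV payload and return its rows as lists of strings."""
    text = payload.decode(encoding)
    return list(csv.reader(io.StringIO(text, newline="")))


# Stand-in for response.streaming_content; an assumption for illustration.
fake_streaming_content = iter([b"Row 0,0\r\n", b"Row 1,1\r\n"])
body = join_streaming_content(fake_streaming_content)
parsed = parse_csv_bytes(body)
```

In an actual test case, `fake_streaming_content` would be replaced by the `streaming_content` attribute of the response returned by the test client, and the assertions would be made on `body` or `parsed`.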

comment:10 Changed 14 months ago by zr

I was thinking of something along these lines:

import csv

from django.http import StreamingHttpResponse


class Echo(object):
    def write(self, value):
        return value


def some_streaming_view(request):
    rows = (["Row {0}".format(idx), str(idx)] for idx in xrange(100))
    buffer_ = Echo()
    writer = csv.writer(buffer_)
    response = StreamingHttpResponse((writer.writerow(row) for row in rows),
                                     content_type="text/csv")
    response['Content-Disposition'] = 'attachment; filename="somefilename.csv"'
    return response

I have tested it with curl, a simple test case with the Django test client, and a regular browser.
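
For reference, the core of the example can be exercised without Django at all. The sketch below is a Python 3 rendering (`range` instead of `xrange`, no explicit `object` base) and is not necessarily the code that ended up in the pull request; the function name `iter_csv` is hypothetical. It relies on `csv.writer.writerow()` returning whatever the underlying `write()` call returns.

```python
import csv


class Echo:
    """Pseudo-buffer whose write() returns the value instead of storing it."""

    def write(self, value):
        return value


def iter_csv(rows):
    """Lazily yield each row formatted as a single CSV line."""
    writer = csv.writer(Echo())
    # writerow() passes the formatted line to Echo.write() and returns it.
    return (writer.writerow(row) for row in rows)


rows = (["Row {}".format(idx), str(idx)] for idx in range(100))
lines = iter_csv(rows)
first_line = next(lines)
```

Because both `rows` and `lines` are generators, no row is formatted until the consumer asks for it, which is what lets a streaming response send data as it is produced.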

comment:11 follow-up: Changed 14 months ago by zr

  • Owner set to zr
  • Status changed from new to assigned

comment:12 Changed 14 months ago by zr

You can also test this with an infinite series, such as the classic Fibonacci function, if you replace the range generator with something like:

I tested this and the memory use did not increase significantly even after streaming over a gigabyte of data for a single request.
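
A sketch of what such an infinite row source might look like. This is an assumption for illustration, not the snippet from the original comment; `fibonacci` and `fibonacci_rows` are hypothetical names, and `Echo` repeats the pseudo-buffer from the earlier example so the snippet stands alone.

```python
import csv
import itertools


class Echo:
    """Pseudo-buffer whose write() returns the value instead of storing it."""

    def write(self, value):
        return value


def fibonacci():
    """Yield the Fibonacci sequence forever: 0, 1, 1, 2, 3, 5, ..."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b


def fibonacci_rows():
    """Yield an endless series of [label, value] rows."""
    for idx, value in enumerate(fibonacci()):
        yield ["Row {}".format(idx), str(value)]


writer = csv.writer(Echo())
lines = (writer.writerow(row) for row in fibonacci_rows())
# Only take a few lines here; the generator itself never terminates.
sample = list(itertools.islice(lines, 5))
```

Since nothing is accumulated between iterations, only the current row exists in memory at any time, which is consistent with the flat memory use reported above.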


comment:13 in reply to: ↑ 11 Changed 14 months ago by EvilDMP

The example above looks good to me. Please do submit a pull request - thanks.

comment:14 Changed 14 months ago by zr

  • Has patch set

I have opened a pull request here:

https://github.com/django/django/pull/2358

I am using a slight variation of the above example, using Python 3 friendly code and some additional comments, as suggested by bmispelon.

comment:15 Changed 14 months ago by zr

Resubmitted a new pull request: https://github.com/django/django/pull/2397

comment:16 Changed 14 months ago by timo

  • Needs documentation unset

comment:17 Changed 14 months ago by Tim Graham <timograham@…>

  • Resolution set to fixed
  • Status changed from assigned to closed

In fad47367bf622635b4cf931db72310cce41cebb4:

Fixed #21179 -- Added a StreamingHttpResponse example for CSV files.

Thanks charettes for the suggestion.
