Opened 8 years ago

Closed 8 years ago

Last modified 8 years ago

#16494 closed Bug (fixed)

HttpResponse should raise an error if given a non-string object without `__iter__`

Reported by: matt@… Owned by: Paul McMillan
Component: HTTP handling Version: 1.3
Severity: Normal Keywords:
Cc: anssi.kaariainen@… Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no


where response =

<class 'django.http.HttpResponse'>

and response._container =

[<Response><Speak loop="1" voice="slt">Hello Friend</Speak></Response>]

Content type = text/xml

matt@dragoon:~/Projects/my_project$ python test my_app
Creating test database for alias 'default'...
ERROR: test_hello (my_app.tests.RestTestCase)
print response
Traceback (most recent call last):
  File "/Users/matt/Projects/my_project/my_app/", line 22, in test_hello
    print result.content
  File "/Library/Python/2.6/site-packages/django/http/", line 596, in _get_content
    return smart_str(''.join(self._container), self._charset)
TypeError: sequence item 0: expected string, Response found
Ran 1 test in 3.177s

FAILED (errors=1)
Destroying test database for alias 'default'...

The following patch corrected the issue for me

<     def _get_content(self):        if self.has_header('Content-Encoding'):
<             return ''.join(self._container)
<         return smart_str(''.join(self._container), self._charset)
>     def _get_content(self):

Attachments (1)

httpresponse_content.diff (6.2 KB) - added by Paul McMillan 8 years ago.
Converts non-string inputs to strings just before output. Adds tests.

Download all attachments as: .zip

Change History (15)

comment:1 Changed 8 years ago by anonymous

Type: UncategorizedBug

comment:2 Changed 8 years ago by Paul McMillan

Resolution: needsinfo
Status: newclosed

There's not enough information here to reproduce your error. What does your testcase look like? What does the rest of the project look like?

If you can provide enough information for someone else to reproduce your issue, please feel free to re-open the ticket.

comment:3 Changed 8 years ago by matt@…

Resolution: needsinfo
Status: closedreopened

I printed out the type of each object in the response's and realized I am getting the error because I was returning an object in my view whose unicode method returns a string containing XML instead of returning an HttpResponse. I think one of two resolutions could help people making this mistake here:

  • Raise a TypeError if a view returns an object that does not behave like or does not subclass an HttpResponse
  • Use the patch I included in the original description to attempt to convert each object in the _container list to a string before joining them

comment:4 Changed 8 years ago by matt@…

Pardon me. I mean it is currently returning an HttpResponse with a non-string as the first argument, e.g.

class Response(object):
    def __unicode__(self):
        return '<Response><Speak loop="1" voice="slt">Hello Friend</Speak></Response>'

return HttpResponse(Response())

comment:5 Changed 8 years ago by Paul McMillan

Easy pickings: unset
Has patch: unset
Summary: django.http.HttpResponse __unicode__ raises TypeError during testHttpResponse should raise an error if given a non-string object without `__iter__`
Triage Stage: UnreviewedAccepted

The relevant code is here:

Django currently assumes that everything that is an instance of the base string or has no __iter__ property is a string.

We should either be converting to a string on line 550, or raising an error if that object is not a string. I'm in favor of doing the string conversion.

This looks like a legitimate bug to me, so I'm marking it as such and changing the title to reflect the real issue.

comment:6 Changed 8 years ago by anonymous

I just ran into this today. I was returning the pk of an object in the HttpResponse object like so:

return HttpResponse(

This caused the same error. One fix that I found floating around the internet was changing this:

def _get_content(self):
        if self.has_header('Content-Encoding'):
            return ''.join(self._container)
        return smart_str(''.join(self._container), self._charset)

to this:

def _get_content(self):
        if self.has_header('Content-Encoding'):
            return ''.join(self._container)
        return smart_str(''.join(map(str, self._container)), self._charset)

This would fix it as far as just forcing everything to a string. It worked for me when I tried it. However, in the interest of not modifying Django's libraries unnecessarily, I'll just be going through my code and fixing it there. It would be nice, though, if this were to make it into Django at some point. It seems unnecessary to have to call


Lastly, it seems likely that this appeared in Django 1.3. I upgraded not too long ago, and had no problems up until that point. This is the first time noticing this problem.

comment:7 Changed 8 years ago by Paul McMillan

Owner: changed from nobody to Paul McMillan
Status: reopenednew

@anonymous I looked through the changelog and didn't see any recent changes in django/http/ that would have affected this. There may have been some elsewhere.

Your post points out that we have this bug in the _container property as well. I'll work on a patch here - I don't see any good reason not to convert to a string when setting HttpResponse._container, since we go on to use it as a string.

comment:8 Changed 8 years ago by Paul McMillan

I put together an initial draft of a patch, which adds tests and raises an exception on non-string, non-iterator input:

After further consideration, I noticed that when the HttpResponse.content is accessed via iterator, it DOES convert to a string:

Given this, the most consistent fix (in line with existing behavior) is to convert to string on output. I'm working on a comprehensive patch to both clean up the code and do this in a way that is backwards compatible.

Changed 8 years ago by Paul McMillan

Attachment: httpresponse_content.diff added

Converts non-string inputs to strings just before output. Adds tests.

comment:9 Changed 8 years ago by Paul McMillan

Has patch: set

I've added a patch that converts everything to strings just before output. It cleans up the logic and fixes the original edge-case logic error. It also adds tests since this component was relatively under-tested.

comment:10 Changed 8 years ago by Gabriel Hurley

I'm not overwhelmingly familiar with this chunk of the codebase, but from a cursory glance it appears that the provided patch offers a more consistent approach to handling the content of HttpResponse. The tests and docs look reasonable, but I'd like another set of eyes on the code itself prior to marking it RFC.

comment:11 Changed 8 years ago by Anssi Kääriäinen

Patch needs improvement: set

I haven't actually tested the patch, but according to my reading of it, I think there is one big problem and one smaller problem. Both are present already in current code.

  • The big one: If the given content is iter(['a', 'b']), isn't it so that on first call of reponse.content you will get 'ab' and on second call . That is, the iter is consumed on first call and on the second one it is already consumed.
  • The smaller one: If the content is already a string, doesn't the .join(self._container) create a new copy of the string? That seems to be non-necessary. The content can be large and there is no point of copying it.

Try the following (should "fail" on both old and new version of get_content).

from import setup_environ
import settings
from django.http import HttpResponse
h = HttpResponse(iter(['a', 'b']))
print h.content
print h.content

I think the best approach is to turn the given iterable to a string in set_content, store it in self._raw_content. Turn it into correctly encoded string when content is asked. Test and make sure there will be as little copying of the content as possible, including the conversion to the correct encoding. UTF-8 should be the expected result, so maybe ._raw_content could be in UTF-8?

comment:12 Changed 8 years ago by Anssi Kääriäinen

Cc: anssi.kaariainen@… added
Patch needs improvement: unset

Forget about the string copying. It just doesn't matter performance wise, the _get_content isn't called when generating pages.

If I understand this correctly, which I hope is now the case, the normal (as in when running under Apache) path just iterates the response object, so that is the case which should be optimized, not the access to .content.

The multiple consuming of the iterator would be nice to fix, though. But even that should not be a blocker for this ticket. The problem exists already, and can be fixed separately. So, I removed the patch needs improvement flag.

comment:13 Changed 8 years ago by Paul McMillan

As you say, the normal case is to iterate the HttpResponse object. This is why it is inappropriate to turn it into a string in _set_content(). Accessing content directly is only the correct thing to do if you're consuming the content once and for all, or the original content is not an iterator. #7581 is the issue you describe as multiple consuming the iterator and has been outstanding for a long time. This patch fixes a different issue (though I'm sure the patch for that ticket will need to be updated once we commit this).

comment:14 Changed 8 years ago by Paul McMillan

Resolution: fixed
Status: newclosed

In [16829]:

Fixed #16494 by normalizing HttpResponse behavior with non-string input. HttpResponse now always converts content to string on output, regardless of input type.

Note: See TracTickets for help on using tickets.
Back to Top