#16494 closed Bug (fixed)
HttpResponse should raise an error if given a non-string object without `__iter__`
Reported by: | Owned by: | Paul McMillan | |
---|---|---|---|
Component: | HTTP handling | Version: | 1.3 |
Severity: | Normal | Keywords: | |
Cc: | anssi.kaariainen@… | Triage Stage: | Accepted |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
where response =
<class 'django.http.HttpResponse'>
and response._container =
[<Response><Speak loop="1" voice="slt">Hello Friend</Speak></Response>]
Content type = text/xml
matt@dragoon:~/Projects/my_project$ python manage.py test my_app Creating test database for alias 'default'... ====================================================================== ERROR: test_hello (my_app.tests.RestTestCase) ---------------------------------------------------------------------- print response Traceback (most recent call last): File "/Users/matt/Projects/my_project/my_app/tests.py", line 22, in test_hello print result.content File "/Library/Python/2.6/site-packages/django/http/__init__.py", line 596, in _get_content return smart_str(''.join(self._container), self._charset) TypeError: sequence item 0: expected string, Response found ---------------------------------------------------------------------- Ran 1 test in 3.177s FAILED (errors=1) Destroying test database for alias 'default'...
The following patch corrected the issue for me
django/http/__init__.py 593,595c593 < def _get_content(self): if self.has_header('Content-Encoding'): < return ''.join(self._container) < return smart_str(''.join(self._container), self._charset) --- > def _get_content(self):
Attachments (1)
Change History (15)
comment:1 by , 13 years ago
Type: | Uncategorized → Bug |
---|
comment:2 by , 13 years ago
Resolution: | → needsinfo |
---|---|
Status: | new → closed |
comment:3 by , 13 years ago
Resolution: | needsinfo |
---|---|
Status: | closed → reopened |
I printed out the type of each object in the response's and realized I am getting the error because I was returning an object in my view whose unicode method returns a string containing XML instead of returning an HttpResponse. I think one of two resolutions could help people making this mistake here:
- Raise a TypeError if a view returns an object that does not behave like or does not subclass an HttpResponse
- Use the patch I included in the original description to attempt to convert each object in the _container list to a string before joining them
comment:4 by , 13 years ago
Pardon me. I mean it is currently returning an HttpResponse with a non-string as the first argument, e.g.
class Response(object): def __unicode__(self): return '<Response><Speak loop="1" voice="slt">Hello Friend</Speak></Response>' return HttpResponse(Response())
comment:5 by , 13 years ago
Easy pickings: | unset |
---|---|
Has patch: | unset |
Summary: | django.http.HttpResponse __unicode__ raises TypeError during test → HttpResponse should raise an error if given a non-string object without `__iter__` |
Triage Stage: | Unreviewed → Accepted |
The relevant code is here: https://code.djangoproject.com/browser/django/trunk/django/http/__init__.py#L546
Django currently assumes that everything that is an instance of the base string or has no __iter__
property is a string.
We should either be converting to a string on line 550, or raising an error if that object is not a string. I'm in favor of doing the string conversion.
This looks like a legitimate bug to me, so I'm marking it as such and changing the title to reflect the real issue.
comment:6 by , 13 years ago
I just ran into this today. I was returning the pk of an object in the HttpResponse object like so:
return HttpResponse(obj.pk)
This caused the same error. One fix that I found floating around the internet was changing this:
def _get_content(self): if self.has_header('Content-Encoding'): return ''.join(self._container) return smart_str(''.join(self._container), self._charset)
to this:
def _get_content(self): if self.has_header('Content-Encoding'): return ''.join(self._container) return smart_str(''.join(map(str, self._container)), self._charset)
This would fix it as far as just forcing everything to a string. It worked for me when I tried it. However, in the interest of not modifying Django's libraries unnecessarily, I'll just be going through my code and fixing it there. It would be nice, though, if this were to make it into Django at some point. It seems unnecessary to have to call
HttpResponse(str(obj.pk))
Lastly, it seems likely that this appeared in Django 1.3. I upgraded not too long ago, and had no problems up until that point. This is the first time noticing this problem.
comment:7 by , 13 years ago
Owner: | changed from | to
---|---|
Status: | reopened → new |
@anonymous I looked through the changelog and didn't see any recent changes in django/http/__init__.py
that would have affected this. There may have been some elsewhere.
Your post points out that we have this bug in the _container
property as well. I'll work on a patch here - I don't see any good reason not to convert to a string when setting HttpResponse._container
, since we go on to use it as a string.
comment:8 by , 13 years ago
I put together an initial draft of a patch, which adds tests and raises an exception on non-string, non-iterator input:
https://github.com/PaulMcMillan/django/commit/75052eb95d4648e63def203eb1e2a42e1b261a85
After further consideration, I noticed that when the HttpResponse.content
is accessed via iterator, it DOES convert to a string:
https://code.djangoproject.com/browser/django/trunk/django/http/__init__.py#L664
Given this, the most consistent fix (in line with existing behavior) is to convert to string on output. I'm working on a comprehensive patch to both clean up the code and do this in a way that is backwards compatible.
by , 13 years ago
Attachment: | httpresponse_content.diff added |
---|
Converts non-string inputs to strings just before output. Adds tests.
comment:9 by , 13 years ago
Has patch: | set |
---|
I've added a patch that converts everything to strings just before output. It cleans up the logic and fixes the original edge-case logic error. It also adds tests since this component was relatively under-tested.
comment:10 by , 13 years ago
I'm not overwhelmingly familiar with this chunk of the codebase, but from a cursory glance it appears that the provided patch offers a more consistent approach to handling the content of HttpResponse
. The tests and docs look reasonable, but I'd like another set of eyes on the code itself prior to marking it RFC.
comment:11 by , 13 years ago
Patch needs improvement: | set |
---|
I haven't actually tested the patch, but according to my reading of it, I think there is one big problem and one smaller problem. Both are present already in current code.
- The big one: If the given content is iter(['a', 'b']), isn't it so that on first call of reponse.content you will get 'ab' and on second call . That is, the iter is consumed on first call and on the second one it is already consumed.
- The smaller one: If the content is already a string, doesn't the .join(self._container) create a new copy of the string? That seems to be non-necessary. The content can be large and there is no point of copying it.
Try the following (should "fail" on both old and new version of get_content).
from django.core.management import setup_environ import settings setup_environ(settings) from django.http import HttpResponse h = HttpResponse(iter(['a', 'b'])) print h.content print h.content
I think the best approach is to turn the given iterable to a string in set_content, store it in self._raw_content. Turn it into correctly encoded string when content is asked. Test and make sure there will be as little copying of the content as possible, including the conversion to the correct encoding. UTF-8 should be the expected result, so maybe ._raw_content could be in UTF-8?
comment:12 by , 13 years ago
Cc: | added |
---|---|
Patch needs improvement: | unset |
Forget about the string copying. It just doesn't matter performance wise, the _get_content isn't called when generating pages.
If I understand this correctly, which I hope is now the case, the normal (as in when running under Apache) path just iterates the response object, so that is the case which should be optimized, not the access to .content.
The multiple consuming of the iterator would be nice to fix, though. But even that should not be a blocker for this ticket. The problem exists already, and can be fixed separately. So, I removed the patch needs improvement flag.
comment:13 by , 13 years ago
As you say, the normal case is to iterate the HttpResponse
object. This is why it is inappropriate to turn it into a string in _set_content()
. Accessing content
directly is only the correct thing to do if you're consuming the content once and for all, or the original content is not an iterator. #7581 is the issue you describe as multiple consuming the iterator and has been outstanding for a long time. This patch fixes a different issue (though I'm sure the patch for that ticket will need to be updated once we commit this).
There's not enough information here to reproduce your error. What does your testcase look like? What does the rest of the project look like?
If you can provide enough information for someone else to reproduce your issue, please feel free to re-open the ticket.