#15785 closed Bug (fixed)
HttpRequest.read(NUM_BYTES) can read beyond the end of wsgi.input stream. (Violation of WSGI spec & under-defined behaviour)
Reported by: | Tom Christie | Owned by: | Tom Christie |
---|---|---|---|
Component: | HTTP handling | Version: | 1.3-rc |
Severity: | Normal | Keywords: | http, wsgi |
Cc: | Maniac@…, Tom Christie | Triage Stage: | Ready for checkin |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
Okay, this is the underlying cause behind #15762 and #15763.
I've marked those as duplicates of this ticket.
Please see discussion on this bug here: https://groups.google.com/forum/#!topic/django-developers/VG1ueWTSs_g
The problem is now that HttpRequest exposes a read() method, user code can do something like:
request_content = json.load(request)
at the moment that will:
- break the wsgi spec, as the client app is contracted not to attempt to read more than CONTENT-LENGTH bytes from wsgi.input
- result in under-defined behaviour, although it appears to work right now.
- break when used in the test client, as per #15762
I've attached a patch with tests for this issue, which:
- Changes WSGIRequest._stream to be a property that is (always) instantiated as a LimitedStream when first accessed.
- Removes some redundant code in HttpRequest and MultiPartParser.
- Fixes some minor bugs in tests/regressiontests/requests/tests.py
- Adds two tests for MultiPartParser to check graceful behaviour on truncated or empty multipart requests.
- Adds a test for TestClient request.read(LARGE_BUFFER) behaviour.
Attachments (6)
Change History (24)
by , 14 years ago
Attachment: | limited_stream.diff added |
---|
comment:1 by , 14 years ago
Cc: | added |
---|
by , 14 years ago
Attachment: | limited_stream_lazy.diff added |
---|
comment:2 by , 14 years ago
comment:3 by , 14 years ago
Tom, since you're hacking on it so actively :-) how about adding another test somewhere near read_limited_stream that would test reading from a GET request that doesn't have a Content-length header? It would test the situation with some generic view code always calling request.read() irregardless of the request type. The sensible thing to do here is not to break and just give it an empty string.
comment:4 by , 14 years ago
Keywords: | http wsgi added |
---|---|
Triage Stage: | Unreviewed → Accepted |
Good plan, I'll get on to that.
I'm assuming it's cool to drop this ticket into 'Accepted' now.
comment:5 by , 14 years ago
Cc: | added |
---|
by , 14 years ago
Attachment: | limited_stream_lazy_plus_extra_tests.diff added |
---|
by , 14 years ago
Attachment: | limited_stream_lazy_plus_extra_tests.2.diff added |
---|
comment:6 by , 14 years ago
Added some extra tests and cleaned up some test code to also include the test for #14753, which has similar logic.
I've added this as limited_stream_lazy_plus_extra_tests.2.diff.
As a final sanity check I reverted, applied the tests patches only, these two fail:
- test_read_numbytes_from_empty_request
- test_read_numbytes_from_nonempty_request
applied the code patches, checked they passed.
Ivan, could you review, I'm guessing this should be able to go into 'Ready for checkin' now.
Ta! :)
comment:7 by , 14 years ago
Couple of comments, small nits, actually:
+ r = { + 'CONTENT_LENGTH': len(payload), + 'CONTENT_TYPE': client.MULTIPART_CONTENT, + 'PATH_INFO': "/file_uploads/echo/", + 'REQUEST_METHOD': 'POST', + 'wsgi.input': client.FakePayload(payload), + }
Django uses a different style in such cases: values should not be vertically aligned but separated from keys with a colon and a single space.
+ try: + content_length = int(self.environ.get('CONTENT_LENGTH', 0)) + except (ValueError, TypeError): + content_length = 0
I don't see how TypeError can be raised here. Also, the default value is defined in two places. I'd do it like this instead:
try: content_length = int(self.environ.get('CONTENT_LENGTH')) except ValueError: content_length = 0
comment:8 by , 14 years ago
If you are going to do:
try: content_length = int(self.environ.get('CONTENT_LENGTH')) except ValueError: content_length = 0
then you must catch TypeError as lack of CONTENT_LENGTH would return None which yields a TypeError when int() is applied to it.
So yes, TypeError shouldn't have been needed in what they originally had, but is for what you have suggested.
by , 14 years ago
Attachment: | limited_stream_lazy_plus_extra_tests.3.diff added |
---|
comment:10 by , 14 years ago
Coolio, I've updated the patch to reflect those points.
It's my first contribution, so thanks for the feedback and helping this along...
comment:11 by , 14 years ago
Triage Stage: | Accepted → Ready for checkin |
---|
This now looks to me ready for checkin. Thanks, Tom!
comment:13 by , 13 years ago
Patch needs improvement: | set |
---|---|
UI/UX: | unset |
Not sure what's wrong, but one test doesn't pass anymore:
Creating test database for alias 'default'... Creating test database for alias 'other'... Destroying old test database 'other'... ....F........... ====================================================================== FAIL: test_empty_multipart_raises_error (regressiontests.file_uploads.tests.FileUploadTests) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Users/jezdez/Code/git/django/tests/regressiontests/file_uploads/tests.py", line 217, in test_empty_multipart_raises_error self.assertRaises(MultiPartParserError, lambda r: self.client.request(**r), r) AssertionError: MultiPartParserError not raised ---------------------------------------------------------------------- Ran 16 tests in 1.351s FAILED (failures=1) Destroying test database for alias 'default'... Destroying test database for alias 'other'...
comment:14 by , 13 years ago
Patch needs improvement: | unset |
---|
The expected behavior of MultiPartParser when given a Content-Length of 0 has changed slightly since this ticket/patch was submitted.
Please see ticket #16201 for details of how & why.
I've updated the test_empty_multipart_raises_error()
test to instead be test_empty_multipart_handled_gracefully()
, and test that it returns an empty QueryDict, rather than test that it throws a MultiPartParseError
.
I've verified that the patch still applies okay to the latest revision, and that the tests are now all passing again.
Patch attached as limited_stream_lazy_plus_extra_tests.4.diff
by , 13 years ago
Attachment: | limited_stream_lazy_plus_extra_tests.4.diff added |
---|
comment:17 by , 13 years ago
It reverts it, but it's not needed any more since we wrap the wsgi.input in a LimitedStream in any case, whether development server or not.
Updated patch (limited_stream_lazy.diff)
Thanks to Ivan's testing it turns out the defered creation of the LimitedStream isn't necessary.
This should do the job nicely.