Opened 6 years ago

Closed 6 years ago

#10819 closed (wontfix)

Do not limit input data in multipartparser by content-length (in parse())

Reported by: astaley@… Owned by: nobody
Component: HTTP handling Version: 1.0
Severity: Keywords:
Cc: Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: UI/UX:

Description

Currently, Django's multipart parser attempts to constrain the input stream by the content_length provided in the input request.

This is incorrect behavior; Content-Length only represents the post-processed message body and should NOT be used at this level (where unknown processing may have been done by middleware or Apache filters). See RFC 2616, Section 14.13, for more detail: http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html

A specific example of where this fails is when the client submits gzipped requests. mod_deflate decompresses the body, but it (correctly) does not change Content-Length; consequently, the multipart parser truncates the decompressed data. See http://httpd.apache.org/docs/2.0/mod/mod_deflate.html for more detail: "If you evaluate the request body yourself, don't trust the Content-Length header! The Content-Length header reflects the length of the incoming data from the client and not the byte count of the decompressed data stream."
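A minimal sketch of the truncation described above, using only the Python standard library. It does not use Django's parser or mod_deflate; the boundary string, field name, and payload are made up for illustration, and the "parser" is reduced to a single read capped at the header value.

```python
import gzip
import io

# Hypothetical multipart payload; the boundary and field name are invented.
payload = (
    b"--frontier\r\n"
    b'Content-Disposition: form-data; name="notes"\r\n'
    b"\r\n"
    + b"a" * 2000 + b"\r\n"
    b"--frontier--\r\n"
)

# The client compresses the body and reports the compressed size.
compressed = gzip.compress(payload)
content_length = len(compressed)

# Stand-in for a server-side filter: the application receives the
# decompressed stream, but the header still carries the compressed size.
decompressed_stream = io.BytesIO(gzip.decompress(compressed))

# A reader that caps its input at Content-Length sees only a prefix.
seen_by_parser = decompressed_stream.read(content_length)

truncated = len(seen_by_parser) < len(payload)
print(content_length, len(payload), truncated)
```

Because the payload is highly compressible, the capped read stops well before the closing boundary, which is the failure mode the ticket reports.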

See http://issues.apache.org/jira/browse/XMLRPC-153 for a good discussion on this.

Change History (2)

comment:1 Changed 6 years ago by grahamd

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset

You have no choice but to use content length. At least, this is the case when being hosted on a WSGI hosting mechanism, as the WSGI specification forbids you from reading more than content length. Thus you technically can't just read to the end of the input stream. Because supplying an empty string to represent the end of the input stream is not mandatory for WSGI adapters, if you do read past content length, some WSGI hosting mechanisms will hang, as they would be attempting to read into what would be the data sent for a subsequent request on the same socket connection, which is permissible because of HTTP keep-alive.

So, although some WSGI hosting mechanisms would work if you simply read to end of stream, such as Apache/mod_wsgi, others will not. Older versions of CherryPy WSGI server would fail in this respect and some newer versions of CherryPy WSGI server will actually raise an exception if you attempt to read more than content length. The wsgiref reference WSGI server would also fail.
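To illustrate the constraint, here is a rough sketch of a body reader that never asks `wsgi.input` for more than `CONTENT_LENGTH` bytes. The function name `read_request_body` and the chunk size are hypothetical; this is not Django's code, only an example of the bounded-read pattern the comment describes.

```python
import io


def read_request_body(environ, chunk_size=65536):
    """Read at most CONTENT_LENGTH bytes from wsgi.input (hypothetical helper)."""
    try:
        remaining = int(environ.get("CONTENT_LENGTH") or 0)
    except ValueError:
        remaining = 0  # treat a malformed header as "no body"

    stream = environ["wsgi.input"]
    chunks = []
    while remaining > 0:
        chunk = stream.read(min(remaining, chunk_size))
        if not chunk:
            break  # stream ended early; nothing more to read
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


# Simulated keep-alive connection: the bytes after the first body belong
# to the next request and must not be consumed.
environ = {
    "CONTENT_LENGTH": "10",
    "wsgi.input": io.BytesIO(b"first-body" + b"GET /next HTTP/1.1\r\n"),
}
body = read_request_body(environ)
leftover = environ["wsgi.input"].read()
```

In this simulation the in-memory stream does signal its end, so an unbounded read would merely swallow the next request; on a real socket without an end-of-stream marker, the same unbounded read could block, as described above.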

So, this is basically impossible for WSGI 1.0. It is on the list of issues that need to be addressed for WSGI 2.0, although whether that will occur is a different matter.

Also, for mod_python you may have issues due to bugs in the way mod_python deals with the input stream. In particular, it also uses content length when it shouldn't. See:

http://issues.apache.org/jira/browse/MODPYTHON-212

comment:2 Changed 6 years ago by jacob

  • Resolution set to wontfix
  • Status changed from new to closed

Given what Graham said, I'm marking this wontfix -- there's nothing we can do until WSGI 2 hits the streets.
