Django

Code

Ticket #5898 (closed: fixed)

Opened 10 months ago

Last modified 9 months ago

HEAD processing should be compulsory

Reported by: Scott Barr <scott@divisionbyzero.com.au> Assigned to: nobody
Milestone: Component: HTTP handling
Version: SVN Keywords:
Cc: Triage Stage: Accepted
Has patch: 1 Needs documentation: 1
Needs tests: 0 Patch needs improvement: 0

Description

Here is a patch to django.middleware.http.py that (1) add's a "Content-Length" header to the HTTP response if the "Content-Length" header hasn't already been supplied, and (2) removes the content from the response, for HEAD requests where the response code is 200.

This makes the HTTP response useful for HEAD requests, which shouldn't have content in the response. The Content-Length header is desirable in the response headers.

Attachments

middleware.http.patch (0.9 kB) - added by Scott Barr <scott@divisionbyzero.com.au> on 11/08/07 06:33:46.
Patch
middleware.http.2.patch (0.9 kB) - added by Scott Barr <scott@divisionbyzero.com.au> on 11/09/07 00:46:30.
remove_unallowed_content.diff (2.1 kB) - added by arien <regexbot@gmail.com> on 11/09/07 06:34:56.
RemoveUnallowedResponseContent? as a response middleware that would automagically alway be run after all other reponse middleware.
http_requirements.diff (4.2 kB) - added by arien <regexbot@gmail.com> on 11/09/07 14:07:02.
Try to ensure responses meet basic HTTP requirements.
middleware-docs.diff (0.6 kB) - added by arien <regexbot@gmail.com> on 11/11/07 03:37:43.

Change History

11/08/07 06:33:46 changed by Scott Barr <scott@divisionbyzero.com.au>

  • attachment middleware.http.patch added.

Patch

11/08/07 20:52:05 changed by mtredinnick

  • needs_better_patch changed.
  • needs_tests changed.
  • needs_docs changed.

Both this ticket and #5897 duplicate what the conditional get middleware already does. There's no real disadvantage to including that middleware, so it might not be worth moving this functionality around, due to the disruption it will cause to existing setups.

Could probably be handled with a documentation improvement instead.

11/09/07 00:46:11 changed by Scott Barr <scott@divisionbyzero.com.au>

I did consider the duplicate functionality when submitting this patch, and it felt a bit wrong, so fair comment.

The latest patch addresses the removal of the duplicated functionality, and adds some extra documentation.

11/09/07 00:46:30 changed by Scott Barr <scott@divisionbyzero.com.au>

  • attachment middleware.http.2.patch added.

11/09/07 00:49:06 changed by mtredinnick

The latest patch still adds nothing new, nor does it add anything to the public documentation. The ConditionalGetMiddleware already handles HEAD requests. Have a look at line 32 of http.py.

11/09/07 01:17:04 changed by Scott Barr <scott@divisionbyzero.com.au>

  • status changed from new to closed.
  • resolution set to invalid.

Apologies, I didn't see that in there. Marking this ticket as invalid.

11/09/07 06:33:08 changed by arien <regexbot@gmail.com>

  • status changed from closed to reopened.
  • resolution deleted.

I think the point is that the conditional get middleware should not do this.

HEAD requests must not return content the same way that Location headers must use absolute URIs, which is what django.core.handlers.base.fix_location_header tries to ensure. When I tried to come up with a better patch for this issue, I found out fix_location_header fails, because fix_location_header fails is called (in the handler's get_response) before the response middleware is run.

The problem here is similar: this code should run after all response middleware, but there's not a really good place for it to hook into. If there were such a place both these issues could be easily resolved.

Since, there is no such hook, I'll attach a patch to show what this would mean for this responses to HEAD requests, written as a response middleware that would always run after all other response middleware.

11/09/07 06:34:56 changed by arien <regexbot@gmail.com>

  • attachment remove_unallowed_content.diff added.

RemoveUnallowedResponseContent? as a response middleware that would automagically alway be run after all other reponse middleware.

11/09/07 14:07:02 changed by arien <regexbot@gmail.com>

  • attachment http_requirements.diff added.

Try to ensure responses meet basic HTTP requirements.

11/09/07 14:23:58 changed by arien <regexbot@gmail.com>

This new patch changes the handlers so that they run two fix-ups (for absolute URIs in redirects and removing content from responses to HEAD requests or with status codes 1xx, 204, and 304) after any response middleware. These fix-ups will not run if converting the incoming request to the handler's request_class fails; doing this would IMHO needlessly complicate the code.

This change is backwards compatible.

Although I think the idea is sound, I'm not submitting patches for docs or tests because there may be a nicer way to implement this.

11/10/07 00:14:59 changed by mtredinnick

What is the actual problem you are trying to fix, arien? (Let's set aside co-opting an existing ticket for a different purpose). HEAD requests already don't return any content. So there's no problem there. Don't make hand-wavy analogies to absolute URLs, they have nothing to do with HEAD requests. What is an example of something that you think should work that does not work with the current code?

We already set content length to 0 for 304 response codes and Django itself never returns a 204: if somebody does return a response like that, it's up to them to also not return any content, we aren't going to do introspection on a response that is generated by the user.

11/10/07 15:45:25 changed by arien <regexbot@gmail.com>

The problem I was going to fix is that HEAD request do return content whenever the conditional get middleware isn't used. Using the conditional get middleware is optional (and not even the default), not returning a message body on HEAD requests is mandatory.

In doing this, I found that there was another piece of "look into the user generated response and make sure we're HTTP compliant in this respect" code in there called fix_location_header. The problem was that fix_location_header was called before response middleware, while this new piece of code obviously needed to go after. So I submitted a half-hearted attempt at a patch (remove_unallowed_content.diff) that deferred any design decision, waffling about automagical middlewares.

Later, I decided that the correct fix would be for both fix_location_headers and this new piece of code to hook into the request processing after the response middleware. (The way fix_location_header was used didn't correspond to its stated reason for existence anyway, and making this one change would solve that as well.) There might even be more fix-ups in the future if/when Django gets more ambitious about this sort of thing. Hence http_requirements.diff.

(The mischaracterizations are getting rather tiresome, mtredinnick.)

11/10/07 17:56:57 changed by Scott Barr <scott@divisionbyzero.com.au>

Thats the issue I attempted to address with this ticket. HEAD should not return a content body, and should have a Content-Length header showing the size of the body that would have been returned.

(follow-up: ↓ 13 ) 11/10/07 18:23:49 changed by mtredinnick

  • summary changed from HeadMiddleware - Modify response for HEAD requests to HEAD processing should be compulsory.
  • stage changed from Unreviewed to Accepted.

Changed title to reflect what is probably the real issue here.

Sorry if you feel I'm mischaracterising what you're saying, arien. That's obviously not my intention, but we do need to establish what the real problem is, now that it's veered away from Scott's original wish to add functionality we already have. It is true that HEAD handling really does have nothing to with with fix_location_header (which is being used as it is intended and documented: it fixed the Location header after everything else is done).

I agree that HEAD handling should probably be compulsory since people aren't in the habit of throwing an error on methods they don't understand. We can certainly fix that. If you have an example of something else that doesn't work that you believe should, again, please post it. It's difficult to ensure that I've understood all the points you are trying to raise. As I understand the code at the moment, and thinking through the possibilities, if you have the ConditionalGetMiddleware installed, there should be no HTTP violations occurring. Do you know of a case where that isn't true? If not, we'll make the HEAD stuff happen by default and that should fix the main problem.

11/10/07 19:06:04 changed by mtredinnick

Aah... now I see what you're trying to do with the patches. These two patches are meant to go together and neither works in the absence of the other. Please, please, please, use one patch per change in the future. Attempts to confuse the people reviewing the patches wors far more frequently than you might intend and there's no benefit (and quite a downside) to multiple patches that are co-dependent.

Now I have enough information to fix this.

11/10/07 19:23:47 changed by arien <regexbot@gmail.com>

mtredinnick, the patches are independent.

What I'm aiming for is in the last patch. It makes responses to HEAD requests correct independent of the middlewares in use. It goes beyond this in that it also removes content for responses that can't have any content (because of their status code), (re)setting the Content-Length header to match.

The patch also created one single point for response fix-ups and moves the call to fix_location_header there, because both it fix_location_header and remove_unallowed_response_content make sure Django always plays by some basic HTTP rules in what it emits.

(in reply to: ↑ 10 ) 11/10/07 20:07:11 changed by arien <regexbot@gmail.com>

Replying to mtredinnick:

The functionality is there, but is should be compulsory; you shouldn't have to turn on ConditionalGetMiddleware for that.

The reason I mention fix_location_header isn't because it has to do with handling of HEAD requests (it doesn't), it's because fix_location_header could also be in some optional middleware, yet it isn't. Why is that? The same reasons could be given for this issue IMHO.

The concrete bugs, if that helps:

1. HEAD requests can return content unless ConditionalGetMiddleware is installed and is run after all other middleware.

2. Response middleware can still happily generate invalid Location headers, because fix_location_header runs too early to catch that.

The last patch fixes both issues and also makes sure that responses with status codes 1xx, 204 and 304 will never return content. (Well, when conversion of the "raw" request to the handler's request_class fails you are stuck with a "raw" request, so there I punted.)

11/10/07 21:55:45 changed by mtredinnick

  • status changed from reopened to closed.
  • resolution set to fixed.

(In [6662]) Fixed #5898 -- Changed a few response processing paths to make things harder to get wrong and easier to get right. Previous behaviour wasn't buggy, but it was harder to use than necessary.

We now have automatic HEAD processing always (previously required ConditionalGetMiddleware?), middleware benefits from the Location header rewrite, so they can use relative URLs as well, and responses with response codes 1xx, 204 or 304 will always have their content removed, in accordance with the HTTP spec (so it's much harder to indavertently deliver invalid responses).

Based on a patch and diagnosis from regexbot@gmail.com.

11/11/07 03:37:43 changed by arien <regexbot@gmail.com>

  • attachment middleware-docs.diff added.

11/11/07 03:41:52 changed by arien <regexbot@gmail.com>

  • status changed from closed to reopened.
  • needs_docs set to 1.
  • resolution deleted.

Thank you for checking this in and writing the tests, mtredinnick. Could you also patch the middleware docs to remove the comment about ConditionalGetMiddleware's handling of HEAD requests? (I'll reopen so this doesn't get lost.)

11/15/07 03:26:57 changed by mtredinnick

  • status changed from reopened to closed.
  • resolution set to fixed.

(In [6678]) Fixed #5898 -- Updated docs for r6662, as pointed out by arien.


Add/Change #5898 (HEAD processing should be compulsory)




Change Properties
Action