Opened 3 years ago

Last modified 8 weeks ago

#19705 assigned Bug

CommonMiddleware handles If-None-Match incorrectly

Reported by: aaugustin Owned by: syphar
Component: HTTP handling Version: master
Severity: Normal Keywords:
Cc: hirokiky@…, real.human@… Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description (last modified by aaugustin)

Two middleware check ETags for unmodified responses: CommonMiddleware and ConditionalGetMiddleware and they do it inconsistently.

If the response's ETag matches the request's If-None-Match:

  • ConditionalGetMiddleware changes the response code to 304, preserving all headers; the content gets removed later on
  • CommonMiddleware creates a new HttpResponseNotModified without content and simply restores the cookies.

As a consequence, CommonMiddleware returns a response without ETag, which is wrong. I detected this with RedBot on a Django site I run. Any site with USE_ETAGS = True has this problem.

In general, wiping headers sounds like a bad idea. A 304 is supposed to have the same headers as the 200. (Well, the RFC is more complicated, but I think it's the general idea. Future versions of HTTP will likely require the Content-Length not to be 0.)

I believe that CommonMiddleware should simply generate the ETag and not handle conditional content removal; that's the job of ConditionalGetMiddleware.

For example, if one is using GzipMiddleware, the correct response chain is:

  • CommonMiddleware computes the ETag,
  • GzipMiddleware compresses the content and modifies the ETag,
  • ConditionalGetMiddleware uses the modified ETag to decide if the response was modified or not.

This is a good reason to keep "ETag generation" and "Etag checking" concerns separate. The same argument applies to any middleware that sets or modifies ETags.

Unfortunately, CommonMiddleware is documented to "take care of sending Not Modified responses, if appropriate", so this would be a backwards incompatible change.

Change History (13)

comment:1 Changed 3 years ago by aaugustin

  • Description modified (diff)
  • Summary changed from CommonMiddleware handles If-Modified-Since incorrectly to CommonMiddleware handles If-None-Match incorrectly
  • Triage Stage changed from Unreviewed to Accepted

comment:2 Changed 3 years ago by aaugustin

The discussion below was closed as a duplicate of this ticket.

comment:3 Changed 3 years ago by hirokiky

  • Cc hirokiky@… added
  • Has patch set
  • Owner changed from nobody to hirokiky
  • Status changed from new to assigned

I send a pull-request to solve this problem

comment:4 Changed 3 years ago by mrmachine

This still leaves a dependency on the order of middleware for correct operation. I don't think Etags should be set or checked in middleware at all. We already have a USE_ETAGS setting. Why not just return the HttpResponseNotModified after all middleware has executed, when USE_ETAGS is True?

We don't need middleware to create Etags, either. We could hook into content and header assignment for HttpResponse, so that Etags will always be set or updated whenever content or headers change, when USE_ETAGS is True.

Another benefit would be that we remove the repetitive requirement for ALL middleware that alters content or headers having to check the USE_ETAGS setting and then conditionally recalculate and update the Etag header. It's currently very easy for a middleware author to alter content without realising that they must do this, which means we could be serving responses with stale Etag headers.

comment:5 Changed 2 years ago by mrmachine

  • Cc real.human@… added

comment:6 Changed 2 years ago by timo

  • Patch needs improvement set

Setting as "patch needs improvement" given the previous comment and the fact that the PR no longer merges cleanly.

comment:7 Changed 2 years ago by hirokiky

  • Owner hirokiky deleted
  • Status changed from assigned to new

I de-assign once time. because I won't try to fix this problem these days. sorry.

comment:8 Changed 14 months ago by syphar

Since I stumbled on it while working with Django, I want to take care (if possible) of this issue.

What seems to be missing (apart from implementation) is this major design decision (of course there could be ways in between):

Way 1: Cleanup middlewares

  • as aaugustin proposed
  • change middleware behavior (backwards incompatible)
  • CommonMiddleware only calculates the ETag, if setting is set
  • GZipMiddleware changes it probably
  • ConditionalGetMiddleware takes care about the response code
  • perhaps move http.conditional_content_removal(), since it then removes the content based on the response code
  • document usage and needed order of the middlewares

Way 2: remove ETag handling in middlewares

  • as mrmachine proposed
  • also backwards incompatible
  • no ETag / Conditional Handling in middleware
  • ETags are always calculated in the request (when the setting is set)
  • conditional view processing is also moved, and always done (open on this, what about last-modified handling?)
  • perhaps easier to use for users, since it hooks into the response-content
  • open question would be, how the view decorators @etag and @condition would work then

comment:9 Changed 14 months ago by syphar

  • Owner set to syphar
  • Status changed from new to assigned

comment:10 Changed 14 months ago by syphar

after talking to apollo13 and mjtamlyn we lean towards way 1, for a first draft.

I'll look into the old PR and improve / finish it up.

comment:11 Changed 14 months ago by syphar

This will get funny.

There is also some duplication between:

  • CommonMiddleware
  • ConditionalGetMiddleware
  • the @condition decorator

also bugs fixed in only one of the places.

comment:12 Changed 14 months ago by mrmachine

Was there any comment from those who prefer option 1 about all middleware that is ever written in the future having to check the value of the USE_ETAGS setting if the middleware alters content/headers, and recalculate the etag? This seems very likely to not happen in a lot of cases. Not all middleware authors will know or care about etags, and if they write a middleware class that alters content/headers it will become buggy when deployed by other developers who have enabled this setting.

I think that option 1 is just patching the symptom of coupled middleware and repetitive etag handling in Django's own middleware, but doesn't do much for the wider middleware ecosystem or preventing developers from making the same mistakes in their own code.

comment:13 Changed 8 weeks ago by syphar

Just rediscovered this issue.

  • We refactored the conditional-get handling in 7a40fef17ab7918cbb1ddc3ba080f42b420f7a48. There we merged all the different implementations of conditional-get handling into one place.
  • there, based on the original code, a new response is generated as not-modified. Only cookies are kept if they are there.
  • I just started working on a new ticket that will deprecate / remove {{USE_ETAGS}} setting, and get the conditional-get handling out of {{CommonMiddleware}}. Then the {{ConditionalGetMiddleware}} will generate a {{ETag}} if there is none. By this middleware just being the last one, many of the edge-cases will be fixed.

Is there any knowledge on if we should keep all the headers in a 304 response? Before the refactor we had places that handled it differently.

At the moment we're returning a new 304 response all the time.

Note: See TracTickets for help on using tickets.
Back to Top