Opened 8 years ago

Last modified 10 months ago

#9249 assigned New feature

Google Analytics' Cookies break CacheMiddleware when SessionMiddleware turns on Vary: Cookie

Reported by: pixelcort Owned by: Łukasz Langa
Component: HTTP handling Version: 1.0
Severity: Normal Keywords: cache cookies
Cc: raymond.penners@…, django@…, harm.verhagen@…, trbs@… Triage Stage: Accepted
Has patch: yes Needs documentation: yes
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description (last modified by Aymeric Augustin)

When using Google Analytics on a Django project with CacheMiddleware and SessionMiddleware turned on, the Cookies that Google Analytics apparently change on each reload, invalidating the Vary: Cookie parameter that SessionMiddleware is setting.

There should be a way to define cookie prefixes, such as 'utm', to ignore for cookie variation for caching.

Attachments (1)

cache_ignore_cookie.patch (5.4 KB) - added by Vladimir Dmitriev 8 years ago.

Download all attachments as: .zip

Change History (20)

comment:1 Changed 8 years ago by Jacob

Triage Stage: UnreviewedAccepted

comment:2 Changed 8 years ago by Thejaswi Puthraya

Component: UncategorizedHTTP handling

comment:3 Changed 8 years ago by Vladimir Dmitriev

Owner: changed from nobody to Vladimir Dmitriev

Changed 8 years ago by Vladimir Dmitriev

Attachment: cache_ignore_cookie.patch added

comment:4 Changed 8 years ago by Vladimir Dmitriev

Has patch: set
Keywords: cache cookies added
Status: newassigned

I've created a patch which introduce CACHE_MIDDLEWARE_IGNORE_COOKIES setting, to specify cookies not to be considered in building cache key.

comment:5 Changed 6 years ago by Carl Meyer

Patch needs improvement: set
Severity: Normal
Type: New feature

If we add a workaround for this, we need to make it clear in the docs that this is a workaround that will help Django's internal cache, but upstream HTTP caches will still be broken unless they also special-case Google Analytics cookies. See for instance

It's really Google Analytics that's breaking HTTP caching here (more specifically, any use of Vary: Cookie, which is part of HTTP caching). We can provide a Django-specific workaround, but that doesn't fix the core problem, which isn't in Django.

Categorizing as a "new feature," since what's under discussion here is not a bug in Django, but a new feature to ease working around a problem with Google Analytics.

The latest patch here looks pretty reasonable. The setting should be just a list of plain regex strings, though, to simplify using it - the compilation can be done by the middleware once at startup time.

comment:6 Changed 6 years ago by Luke Plant

Easy pickings: unset
UI/UX: unset

Regarding regexes - I would favour using compiled regexes - this is consistent with other settings that are regexes. (Only the URL conf appears to be different here).

comment:7 Changed 5 years ago by anonymous

Instead of listing individual cookies explicitly, I feel it would be better to have Django keep a record on whether or not a cookie was been accessed. This can be done in similar fashion as to how Django currently checks whether or not the session was accessed.


  • This would work without any configuration. Any additional cookies set by whatever frontend Javascript code that are not used by Django views would automatically be ignored.
  • No new setting & accompanying documentation

comment:8 Changed 5 years ago by raymond.penners@…

Cc: raymond.penners@… added

comment:9 Changed 5 years ago by jedie

Cc: django@… added

comment:10 Changed 5 years ago by harm.verhagen@…

Cc: harm.verhagen@… added

comment:11 Changed 5 years ago by harm

The suggestion mentioned in comment:7: "automatically only taking into account the actual used cookies in the cache key" that would work also in the following case:

  • csrf middleware is enabled (so user sends csrftoken cookie every request.
  • Some view A depends on cookie foobar_enabled (2 values), but that specific view does NOT use any csrf token. In the current situation caching view A does not work between clients (as different csrf tokens cooies, cause different cache keys in the view that doesn't use this cookie)

With suggestion comment:6 this would automatically work.

Last edited 3 years ago by harm (previous) (diff)

comment:12 Changed 5 years ago by KyleMac

It was easy to monkey patch django.http.parse_cookie to use a custom dictionary that logs gets and sets but I had to take into account some things.

  1. This CSRF middleware accesses CSRF_COOKIE_NAME on every request so ignore that and check request.META.get['CSRF_COOKIE_USED'] instead.
  2. The session middleware accesses SESSION_COOKIE_NAME on every request so ignore that and check request.session.accessed instead.
  3. There may be other contrib middleware I'm not using and the best solution would be for everything to not access their cookies until needed.

However, it turns out that IE and Firefox have the same issue but client side. The Google Analytics cookies will cause them to completely invalidate their cache and will not even send If-Modified-Since and so any site that uses Vary: Cookie and Google Analytics effectively has no client side caching for the majority of their users.

In Chrome requests go something like the following (Opera and Safari seem similar but I've tested them less).

  1. Chrome requests a page and gets the following:
    Cache-Control: public, max-age=600
    Vary: Cookie
  1. If you navigate away from the page and come back to it (the actual reload button sometimes behaves differently in Chrome to other browsers) and the page hasn't expired (i.e. max-age) then SOMETIMES it will just serve that from the cache. I don't really understand this and it's probably some kind of heuristics.
  2. Otherwise request a new page with the If-Modified-Since and the contents of our cookies. Due to my monkey patch and custom decorator it gets a 304 response and all is good.

In IE (I tested 9 and 10) and Firefox (currently 14) it's more like the following.

  1. Receive a page with the same headers as above.
  2. Once the page is loaded, Google Analytics updates it cookies and now the page is immediately completely invalidated.
  3. Nothing is ever served from the local cache and so a new request is sent. However since the page was invalidated in step 2 not even an If-Modified-Since header is sent and so you get a full 200 response every single time.

I might now go in the opposite direction and strip out Vary: Cookie on every response and raise an exception if any cached view tries to access cookies it wasn't meant to.

comment:13 Changed 5 years ago by KyleMac

I've now managed to get this to work in Internet Explorer and Firefox.

The fix for IE is quite simple and is due to it sending a non standard If-Modified-Since header that ConditionalGetMiddleware fails to parse. I've opened ticket #18648 for that.

While Firefox won't send If-Modified-Since with Vary: Cookie it will send back the Etag. So to get Firefox to work as expected all you need to do is use a hash of Last-Modified as the Etag (assuming there isn't already a proper Etag for the response).

comment:14 Changed 4 years ago by trbs

Cc: trbs@… added

comment:15 Changed 4 years ago by anonymous

Needs documentation: set
Type: New featureBug
Version: 1.01.4-rc-2

comment:16 Changed 4 years ago by Aymeric Augustin

Description: modified (diff)
Type: BugNew feature
Version: 1.4-rc-21.0

If you look at the discussion, this isn't a bug; it's really a new feature.

The version field tracks the version the bug was reported in.

Related: #15201. Caching is hard.

comment:17 Changed 4 years ago by anonymous

Owner: changed from Vladimir Dmitriev to anonymous

comment:18 Changed 4 years ago by Łukasz Langa

Owner: changed from anonymous to Łukasz Langa

comment:19 Changed 10 months ago by Collin Anderson

A work around is to create a middleware that deletes all the cookies in request.COOKIE that django should ignore. (Just don't delete the important cookies. :) Example from 2016:

Note: See TracTickets for help on using tickets.
Back to Top