Opened 16 years ago
Closed 2 years ago
#9249 closed New feature (wontfix)
Google Analytics' Cookies break CacheMiddleware when SessionMiddleware turns on Vary: Cookie
Reported by: | pixelcort | Owned by: | |
---|---|---|---|
Component: | HTTP handling | Version: | 1.0 |
Severity: | Normal | Keywords: | cache cookies |
Cc: | raymond.penners@…, django@…, harm.verhagen@…, trbs@… | Triage Stage: | Unreviewed |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description (last modified by )
When using Google Analytics on a Django project with CacheMiddleware and SessionMiddleware turned on, the cookies that Google Analytics sets apparently change on each reload, so every request produces a different cache key under the Vary: Cookie header that SessionMiddleware sets.
There should be a way to define cookie prefixes, such as 'utm', to ignore for cookie variation for caching.
Attachments (1)
Change History (22)
comment:1 by , 16 years ago
Triage Stage: | Unreviewed → Accepted |
---|
comment:2 by , 16 years ago
Component: | Uncategorized → HTTP handling |
---|
comment:3 by , 16 years ago
Owner: | changed from | to
---|
by , 16 years ago
Attachment: | cache_ignore_cookie.patch added |
---|
comment:4 by , 16 years ago
Has patch: | set |
---|---|
Keywords: | cache cookies added |
Status: | new → assigned |
comment:5 by , 14 years ago
Patch needs improvement: | set |
---|---|
Severity: | → Normal |
Type: | → New feature |
If we add a workaround for this, we need to make it clear in the docs that this is a workaround that will help Django's internal cache, but upstream HTTP caches will still be broken unless they also special-case Google Analytics cookies. See for instance https://groups.google.com/group/analytics-help-integrations/browse_thread/thread/6c9d4a0fea1cc1d2
It's really Google Analytics that's breaking HTTP caching here (more specifically, any use of Vary: Cookie, which is part of HTTP caching). We can provide a Django-specific workaround, but that doesn't fix the core problem, which isn't in Django.
Categorizing as a "new feature," since what's under discussion here is not a bug in Django, but a new feature to ease working around a problem with Google Analytics.
The latest patch here looks pretty reasonable. The setting should be just a list of plain regex strings, though, to simplify using it - the compilation can be done by the middleware once at startup time.
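A minimal sketch of that idea, not taken from the attached patch: the setting holds plain regex strings and a helper compiles them once when it is instantiated. The class name `CookieFilter`, the method `relevant_cookies`, and the example pattern are hypothetical; only the setting name comes from this ticket.

```python
import re

# Hypothetical setting value: plain regex strings, not compiled patterns.
CACHE_MIDDLEWARE_IGNORE_COOKIES = [r"^__utm"]


class CookieFilter:
    """Compiles the ignore patterns once and filters cookie dicts."""

    def __init__(self, patterns):
        # Compilation happens a single time, when the object is created
        # (for a middleware, that would be at startup).
        self._patterns = [re.compile(p) for p in patterns]

    def relevant_cookies(self, cookies):
        """Return only the cookies that should influence the cache key."""
        return {
            name: value
            for name, value in cookies.items()
            if not any(p.search(name) for p in self._patterns)
        }


cookie_filter = CookieFilter(CACHE_MIDDLEWARE_IGNORE_COOKIES)
filtered = cookie_filter.relevant_cookies(
    {"__utma": "1.2.3", "__utmz": "4.5.6", "sessionid": "abc123"}
)
```

Here `filtered` keeps only the `sessionid` entry, so two requests that differ only in their analytics cookies would map to the same key.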
comment:6 by , 13 years ago
Easy pickings: | unset |
---|---|
UI/UX: | unset |
Regarding regexes - I would favour using compiled regexes - this is consistent with other settings that are regexes. (Only the URL conf appears to be different here).
comment:7 by , 13 years ago
Instead of listing individual cookies explicitly, I feel it would be better to have Django keep a record of whether or not a cookie has been accessed. This can be done in a similar fashion to how Django currently checks whether or not the session was accessed.
Benefits:
- This would work without any configuration. Any additional cookies set by frontend JavaScript code and not used by Django views would automatically be ignored.
- No new setting & accompanying documentation
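As a rough illustration of the access-tracking idea, here is a sketch of a dict subclass that records which cookie names were read. It is not Django code; `AccessTrackingCookies` and `accessed_keys` are hypothetical names, and iteration over the dict is deliberately left untracked to keep the sketch short.

```python
class AccessTrackingCookies(dict):
    """A cookie dict that remembers which keys were looked up."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.accessed_keys = set()

    def __getitem__(self, key):
        # Record the lookup even if it raises KeyError: the response
        # then depends on the cookie being absent.
        self.accessed_keys.add(key)
        return super().__getitem__(key)

    def get(self, key, default=None):
        # dict.get() does not go through __getitem__, so track it here too.
        self.accessed_keys.add(key)
        return super().get(key, default)

    def __contains__(self, key):
        self.accessed_keys.add(key)
        return super().__contains__(key)


cookies = AccessTrackingCookies({"__utma": "1.2.3", "foobar_enabled": "1"})
flag = cookies.get("foobar_enabled")  # the view reads only this cookie

# Build the cookies to vary on without triggering the tracking again.
vary_cookies = {
    name: dict.get(cookies, name) for name in sorted(cookies.accessed_keys)
}
```

Only `foobar_enabled` ends up in `vary_cookies`; the analytics cookie was never read, so it would not affect the cache key.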
comment:8 by , 13 years ago
Cc: | added |
---|
comment:9 by , 13 years ago
Cc: | added |
---|
comment:10 by , 13 years ago
Cc: | added |
---|
comment:11 by , 13 years ago
The suggestion in comment:7, "automatically only taking into account the actually used cookies in the cache key", would also work in the following case:
- The CSRF middleware is enabled (so the user sends the csrftoken cookie with every request).
- Some view A depends on the cookie foobar_enabled (2 values), but that specific view does NOT use any CSRF token. In the current situation, caching view A does not work across clients, because different csrftoken cookies cause different cache keys even though the view doesn't use this cookie.
With the suggestion in comment:7 this would automatically work.
comment:12 by , 12 years ago
It was easy to monkey patch django.http.parse_cookie to use a custom dictionary that logs gets and sets, but I had to take some things into account.
- The CSRF middleware accesses CSRF_COOKIE_NAME on every request, so ignore that and check request.META.get('CSRF_COOKIE_USED') instead.
- The session middleware accesses SESSION_COOKIE_NAME on every request, so ignore that and check request.session.accessed instead.
- There may be other contrib middleware I'm not using, and the best solution would be for everything to not access their cookies until needed.
However, it turns out that IE and Firefox have the same issue on the client side. The Google Analytics cookies cause them to completely invalidate their cache, and they will not even send If-Modified-Since, so any site that uses Vary: Cookie and Google Analytics effectively has no client-side caching for the majority of its users.
In Chrome, requests go something like the following (Opera and Safari seem similar, but I've tested them less).
- Chrome requests a page and gets the following:
  Cache-Control: public, max-age=600
  Vary: Cookie
- If you navigate away from the page and come back to it (the actual reload button sometimes behaves differently in Chrome than in other browsers) and the page hasn't expired (i.e. max-age), then SOMETIMES it will just serve that from the cache. I don't really understand this and it's probably some kind of heuristic.
- Otherwise it requests a new page with If-Modified-Since and the contents of our cookies. Due to my monkey patch and custom decorator it gets a 304 response and all is good.
In IE (I tested 9 and 10) and Firefox (currently 14) it's more like the following.
- Receive a page with the same headers as above.
- Once the page is loaded, Google Analytics updates its cookies and the page is immediately and completely invalidated.
- Nothing is ever served from the local cache, so a new request is sent. However, since the page was invalidated in step 2, not even an If-Modified-Since header is sent, so you get a full 200 response every single time.
I might now go in the opposite direction and strip out Vary: Cookie on every response and raise an exception if any cached view tries to access cookies it wasn't meant to.
comment:13 by , 12 years ago
I've now managed to get this to work in Internet Explorer and Firefox.
The fix for IE is quite simple: IE sends a non-standard If-Modified-Since header that ConditionalGetMiddleware fails to parse. I've opened ticket #18648 for that.
While Firefox won't send If-Modified-Since with Vary: Cookie, it will send back the ETag. So to get Firefox to work as expected, all you need to do is use a hash of Last-Modified as the ETag (assuming there isn't already a proper ETag for the response).
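One way this could look, as a sketch only: the helper names `etag_from_last_modified` and `ensure_etag` are hypothetical, headers are modelled as a plain dict with exact-case keys, and SHA-256 is an arbitrary choice of hash rather than what the commenter used.

```python
import hashlib


def etag_from_last_modified(last_modified):
    """Derive a quoted ETag value from a Last-Modified header string."""
    digest = hashlib.sha256(last_modified.encode("ascii")).hexdigest()
    return '"%s"' % digest


def ensure_etag(headers):
    """Add an ETag based on Last-Modified unless one is already present."""
    if "ETag" not in headers and "Last-Modified" in headers:
        headers["ETag"] = etag_from_last_modified(headers["Last-Modified"])
    return headers


headers = ensure_etag({"Last-Modified": "Wed, 18 Jul 2012 10:00:00 GMT"})
```

Because the ETag is a pure function of Last-Modified, it changes exactly when Last-Modified does, which lets a browser that only echoes the ETag still receive a 304.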
comment:14 by , 12 years ago
Cc: | added |
---|
comment:15 by , 12 years ago
Needs documentation: | set |
---|---|
Type: | New feature → Bug |
Version: | 1.0 → 1.4-rc-2 |
comment:16 by , 12 years ago
Description: | modified (diff) |
---|---|
Type: | Bug → New feature |
Version: | 1.4-rc-2 → 1.0 |
If you look at the discussion, this isn't a bug; it's really a new feature.
The version field tracks the version the bug was reported in.
Related: #15201. Caching is hard.
comment:17 by , 12 years ago
Owner: | changed from | to
---|
comment:18 by , 12 years ago
Owner: | changed from | to
---|
comment:19 by , 9 years ago
A workaround is to create a middleware that deletes all the cookies in request.COOKIES that Django should ignore. (Just don't delete the important cookies. :) Example from DjangoCon.eu 2016: https://youtu.be/AZ4ISa1u-HE?t=12548
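A sketch of such a middleware, written from the description above rather than from the talk. The class name and `IGNORED_COOKIE_PREFIXES` are hypothetical, and the request is faked with `SimpleNamespace` so the example runs without Django. Because the cache key may be derived from the raw Cookie header rather than the parsed dict, the sketch also rewrites request.META['HTTP_COOKIE']; that part is an assumption and goes beyond what the comment describes.

```python
from types import SimpleNamespace

# Hypothetical list of cookie name prefixes that should not affect caching.
IGNORED_COOKIE_PREFIXES = ("__utm",)


class StripIgnoredCookiesMiddleware:
    """Drops ignorable cookies before later middleware sees the request."""

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        for name in list(request.COOKIES):
            if name.startswith(IGNORED_COOKIE_PREFIXES):
                del request.COOKIES[name]
        # Assumption: also rebuild the raw header so anything keyed on it
        # no longer sees the stripped cookies. Quoting is not preserved.
        remaining = "; ".join(f"{k}={v}" for k, v in request.COOKIES.items())
        if remaining:
            request.META["HTTP_COOKIE"] = remaining
        else:
            request.META.pop("HTTP_COOKIE", None)
        return self.get_response(request)


# Stand-in for a real request object, for demonstration only.
request = SimpleNamespace(
    COOKIES={"__utma": "1.2.3", "sessionid": "abc123"},
    META={"HTTP_COOKIE": "__utma=1.2.3; sessionid=abc123"},
)
middleware = StripIgnoredCookiesMiddleware(lambda req: "response")
result = middleware(request)
```

For this to have any effect, the middleware would need to run before the cache middleware processes the request.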
comment:20 by , 3 years ago
Owner: | removed |
---|---|
Status: | assigned → new |
comment:21 by , 2 years ago
Has patch: | unset |
---|---|
Needs documentation: | unset |
Patch needs improvement: | unset |
Resolution: | → wontfix |
Status: | new → closed |
Triage Stage: | Accepted → Unreviewed |
This issue is rather niche. Moreover, Collin's workaround is straightforward and can be implemented by any app on its own. As far as I'm aware, it's not something that Django itself has to provide.
I've created a patch which introduces a CACHE_MIDDLEWARE_IGNORE_COOKIES setting to specify cookies that should not be considered when building the cache key.