Opened 8 years ago

Closed 4 years ago

Last modified 3 years ago

#4992 closed Uncategorized (fixed)

Alter cache key based on GET parameters

Reported by: anonymous Owned by: nobody
Component: Core (Cache system) Version: master
Severity: Normal Keywords:
Cc: miracle2k, jim@…, dane.springmeyer@…, sylvain.pasche@…, taavi@…, apasotti@… Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Why GET requests are not cached by CacheMiddleware?
For example paginator accepts page numer as a GET parametr and pages other than first were not cached in any way.
I changed this policy on my own in django source, but I think it should be merged in to django, maybe as an option.
Patch changing it is attached to this ticket.

Attachments (3)

diff (1.3 KB) - added by anonymous 8 years ago.
patch
cache_by_request_full_path.diff (3.9 KB) - added by PeterKz 6 years ago.
Cache by request.get_full_path() instead of request.path
cache_by_request_full_path_v2.diff (7.0 KB) - added by guettli 4 years ago.
Updated patch of PeterKz to apply cleanly to trunk. Added tests

Download all attachments as: .zip

Change History (30)

Changed 8 years ago by anonymous

patch

comment:1 Changed 8 years ago by jacob

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Resolution set to invalid
  • Status changed from new to closed

GET requests are cached by the cache middleware. If you're having difficulties making it work, try asking on the django-users mailing list.

Oh, and you don't want to use this patch. It's gonna break your site pretty bad.

comment:2 Changed 8 years ago by Collin Grady <cgrady@…>

  • Resolution invalid deleted
  • Status changed from closed to reopened
  • Summary changed from Cache GET request. to Alter cache key based on GET parameters

Talking with the submitter on IRC, I think there was a miscommunication in his original summary. He's not saying that it won't cache GET requests at all, he wants the cache keys to vary based on the GET params, so you can cache a view and still use GET params.

For instance, with the current cache middleware, /?page=2, and /?page=3 would skip the cache entirely, while they could be cached based on the page=2 or page=3 bit as well.

The attached patch isn't a good way to do it, but I think the idea has merit :)

comment:3 Changed 8 years ago by ubernostrum

Probably the best way to get different cache keys from different GET parameters is to use an md5 of the contents of request.GET (so something like md5.new(str(request.GET)).hexdigest()).

comment:4 Changed 8 years ago by Simon G. <dev@…>

  • Triage Stage changed from Unreviewed to Design decision needed

comment:5 Changed 7 years ago by jacob

  • Component changed from Uncategorized to Cache system

comment:6 Changed 7 years ago by jacob

FWIW, a simple str(request.GET) isn't good enough -- if this ticket gets accepted, I'd expect /?a=1&b=2 to have the same cache key as /?b=2&a=1, so there's sorting business that needs to happen, too.

I'm currently -0 on this, though. For one, it increases the time to calculate caching in the most common case. On top of that, though GET with a query string is supposed to be cacheable, RFC 2616 notes that "some applications have traditionally used GETs [...] with query URLs [...] to perform operations with significant side effects" and suggests that caches treat such requests carefully (§13.9). I'm inclined to do the same.

comment:7 Changed 7 years ago by Daniel Pope <dan@…>

What about just request.get_full_path()?

HTTP caches don't consider /?a=1&b=2 to have the same cache key as /?b=2&a=1; they treat the URL as opaque. We know different, but developers shouldn't be surprised about having to use the same key for internal caching that would be used externally.

comment:8 Changed 7 years ago by julianb

  • milestone set to post-1.0

comment:9 Changed 6 years ago by PeterKz

If I understand http://www.w3.org/DesignIssues/Axioms.html correctly you should treat URL:s as opaque and get three different cache objects for:

  • example.com/list/
  • example.com/list/?a=1&b=2
  • example.com/list/?b=2&a=1

It is likely that there won't be many permutations for multiple URL parameters as they are typically constructed in a form controlled by the application developer.

Implementing this would likely give a decent performance boost for Django apps that filter large lists based on query parameters.

comment:10 Changed 6 years ago by PeterKz

Design-wise it would be great if the developer could be in control of which parameters that require a new cache item. After all, it is the developer who knows which parameters are likely to influence the result returned to the client. The currenct view cache decorator could be complemented like this:

Vary by the entire URL (should be default behaviour):

@cache_page(60 * 15)
@vary_by_param("*")
def slashdot_this(request):
    ...

Only vary by values for parameter a and b (ignore everything else):

@cache_page(60 * 15)
@vary_by_param(["a","b"])
def slashdot_this(request):
    ...

comment:11 Changed 6 years ago by Daniel Pope <dan@…>

Restricting this to a subset of the GET parameters would only reduce the number of duplicate copies in the cache. As such I think it's more beneficial to provide transparent caching with a simple decorator API. The reason we need to use request.get_full_path() is that the current behaviour allows Django to serve the wrong page for requests with a query string. Trying to reduce the number of duplicate entries in the cache to the bare minimum is more of a wishlist item.

Rather than building all of the possible ways you could generate cache keys into decorators, why not just allow cache_page to take a callable?

eg.

@cache_page(key=lambda request: request.path)
def slashdot_this(request):
  ...

This should still default to request.get_full_path() so that the default behaviour works for all views.

comment:12 Changed 6 years ago by PeterKz

get_full_path() seems to be the way to go. It returns the path and query, but ignores the fragment identifier. So, this would be a minor patch that improves performance for many sites.

Changed 6 years ago by PeterKz

Cache by request.get_full_path() instead of request.path

comment:13 Changed 6 years ago by PeterKz

  • Has patch set

Uploaded patch for the current trunk.

comment:14 Changed 6 years ago by garrison

  • Cc jim@… added

comment:15 Changed 6 years ago by springmeyer

  • Cc dane.springmeyer@… added

comment:16 Changed 6 years ago by sylvain

  • Cc sylvain.pasche@… added

comment:17 Changed 6 years ago by anonymous

  • milestone post-1.0 deleted

Milestone post-1.0 deleted

comment:18 Changed 6 years ago by kmike

I've implemented something like Daniel Pope 's suggestion and opened separate ticket for that: #11269.

comment:19 Changed 6 years ago by anonymous

  • Cc taavi@… added

comment:20 Changed 5 years ago by miracle2k

  • Cc miracle2k added
  • Resolution set to fixed
  • Status changed from reopened to closed

comment:21 Changed 5 years ago by miracle2k

  • Resolution fixed deleted
  • Status changed from closed to reopened

Sorry. This has happened to me before. Is it a bug in Trac?

comment:22 Changed 5 years ago by mtredinnick

  • Triage Stage changed from Design decision needed to Accepted

Even squid recently (last year) changed their default caching behaviour to respect GET params, so they think the web has moved far enough to udnerstand side-effects methods. We should do this now.

comment:23 Changed 4 years ago by guettli

  • Cc hv@… added

Changed 4 years ago by guettli

Updated patch of PeterKz to apply cleanly to trunk. Added tests

comment:24 Changed 4 years ago by elpaso66

  • Cc apasotti@… added

comment:25 Changed 4 years ago by jezdez

  • Triage Stage changed from Accepted to Ready for checkin

comment:26 Changed 4 years ago by jezdez

  • Resolution set to fixed
  • Status changed from reopened to closed

In [15705]:

Fixed #4992 -- Respect the GET request query string when creating cache keys. Thanks PeterKz and guettli for the initial patch.

comment:27 Changed 3 years ago by guettli

  • Cc hv@… removed
  • Easy pickings unset
  • Severity set to Normal
  • Type set to Uncategorized
  • UI/UX unset
Note: See TracTickets for help on using tickets.
Back to Top