========================
Django's cache framework
========================

So, you got slashdotted_. Now what?

Django's cache framework gives you three methods of caching dynamic pages in
memory, on the filesystem, or in a database. You can cache the output of entire
pages, you can cache only the pieces that are difficult to produce, or you can
cache your entire site.

.. _slashdotted: http://en.wikipedia.org/wiki/Slashdot_effect

Setting up the cache
====================

The cache framework allows for different "backends" -- different methods of
caching data. There's a simple single-process memory cache (mostly useful as a
fallback) and a memcached_ backend (the fastest option, by far, if you've got
the RAM).

Before using the cache, you'll need to tell Django which cache backend you'd
like to use. Do this by setting ``CACHE_BACKEND`` in your settings file.

The ``CACHE_BACKEND`` setting is a "fake" URI (really an unregistered scheme).
Examples:

==============================  ===========================================
CACHE_BACKEND                   Explanation
==============================  ===========================================
memcached://127.0.0.1:11211/    A memcached backend; the server is running
                                on localhost port 11211. You can use
                                multiple memcached servers by separating
                                them with semicolons.

db://tablename/                 A database backend in a table named
                                "tablename". This table should be created
                                with "django-admin createcachetable".

file:///var/tmp/django_cache/   A file-based cache stored in the directory
                                /var/tmp/django_cache/.

simple:///                      A simple single-process memory cache; you
                                probably don't want to use this except for
                                testing. Note that this cache backend is
                                NOT thread-safe!

locmem:///                      A more sophisticated local memory cache;
                                this is multi-process- and thread-safe.
==============================  ===========================================

All caches may take arguments -- they're given in query-string style. Valid
arguments are:

    timeout
        Default timeout, in seconds, to use for the cache. Defaults to 5
        minutes (300 seconds).

    max_entries
        For the simple and database backends, the maximum number of entries
        allowed in the cache before it is cleaned. Defaults to 300.

    cull_percentage
        The fraction of entries that are culled when max_entries is reached.
        The actual fraction is 1/cull_percentage, so set cull_percentage=3 to
        cull 1/3 of the entries when max_entries is reached.

        A value of 0 for cull_percentage means that the entire cache will be
        dumped when max_entries is reached. This makes culling *much* faster
        at the expense of more cache misses.

For example::

    CACHE_BACKEND = "memcached://127.0.0.1:11211/?timeout=60"

Invalid arguments are silently ignored, as are invalid values of known
arguments.

.. _memcached: http://www.danga.com/memcached/

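To make the URI layout and the culling rule concrete, here is a minimal
sketch using only Python's standard library. The helpers ``split_cache_uri``
and ``entries_to_cull`` are hypothetical names invented for this example; they
are not part of Django and only illustrate how the pieces of a
``CACHE_BACKEND`` value fit together.

```python
from urllib.parse import parse_qsl, urlsplit


def split_cache_uri(uri):
    """Split a CACHE_BACKEND-style URI into (scheme, location, params).

    Hypothetical helper for illustration only; it is not Django's parser.
    """
    parts = urlsplit(uri)
    # For "db://tablename/" the location is the host part; for
    # "file:///var/tmp/django_cache/" the host is empty and the path matters.
    location = parts.netloc or parts.path
    params = dict(parse_qsl(parts.query))
    return parts.scheme, location, params


def entries_to_cull(max_entries, cull_percentage):
    """Number of entries dropped once max_entries is reached.

    Follows the rule described above: 1/cull_percentage of the entries are
    culled, and 0 means the whole cache is dumped.
    """
    if cull_percentage == 0:
        return max_entries
    return max_entries // cull_percentage


scheme, location, params = split_cache_uri(
    "memcached://127.0.0.1:11211/?timeout=60"
)
print(scheme, location, params)
print(entries_to_cull(300, 3))
```

Note that the argument values come back as strings here; a real backend would
still have to convert and validate them.
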
The per-site cache
==================

Once the cache is set up, the simplest way to use the cache is to cache your
entire site. Just add ``django.middleware.cache.CacheMiddleware`` to your
``MIDDLEWARE_CLASSES`` setting, as in this example::

    MIDDLEWARE_CLASSES = (
        "django.middleware.cache.CacheMiddleware",
        "django.middleware.common.CommonMiddleware",
    )

(The order of ``MIDDLEWARE_CLASSES`` matters. See "Order of MIDDLEWARE_CLASSES"
below.)

Then, add the following required settings to your Django settings file:

* ``CACHE_MIDDLEWARE_SECONDS`` -- The number of seconds each page should be
  cached.
* ``CACHE_MIDDLEWARE_KEY_PREFIX`` -- If the cache is shared across multiple
  sites using the same Django installation, set this to the name of the site,
  or some other string that is unique to this Django instance, to prevent key
  collisions. Use an empty string if you don't care.

The cache middleware caches every page that doesn't have GET or POST
parameters. Additionally, ``CacheMiddleware`` automatically sets a few headers
in each ``HttpResponse``:

* Sets the ``Last-Modified`` header to the current date/time when a fresh
  (uncached) version of the page is requested.
* Sets the ``Expires`` header to the current date/time plus the defined
  ``CACHE_MIDDLEWARE_SECONDS``.
* Sets the ``Cache-Control`` header to give a max age for the page -- again,
  from the ``CACHE_MIDDLEWARE_SECONDS`` setting.

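As a rough illustration of the three headers listed above, the following
sketch computes their values with the standard library. The function name
``sketch_cache_headers`` and the value 600 are assumptions made for this
example; the middleware does the equivalent work internally, and its exact
output format may differ.

```python
import time
from email.utils import formatdate

CACHE_MIDDLEWARE_SECONDS = 600  # example value, chosen for illustration


def sketch_cache_headers(now, seconds):
    """Return the three headers described above for a fresh response.

    Hypothetical helper; Django's middleware does this work internally.
    """
    return {
        "Last-Modified": formatdate(now, usegmt=True),
        "Expires": formatdate(now + seconds, usegmt=True),
        "Cache-Control": "max-age=%d" % seconds,
    }


headers = sketch_cache_headers(time.time(), CACHE_MIDDLEWARE_SECONDS)
for name, value in headers.items():
    print("%s: %s" % (name, value))
```
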
See the `middleware documentation`_ for more on middleware.

.. _`middleware documentation`: http://www.djangoproject.com/documentation/middleware/

The per-page cache
==================

A more granular way to use the caching framework is by caching the output of
individual views. ``django.views.decorators.cache`` defines a ``cache_page``
decorator that will automatically cache the view's response for you. It's easy
to use::

    from django.views.decorators.cache import cache_page

    def slashdot_this(request):
        ...

    slashdot_this = cache_page(slashdot_this, 60 * 15)

Or, using Python 2.4's decorator syntax::

    @cache_page(60 * 15)
    def slashdot_this(request):
        ...

``cache_page`` takes a single argument: the cache timeout, in seconds. In the
above example, the result of the ``slashdot_this()`` view will be cached for 15
minutes.

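To show the idea behind a per-view cache, here is a simplified stand-in
written in plain Python. ``cache_page_sketch`` and the module-level ``_store``
dictionary are hypothetical; the sketch keys on a path string rather than a
real request object and is not how ``cache_page`` is implemented.

```python
import functools
import time

_store = {}  # stand-in for the configured cache backend


def cache_page_sketch(timeout):
    """Hypothetical decorator that mimics the idea behind cache_page."""
    def decorator(view):
        @functools.wraps(view)
        def wrapper(request_path):
            entry = _store.get(request_path)
            if entry is not None and entry[0] > time.time():
                return entry[1]  # cache hit: skip the view entirely
            response = view(request_path)
            _store[request_path] = (time.time() + timeout, response)
            return response
        return wrapper
    return decorator


calls = []


@cache_page_sketch(60 * 15)
def slashdot_this(request_path):
    calls.append(request_path)  # record each real execution of the view
    return "rendered %s" % request_path


slashdot_this("/stories/")
slashdot_this("/stories/")
print(len(calls))
```

The second call is served from ``_store``, so the view body runs only once
within the 15-minute window.
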
The low-level cache API
=======================

Sometimes, however, caching an entire rendered page doesn't gain you very much.
For example, you may find it's only necessary to cache the result of an
intensive database query. In cases like this, you can use the low-level cache
API to store objects in the cache with any level of granularity you like.

The cache API is simple::

    # The cache module exports a cache object that's automatically
    # created from the CACHE_BACKEND setting.
    >>> from django.core.cache import cache

    # The basic interface is set(key, value, timeout_seconds) and get(key).
    >>> cache.set('my_key', 'hello, world!', 30)
    >>> cache.get('my_key')
    'hello, world!'

    # (Wait 30 seconds...)
    >>> cache.get('my_key')
    None

    # get() can take a default argument.
    >>> cache.get('my_key', 'has_expired')
    'has_expired'

    # There's also a get_many() interface that only hits the cache once.
    # Also, note that the timeout argument is optional and defaults to what
    # you've given in the settings file.
    >>> cache.set('a', 1)
    >>> cache.set('b', 2)
    >>> cache.set('c', 3)

    # get_many() returns a dictionary with all the keys you asked for that
    # actually exist in the cache (and haven't expired).
    >>> cache.get_many(['a', 'b', 'c'])
    {'a': 1, 'b': 2, 'c': 3}

    # There's also a way to delete keys explicitly.
    >>> cache.delete('a')

That's it. The cache has very few restrictions: You can cache any object that
can be pickled safely, although keys must be strings.

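A common way to apply this API to an expensive query is the
"get, and on a miss compute and set" pattern. The sketch below uses a tiny
in-memory stand-in, ``SketchCache``, so that it runs without Django; the class,
``expensive_query`` and ``top_stories`` are all hypothetical names. In a real
project you would use the ``cache`` object from ``django.core.cache`` instead.

```python
import time


class SketchCache:
    """Tiny in-memory stand-in with the get/set/delete interface shown above.

    It exists only so this example runs without Django.
    """

    def __init__(self):
        self._data = {}

    def set(self, key, value, timeout=300):
        self._data[key] = (time.time() + timeout, value)

    def get(self, key, default=None):
        expires, value = self._data.get(key, (0, None))
        if expires <= time.time():
            self._data.pop(key, None)  # drop missing or expired entries
            return default
        return value

    def delete(self, key):
        self._data.pop(key, None)


cache = SketchCache()
query_runs = []


def expensive_query():
    """Hypothetical placeholder for an intensive database query."""
    query_runs.append(1)
    return ["story 1", "story 2"]


def top_stories():
    stories = cache.get("top_stories")
    if stories is None:  # miss or expired: do the work and store it
        stories = expensive_query()
        cache.set("top_stories", stories, 60)
    return stories


top_stories()
top_stories()
print(len(query_runs))
```

One caveat of this pattern: a cached value of ``None`` looks the same as a
miss, so it suits results that are never ``None``.
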
Controlling cache: Using Vary headers
=====================================

The Django cache framework works with `HTTP Vary headers`_ to allow developers
to instruct caching mechanisms to vary their cache contents depending on
request HTTP headers.

Essentially, the ``Vary`` response HTTP header defines which request headers a
cache mechanism should take into account when building its cache key.

By default, Django's cache system creates its cache keys using the requested
path -- e.g., ``"/stories/2005/jun/23/bank_robbed/"``. This means every request
to that URL will use the same cached version, regardless of user-agent
differences such as cookies or language preferences.

That's where ``Vary`` comes in.

If your Django-powered page outputs different content based on some difference
in request headers -- such as a cookie, or language, or user-agent -- you'll
need to use the ``Vary`` header to tell caching mechanisms that the page output
depends on those things.

To do this in Django, use the convenient ``vary_on_headers`` view decorator,
like so::

    from django.views.decorators.vary import vary_on_headers

    # Python 2.3 syntax.
    def my_view(request):
        ...
    my_view = vary_on_headers(my_view, 'User-Agent')

    # Python 2.4 decorator syntax.
    @vary_on_headers('User-Agent')
    def my_view(request):
        ...

In this case, a caching mechanism (such as Django's own cache middleware) will
cache a separate version of the page for each unique user-agent.

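The effect of ``Vary`` on a cache key can be sketched in a few lines. The
function ``sketch_cache_key`` and its key format are assumptions made for this
example and do not reflect the keys Django actually generates; the point is
only that the named request headers become part of the key.

```python
import hashlib


def sketch_cache_key(path, request_headers, vary_on):
    """Build a key from the path plus the request headers named in Vary.

    Hypothetical illustration; the key format is not Django's.
    """
    # Header names are case-insensitive, so normalise before looking them up.
    lowered = {name.lower(): value for name, value in request_headers.items()}
    digest = hashlib.sha256()
    for name in sorted(h.lower() for h in vary_on):
        digest.update(("%s=%s;" % (name, lowered.get(name, ""))).encode("utf-8"))
    return "%s.%s" % (path, digest.hexdigest())


path = "/stories/2005/jun/23/bank_robbed/"
firefox = sketch_cache_key(path, {"User-Agent": "Firefox"}, ["User-Agent"])
safari = sketch_cache_key(path, {"user-agent": "Safari"}, ["User-Agent"])
print(firefox != safari)
```

With an empty ``vary_on`` list, both requests would map to the same key, which
is the default path-only behaviour described earlier.
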
The advantage to using the ``vary_on_headers`` decorator rather than manually
setting the ``Vary`` header (using something like
``response['Vary'] = 'user-agent'``) is that the decorator adds to the ``Vary``
header (which may already exist) rather than setting it from scratch.

Note that you can pass multiple headers to ``vary_on_headers()``::

    @vary_on_headers('User-Agent', 'Cookie')
    def my_view(request):
        ...

Because varying on cookie is such a common case, there's a ``vary_on_cookie``
decorator. These two views are equivalent::

    @vary_on_cookie
    def my_view(request):
        ...

    @vary_on_headers('Cookie')
    def my_view(request):
        ...

Also note that the headers you pass to ``vary_on_headers`` are not case
sensitive. ``"User-Agent"`` is the same thing as ``"user-agent"``.

You can also use a helper function, ``patch_vary_headers()``, directly::

    from django.utils.cache import patch_vary_headers
    def my_view(request):
        ...
        response = render_to_response('template_name', context)
        patch_vary_headers(response, ['Cookie'])
        return response

``patch_vary_headers`` takes an ``HttpResponse`` instance as its first argument
and a list/tuple of header names as its second argument.

.. _`HTTP Vary headers`: http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.44

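The "add to the header rather than overwrite it" behaviour can be sketched as
a small string merge. ``merge_vary`` is a hypothetical function written for
this example, not Django's implementation; it works on the header value
directly instead of on an ``HttpResponse``.

```python
def merge_vary(existing, new_headers):
    """Return a Vary value with new_headers appended to the existing value.

    Hypothetical sketch of the merging idea, not Django's implementation.
    """
    current = [h.strip() for h in existing.split(",") if h.strip()]
    seen = {h.lower() for h in current}
    for header in new_headers:
        if header.lower() not in seen:  # header names are case-insensitive
            current.append(header)
            seen.add(header.lower())
    return ", ".join(current)


print(merge_vary("Accept-Encoding", ["Cookie"]))
print(merge_vary("cookie", ["Cookie", "User-Agent"]))
```
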
Controlling cache: Using other headers
======================================

Another problem with caching is the privacy of data and the question of where
data should be stored in a cascade of caches.

A user usually faces two kinds of caches: their own browser cache (a private
cache) and their provider's cache (a public cache). A public cache is used by
multiple users and controlled by someone else. This poses problems with
sensitive data: You don't want, say, your bank account number stored in a
public cache. So Web applications need a way to tell caches which data is
private and which is public.

The solution is to indicate a page's cache should be "private." To do this in
Django, use the ``cache_control`` view decorator. Example::

    from django.views.decorators.cache import cache_control
    @cache_control(private=True)
    def my_view(request):
        ...

This decorator takes care of sending out the appropriate HTTP header behind the
scenes.

There are a few other ways to control cache parameters. For example, HTTP
allows applications to do the following:

* Define the maximum time a page should be cached.
* Specify whether a cache should always check for newer versions, only
  delivering the cached content when there are no changes. (Some caches
  might deliver cached content even if the server page changed -- simply
  because the cache copy isn't yet expired.)

In Django, use the ``cache_control`` view decorator to specify these cache
parameters. In this example, ``cache_control`` tells caches to revalidate the
cache on every access and to store cached versions for, at most, 3600 seconds::

    from django.views.decorators.cache import cache_control
    @cache_control(must_revalidate=True, max_age=3600)
    def my_view(request):
        ...

Any valid ``Cache-Control`` directive is valid in ``cache_control()``. For a
full list, see the `Cache-Control spec`_. Just pass the directives as keyword
arguments to ``cache_control()``, substituting underscores for hyphens. For
directives that don't take an argument, set the argument to ``True``.

Examples:

* ``@cache_control(max_age=3600)`` turns into ``max-age=3600``.
* ``@cache_control(public=True)`` turns into ``public``.

(Note that the caching middleware already sets the cache header's max-age with
the value of the ``CACHE_MIDDLEWARE_SECONDS`` setting. If you use a custom
``max_age`` in a ``cache_control`` decorator, the decorator will take
precedence, and the header values will be merged correctly.)

.. _`Cache-Control spec`: http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9

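The naming rule for directives (underscores become hyphens, and ``True``
means a directive without an argument) can be sketched as follows.
``sketch_cache_control`` is a hypothetical helper that only renders a header
value; it is not the ``cache_control`` decorator and does not merge with an
existing header.

```python
def sketch_cache_control(**directives):
    """Render keyword arguments as a Cache-Control header value.

    Hypothetical helper showing the naming rule only.
    """
    parts = []
    for name, value in directives.items():
        directive = name.replace("_", "-")
        if value is True:
            parts.append(directive)  # directive without an argument
        else:
            parts.append("%s=%s" % (directive, value))
    return ", ".join(parts)


print(sketch_cache_control(max_age=3600))
print(sketch_cache_control(must_revalidate=True, max_age=3600))
```
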
Other optimizations
===================

Django comes with a few other pieces of middleware that can help optimize your
apps' performance:

* ``django.middleware.http.ConditionalGetMiddleware`` adds support for
  conditional GET. This makes use of ``ETag`` and ``Last-Modified``
  headers.

* ``django.middleware.gzip.GZipMiddleware`` compresses content for browsers
  that understand gzip compression (all modern browsers).

Order of MIDDLEWARE_CLASSES
===========================

If you use ``CacheMiddleware``, it's important to put it in the right place
within the ``MIDDLEWARE_CLASSES`` setting, because the cache middleware needs
to know the headers by which to vary the cache storage. Middleware always
adds something to the ``Vary`` response header when it can.

Put the ``CacheMiddleware`` after any middlewares that might add something to
the ``Vary`` header. The following middlewares do so:

* ``SessionMiddleware`` adds ``Cookie``
* ``GZipMiddleware`` adds ``Accept-Encoding``

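
As a small illustration of the ordering rule above, a settings fragment might
look like the following. It uses only the two middleware paths that appear
elsewhere in this document; treat it as a sketch of the stated rule rather
than a complete or recommended configuration.

```python
MIDDLEWARE_CLASSES = (
    # Adds Accept-Encoding to the Vary header.
    "django.middleware.gzip.GZipMiddleware",
    # Listed after the middleware above, following the rule described here.
    "django.middleware.cache.CacheMiddleware",
)
```
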