| 5 | | So, you got slashdotted_. Now what? |
|---|
| 6 | | |
|---|
| 7 | | Django's cache framework gives you three methods of caching dynamic pages in |
|---|
| 8 | | memory or in a database. You can cache the output of specific views, you can |
|---|
| 9 | | cache only the pieces that are difficult to produce, or you can cache your |
|---|
| 10 | | entire site. |
|---|
| 11 | | |
|---|
| 12 | | .. _slashdotted: http://en.wikipedia.org/wiki/Slashdot_effect |
|---|
| | 5 | A fundamental tradeoff in dynamic Web sites is, well, they're dynamic. Each |
|---|
| | 6 | time a user requests a page, the Web server makes all sorts of calculations -- |
|---|
| | 7 | from database queries to template rendering to business logic -- to create the |
|---|
| | 8 | page that your site's visitor sees. This is a lot more expensive, from a |
|---|
| | 9 | processing-overhead perspective, than your standard read-a-file-off-the-filesystem |
|---|
| | 10 | server arrangement. |
|---|
| | 11 | |
|---|
| | 12 | For most Web applications, this overhead isn't a big deal. Most Web |
|---|
| | 13 | applications aren't washingtonpost.com or slashdot.org; they're simply small- |
|---|
| | 14 | to medium-sized sites with so-so traffic. But for medium- to high-traffic |
|---|
| | 15 | sites, it's essential to cut as much overhead as possible. |
|---|
| | 16 | |
|---|
| | 17 | That's where caching comes in. |
|---|
| | 18 | |
|---|
| | 19 | To cache something is to save the result of an expensive calculation so that |
|---|
| | 20 | you don't have to perform the calculation next time. Here's some pseudocode |
|---|
| | 21 | explaining how this would work for a dynamically generated Web page: |
|---|
| | 22 | |
|---|
| | 23 |     given a URL, try finding that page in the cache |
|---|
| | 24 |     if the page is in the cache: |
|---|
| | 25 |         return the cached page |
|---|
| | 26 |     else: |
|---|
| | 27 |         generate the page |
|---|
| | 28 |         save the generated page in the cache (for next time) |
|---|
| | 29 |         return the generated page |
|---|
| | 30 | |
|---|
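The pseudocode above maps almost line for line onto Python. The following is only a minimal sketch of the idea, not Django's implementation: the module-level dictionary ``_page_cache`` and the ``get_page`` and ``generate`` names are hypothetical, and a real cache would also expire old entries.

```python
# Hypothetical in-process store mapping a URL to its rendered page.
# This illustrates the pseudocode only; it is not Django's cache API.
_page_cache = {}


def get_page(url, generate):
    """Return the page for ``url``, calling ``generate(url)`` only on a miss."""
    if url in _page_cache:
        # The page is in the cache: return the cached copy.
        return _page_cache[url]
    # Cache miss: do the expensive work once...
    page = generate(url)
    # ...save the result for next time...
    _page_cache[url] = page
    # ...and return the freshly generated page.
    return page
```

Calling ``get_page`` twice with the same URL runs ``generate`` only the first time; the second call is answered from the dictionary.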
| | 31 | Django comes with a robust cache system that lets you save dynamic pages so |
|---|
| | 32 | they don't have to be calculated for each request. For convenience, Django |
|---|
| | 33 | offers different levels of cache granularity: You can cache the output of |
|---|
| | 34 | specific views, you can cache only the pieces that are difficult to produce, or |
|---|
| | 35 | you can cache your entire site. |
|---|
| | 36 | |
|---|
| | 37 | Django also works well with "upstream" caches, such as Squid |
|---|
| | 38 | (http://www.squid-cache.org/) and browser-based caches. These are the types of |
|---|
| | 39 | caches that you don't directly control but to which you can provide hints (via |
|---|
| | 40 | HTTP headers) about which parts of your site should be cached, and how. |
|---|
| 17 | | The cache framework allows for different "backends" -- different methods of |
|---|
| 18 | | caching data. There's a simple single-process memory cache (mostly useful as a |
|---|
| 19 | | fallback) and a memcached_ backend (the fastest option, by far, if you've got |
|---|
| 20 | | the RAM). |
|---|
| 21 | | |
|---|
| 22 | | Before using the cache, you'll need to tell Django which cache backend you'd |
|---|
| 23 | | like to use. Do this by setting the ``CACHE_BACKEND`` in your settings file. |
|---|
| 24 | | |
|---|
| 25 | | The ``CACHE_BACKEND`` setting is a "fake" URI (really an unregistered scheme). |
|---|
| 26 | | Examples: |
|---|
| 27 | | |
|---|
| 28 | | ============================== =========================================== |
|---|
| 29 | | CACHE_BACKEND Explanation |
|---|
| 30 | | ============================== =========================================== |
|---|
| 31 | | memcached://127.0.0.1:11211/ A memcached backend; the server is running |
|---|
| 32 | | on localhost port 11211. You can use |
|---|
| 33 | | multiple memcached servers by separating |
|---|
| 34 | | them with semicolons. |
|---|
| 35 | | |
|---|
| 36 | | This backend requires the |
|---|
| 37 | | `Python memcached bindings`_. |
|---|
| 38 | | |
|---|
| 39 | | db://tablename/ A database backend in a table named |
|---|
| 40 | | "tablename". This table should be created |
|---|
| 41 | | with "django-admin createcachetable". |
|---|
| 42 | | |
|---|
| 43 | | file:///var/tmp/django_cache/ A file-based cache stored in the directory |
|---|
| 44 | | /var/tmp/django_cache/. |
|---|
| 45 | | |
|---|
| 46 | | simple:/// A simple single-process memory cache; you |
|---|
| 47 | | probably don't want to use this except for |
|---|
| 48 | | testing. Note that this cache backend is |
|---|
| 49 | | NOT thread-safe! |
|---|
| 50 | | |
|---|
| 51 | | locmem:/// A more sophisticated local memory cache; |
|---|
| 52 | | this is multi-process- and thread-safe. |
|---|
| 53 | | |
|---|
| 54 | | dummy:/// Doesn't actually cache; just implements the |
|---|
| 55 | | cache backend interface and doesn't do |
|---|
| 56 | | anything. This is an easy way to turn off |
|---|
| 57 | | caching for a test environment. |
|---|
| 58 | | ============================== =========================================== |
|---|
| 59 | | |
|---|
| 60 | | All caches may take arguments -- they're given in query-string style. Valid |
|---|
| 61 | | arguments are: |
|---|
| | 45 | The cache system requires a small amount of setup. Namely, you have to tell it |
|---|
| | 46 | where your cached data should live -- whether in a database, on the filesystem |
|---|
| | 47 | or directly in memory. This is an important decision that affects your cache's |
|---|
| | 48 | performance; yes, some cache types are faster than others. |
|---|
| | 49 | |
|---|
| | 50 | Your cache preference goes in the ``CACHE_BACKEND`` setting in your settings |
|---|
| | 51 | file. Here's an explanation of all available values for ``CACHE_BACKEND``. |
|---|
| | 52 | |
|---|
| | 53 | Memcached |
|---|
| | 54 | --------- |
|---|
| | 55 | |
|---|
| | 56 | By far the fastest, most efficient type of cache available to Django, Memcached |
|---|
| | 57 | is an entirely memory-based cache framework originally developed to handle high |
|---|
| | 58 | loads at LiveJournal.com and subsequently open-sourced by Danga Interactive. |
|---|
| | 59 | It's used by sites such as Slashdot and Wikipedia to reduce database access and |
|---|
| | 60 | dramatically increase site performance. |
|---|
| | 61 | |
|---|
| | 62 | Memcached is available for free at http://danga.com/memcached/ . It runs as a |
|---|
| | 63 | daemon and is allotted a specified amount of RAM. All it does is provide an |
|---|
| | 64 | interface -- a *super-lightning-fast* interface -- for adding, retrieving and |
|---|
| | 65 | deleting arbitrary data in the cache. All data is stored directly in memory, |
|---|
| | 66 | so there's no overhead of database or filesystem usage. |
|---|
| | 67 | |
|---|
| | 68 | After installing Memcached itself, you'll need to install the Memcached Python |
|---|
| | 69 | bindings. They're in a single Python module, memcache.py, available at |
|---|
| | 70 | ftp://ftp.tummy.com/pub/python-memcached/ . If that URL is no longer valid, |
|---|
| | 71 | just go to the Memcached Web site (http://www.danga.com/memcached/) and get the |
|---|
| | 72 | Python bindings from the "Client APIs" section. |
|---|
| | 73 | |
|---|
| | 74 | To use Memcached with Django, set ``CACHE_BACKEND`` to |
|---|
| | 75 | ``memcached://ip:port/``, where ``ip`` is the IP address of the Memcached |
|---|
| | 76 | daemon and ``port`` is the port on which Memcached is running. |
|---|
| | 77 | |
|---|
| | 78 | In this example, Memcached is running on localhost (127.0.0.1) port 11211:: |
|---|
| | 79 | |
|---|
| | 80 | CACHE_BACKEND = 'memcached://127.0.0.1:11211/' |
|---|
| | 81 | |
|---|
| | 82 | One excellent feature of Memcached is its ability to share cache over multiple |
|---|
| | 83 | servers. To take advantage of this feature, include all server addresses in |
|---|
| | 84 | ``CACHE_BACKEND``, separated by semicolons. In this example, the cache is |
|---|
| | 85 | shared over Memcached instances running on IP addresses 172.19.26.240 and |
|---|
| | 86 | 172.19.26.242, both on port 11211:: |
|---|
| | 87 | |
|---|
| | 88 | CACHE_BACKEND = 'memcached://172.19.26.240:11211;172.19.26.242:11211/' |
|---|
| | 89 | |
|---|
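To make the shape of that setting concrete, here is a rough sketch of how a ``CACHE_BACKEND`` value with several servers could be split apart. The ``parse_backend_uri`` helper is hypothetical and written only for illustration; it is not Django's own parser, and the real one may treat edge cases differently.

```python
from urllib.parse import parse_qsl


def parse_backend_uri(uri):
    """Split a CACHE_BACKEND-style value into (scheme, locations, params).

    Hypothetical helper for illustration only; not part of Django.
    """
    scheme, separator, rest = uri.partition("://")
    if not separator:
        raise ValueError("expected a value of the form scheme://...")
    # Anything after "?" is treated as query-string style arguments.
    location, _, query = rest.partition("?")
    # A trailing slash is optional, so drop it before splitting.
    location = location.rstrip("/")
    # Multiple servers are separated by semicolons.
    locations = [part for part in location.split(";") if part]
    return scheme, locations, dict(parse_qsl(query))


scheme, servers, params = parse_backend_uri(
    "memcached://172.19.26.240:11211;172.19.26.242:11211/"
)
```

With the two-server setting from the example above, ``scheme`` is ``'memcached'`` and ``servers`` is a list holding the two ``ip:port`` strings.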
| | 90 | Memory-based caching has one disadvantage: Because the cached data is stored in |
|---|
| | 91 | memory, the data will be lost if your server crashes. Clearly, memory isn't |
|---|
| | 92 | intended for permanent data storage, so don't rely on memory-based caching as |
|---|
| | 93 | your only data storage. Actually, none of the Django caching backends should be |
|---|
| | 94 | used for permanent storage -- they're all intended to be solutions for caching, |
|---|
| | 95 | not storage -- but we point this out here because memory-based caching is |
|---|
| | 96 | particularly temporary. |
|---|
| | 97 | |
|---|
| | 98 | Database caching |
|---|
| | 99 | ---------------- |
|---|
| | 100 | |
|---|
| | 101 | To use a database table as your cache backend, first create a cache table in |
|---|
| | 102 | your database by running this command:: |
|---|
| | 103 | |
|---|
| | 104 | python manage.py createcachetable [cache_table_name] |
|---|
| | 105 | |
|---|
| | 106 | ...where ``[cache_table_name]`` is the name of the database table to create. |
|---|
| | 107 | (This name can be whatever you want, as long as it's a valid table name that's |
|---|
| | 108 | not already being used in your database.) This command creates a single table |
|---|
| | 109 | in your database that is in the proper format that Django's database-cache |
|---|
| | 110 | system expects. |
|---|
| | 111 | |
|---|
| | 112 | Once you've created that database table, set your ``CACHE_BACKEND`` setting to |
|---|
| | 113 | ``"db://tablename/"``, where ``tablename`` is the name of the database table. |
|---|
| | 114 | In this example, the cache table's name is ``my_cache_table``:: |
|---|
| | 115 | |
|---|
| | 116 | CACHE_BACKEND = 'db://my_cache_table' |
|---|
| | 117 | |
|---|
| | 118 | Database caching works best if you've got a fast, well-indexed database server. |
|---|
| | 119 | |
|---|
| | 120 | Filesystem caching |
|---|
| | 121 | ------------------ |
|---|
| | 122 | |
|---|
| | 123 | To store cached items on a filesystem, use the ``"file://"`` cache type for |
|---|
| | 124 | ``CACHE_BACKEND``. For example, to store cached data in ``/var/tmp/django_cache``, |
|---|
| | 125 | use this setting:: |
|---|
| | 126 | |
|---|
| | 127 | CACHE_BACKEND = 'file:///var/tmp/django_cache' |
|---|
| | 128 | |
|---|
| | 129 | Note that there are three forward slashes toward the beginning of that example. |
|---|
| | 130 | The first two are for ``file://``, and the third is the first character of the |
|---|
| | 131 | directory path, ``/var/tmp/django_cache``. |
|---|
| | 132 | |
|---|
| | 133 | The directory path should be absolute -- that is, it should start at the root |
|---|
| | 134 | of your filesystem. It doesn't matter whether you put a slash at the end of the |
|---|
| | 135 | setting. |
|---|
| | 136 | |
|---|
| | 137 | Make sure the directory pointed to by this setting exists and is readable and |
|---|
| | 138 | writable by the system user under which your Web server runs. Continuing the |
|---|
| | 139 | above example, if your server runs as the user ``apache``, make sure the |
|---|
| | 140 | directory ``/var/tmp/django_cache`` exists and is readable and writable by the |
|---|
| | 141 | user ``apache``. |
|---|
| | 142 | |
|---|
| | 143 | Local-memory caching |
|---|
| | 144 | -------------------- |
|---|
| | 145 | |
|---|
| | 146 | If you want the speed advantages of in-memory caching but don't have the |
|---|
| | 147 | capability of running Memcached, consider the local-memory cache backend. This |
|---|
| | 148 | cache is multi-process and thread-safe. To use it, set ``CACHE_BACKEND`` to |
|---|
| | 149 | ``"locmem:///"``. For example:: |
|---|
| | 150 | |
|---|
| | 151 | CACHE_BACKEND = 'locmem:///' |
|---|
| | 152 | |
|---|
| | 153 | Simple caching (for development) |
|---|
| | 154 | -------------------------------- |
|---|
| | 155 | |
|---|
| | 156 | A simple, single-process memory cache is available as ``"simple:///"``. This |
|---|
| | 157 | merely saves cached data in-process, which means it should only be used in |
|---|
| | 158 | development or testing environments. For example:: |
|---|
| | 159 | |
|---|
| | 160 | CACHE_BACKEND = 'simple:///' |
|---|
| | 161 | |
|---|
| | 162 | Dummy caching (for development) |
|---|
| | 163 | ------------------------------- |
|---|
| | 164 | |
|---|
| | 165 | Finally, Django comes with a "dummy" cache that doesn't actually cache -- it |
|---|
| | 166 | just implements the cache interface without doing anything. |
|---|
| | 167 | |
|---|
| | 168 | This is useful if you have a production site that uses heavy-duty caching in |
|---|
| | 169 | various places but a development/test environment on which you don't want to |
|---|
| | 170 | cache. In that case, set ``CACHE_BACKEND`` to ``"dummy:///"`` in the settings |
|---|
| | 171 | file for your development environment. As a result, your development |
|---|
| | 172 | environment won't use caching and your production environment still will. |
|---|
| | 173 | |
|---|
| | 174 | CACHE_BACKEND arguments |
|---|
| | 175 | ----------------------- |
|---|
| | 176 | |
|---|
| | 177 | All caches may take arguments. They're given in query-string style on the |
|---|
| | 178 | ``CACHE_BACKEND`` setting. Valid arguments are: |
|---|
| 199 | | Controlling cache: Using Vary headers |
|---|
| 200 | | ===================================== |
|---|
| 201 | | |
|---|
| 202 | | The Django cache framework works with `HTTP Vary headers`_ to allow developers |
|---|
| 203 | | to instruct caching mechanisms to vary their cache contents depending on |
|---|
| 204 | | request HTTP headers. |
|---|
| 205 | | |
|---|
| 206 | | Essentially, the ``Vary`` response HTTP header defines which request headers a |
|---|
| 207 | | cache mechanism should take into account when building its cache key. |
|---|
| | 328 | Upstream caches |
|---|
| | 329 | =============== |
|---|
| | 330 | |
|---|
| | 331 | So far, this document has focused on caching your *own* data. But another type |
|---|
| | 332 | of caching is relevant to Web development, too: caching performed by "upstream" |
|---|
| | 333 | caches. These are systems that cache pages for users even before the request |
|---|
| | 334 | reaches your Web site. |
|---|
| | 335 | |
|---|
| | 336 | Here are a few examples of upstream caches: |
|---|
| | 337 | |
|---|
| | 338 | * Your ISP may cache certain pages, so if you requested a page from |
|---|
| | 339 | somedomain.com, your ISP would send you the page without having to access |
|---|
| | 340 | somedomain.com directly. |
|---|
| | 341 | |
|---|
| | 342 | * Your Django Web site may sit behind a Squid Web proxy |
|---|
| | 343 | (http://www.squid-cache.org/) that caches pages for performance. In this |
|---|
| | 344 | case, each request first would be handled by Squid, and it'd only be |
|---|
| | 345 | passed to your application if needed. |
|---|
| | 346 | |
|---|
| | 347 | * Your Web browser caches pages, too. If a Web page sends out the right |
|---|
| | 348 | headers, your browser will use the local (cached) copy for subsequent |
|---|
| | 349 | requests to that page. |
|---|
| | 350 | |
|---|
| | 351 | Upstream caching is a nice efficiency boost, but there's a danger to it: |
|---|
| | 352 | Many Web pages' contents differ based on authentication and a host of other |
|---|
| | 353 | variables, and cache systems that blindly save pages based purely on URLs could |
|---|
| | 354 | expose incorrect or sensitive data to subsequent visitors to those pages. |
|---|
| | 355 | |
|---|
| | 356 | For example, say you operate a Web e-mail system, and the contents of the |
|---|
| | 357 | "inbox" page obviously depend on which user is logged in. If an ISP blindly |
|---|
| | 358 | cached your site, then the first user who logged in through that ISP would have |
|---|
| | 359 | his user-specific inbox page cached for subsequent visitors to the site. That's |
|---|
| | 360 | not cool. |
|---|
| | 361 | |
|---|
| | 362 | Fortunately, HTTP provides a solution to this problem: A set of HTTP headers |
|---|
| | 363 | exists to instruct caching mechanisms to vary their cache contents depending |
|---|
| | 364 | on designated variables, and to tell caching mechanisms not to cache particular |
|---|
| | 365 | pages. |
|---|
| | 366 | |
|---|
| | 367 | Using Vary headers |
|---|
| | 368 | ================== |
|---|
| | 369 | |
|---|
| | 370 | One of these headers is ``Vary``. It defines which request headers a cache |
|---|
| | 371 | mechanism should take into account when building its cache key. For example, if |
|---|
| | 372 | the contents of a Web page depend on a user's language preference, the page is |
|---|
| | 373 | said to "vary on language." |
|---|
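As an illustration of what "building a cache key" from ``Vary`` can mean, here is a minimal sketch. The ``vary_cache_key`` function is hypothetical and is not how Django or any particular proxy computes its keys; it only shows the principle that the request headers named in ``Vary`` become part of the key, while all other headers are ignored.

```python
import hashlib


def vary_cache_key(url, request_headers, vary):
    """Build a cache key from the URL plus the request headers named in Vary.

    Hypothetical helper for illustration; real caches use their own schemes.
    """
    # HTTP header names are case-insensitive, so normalize them first.
    headers = {name.lower(): value for name, value in request_headers.items()}
    names = sorted(
        name.strip().lower() for name in vary.split(",") if name.strip()
    )
    digest = hashlib.sha256(url.encode("utf-8"))
    for name in names:
        # A header missing from the request contributes an empty value.
        value = headers.get(name, "")
        digest.update(("\n%s:%s" % (name, value)).encode("utf-8"))
    return digest.hexdigest()


english = vary_cache_key("/news/", {"Accept-Language": "en"}, "Accept-Language")
german = vary_cache_key("/news/", {"Accept-Language": "de"}, "Accept-Language")
```

Here ``english`` and ``german`` are different keys, so a page that varies on language is stored once per ``Accept-Language`` value rather than once per URL.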