Changes between Version 1 and Version 2 of ObjectLevelCaching
Timestamp: May 27, 2007, 12:44:42 PM
ObjectLevelCaching
''This is the original Django GSoC proposal. There have been quite a few
revisions since, but I'm posting this first for reference.''

= Abstract =

This addition to Django's ORM adds **simple drop-in caching**, compatible with
nearly all existing `QuerySet` methods. It emphasizes
performance and compatibility, and provides configuration options with sane
defaults. All that is required for basic functionality is a suitable
`CACHE_BACKEND` setting and the addition of `.cache()` to the appropriate
`QuerySet` chains. It also speeds up the lookup of related objects, and even
that of [http://www.djangoproject.com/documentation/models/generic_relations generic relations].

= Proposed Design =

The `QuerySet` class grows two new methods to add object caching:

== cache() ==
{{{
cache(timeout=None, prefix='qscache:', smart=False)
}}}

`timeout` defaults to the amount specified in `CACHE_BACKEND`.
`prefix` is applied in addition to `CACHE_MIDDLEWARE_KEY_PREFIX`.

Cache keys are calculated with the content-type id and instance id, to
accommodate generic relations.

Internally, `QuerySet` grows some new attributes that affect how SQL is
generated. When in effect, they cause the query to only retrieve primary
keys of selected objects. `in_bulk()` uses the cache directly, although
cache misses will still require database hits, as usual. Methods such as
`delete()` and `count()` are largely unaffected by `cache()`, but
methods such as `distinct()` are a more difficult case and will require
some design decisions. Using `extra(select=...)` is possibly an
unsolvable case as well.
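
The key scheme and the primary-key-only lookup described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the proposal's code: the key layout (`prefix`, content-type id, instance id joined with a colon), the names `make_key` and `fetch_cached`, and the `load_from_db` callable are all hypothetical, and a plain dictionary stands in for the cache backend.

```python
# Hypothetical sketch: every name and the key layout are assumptions made
# for illustration; a dict-like object stands in for the real cache backend.

def make_key(content_type_id, object_id, prefix="qscache:"):
    """Build a cache key from the content-type id and the instance id."""
    return "%s%s:%s" % (prefix, content_type_id, object_id)


def fetch_cached(pks, content_type_id, cache, load_from_db, prefix="qscache:"):
    """Resolve primary keys to objects, hitting the database only for misses.

    `load_from_db` takes a list of primary keys and returns a {pk: object}
    dict; it is a stand-in for a real query.
    """
    keys = {pk: make_key(content_type_id, pk, prefix) for pk in pks}
    found = {pk: cache[key] for pk, key in keys.items() if key in cache}
    missing = [pk for pk in pks if pk not in found]
    if missing:
        for pk, obj in load_from_db(missing).items():
            cache[keys[pk]] = obj  # populate the cache for the next lookup
            found[pk] = obj
    # Preserve the order in which the primary keys were selected.
    return [found[pk] for pk in pks if pk in found]
```

Including the content-type id in the key is what would let a generic foreign key, which stores exactly that pair of ids, find the same cache entry as a regular lookup.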

If `values()` has been used in the query, `cache()` takes precedence
and creates the values dictionary from cache. If a list of fields is
specified in `values()`, `cache()` will still perform the equivalent of a
`SELECT *`. Perhaps another option could be added to allow retrieval
of only the specified fields, which would break any regular cached lookup
for that object.

`select_related()` is supported by the caching mechanism. The appropriate
joins are still performed by the database; if joins were calculated with
cached object foreign key values, cache misses could be very costly.

== cache_generic() ==
{{{
cache_generic(field, timeout=None, prefix='qscache:', smart=False)
}}}

`field` is the name of the generic foreign key field.

Without database-specific trickery it is non-trivial to perform SQL JOINs
with generic relations. Currently, a database query is required for each
generic foreign key relationship. The cache framework, while unable to
reduce the initial number of database hits, greatly alleviates load when
lists of generic objects are required. Using this method still loads
generic foreign keys lazily, but more quickly, and also uses objects cached
with `cache()`.

== Background logic ==

To achieve as much transparency as possible, the `QuerySet` methods quietly
establish `post_save` and `post_delete` signal listeners the first time a
model is cached. Object deletion is trivial.
On object creation or modification, the preferred behavior is to create or update the cached key
rather than simply deleting the key and letting the cache regenerate it;
the rationale is that the object is most likely to be viewed immediately after
and caching it at `post_save` is cheap. However, specific cases may not be
as accommodating. This is likely subject to debate or may need a global setting.

To reduce the number of cache misses, additional "smart" logic can be added.
For example, the first time a model is registered to the cache signal listener,
its model instances are expected to be uncached. In this case, rather than
fetching only primary keys, the objects are retrieved as normal (and cached).

By storing the expiration time, this can also take effect whenever the
cached objects have likely timed out. All "smart" functionality is enabled
using the `smart` keyword argument.

= Implementation Notes =

 * All caching code lives in a contrib app at first. A custom `QuerySet` class
   derives from the official class, overriding where appropriate. A `Manager`
   class with an overridden `get_query_set()` is used for testing, and
   additional middleware, etc. are located in the same folder. Near or upon
   completion, the new code can be merged to trunk as Django proper. Hopefully
   the code will not be too invasive, but quite a few `QuerySet` methods will
   have to be hijacked.

 * If the transaction middleware is enabled, it is desirable to have the cache
   only update when the transaction succeeds. This is simple in implementation
   but will couple the transaction middleware to the cache if not designed
   properly. An additional middleware class can be created to handle this
   case; however, it will have to stipulate placement immediately after the
   `TransactionMiddleware` in settings.py, and might be confused with the
   existing `CacheMiddleware`.

= Timeline =

== First Month ==

 * Write preliminary tests. Initial implementation of `cache()` for single
   objects. Support almost all typical `QuerySet` methods.

 * Devise a generic idiom for testing cache-related code. Work on aggregates;
   implement `select_related()`, `values()`, `in_bulk()` cases, and
   `cache_generic()` method.

== Second Month ==

 * Work on signal dispatching, cache coherency. Write more tests and preliminary
   documentation.

 * Write "smart" cache logic. Explore other possible optimizations.

 * Add transaction support. Design decision needed about extra middleware.

 * Implement extra features if possible (`distinct()`, `extra(select=...)`, ...)

== Last Month ==

 * Write up documentation, extensive tests, and example code. Possibly move from
   contrib into the main cache module.

 * Refactor, especially if the new `QuerySet` has been released. Continue
   merging with changes to trunk and testing.

 * Allow for wiggle room, `QuerySet` refactoring work, cleanup, etc.
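
As a rough illustration of the behavior described under Background logic, the sketch below shows listeners being attached once per model, an update-on-save policy, and deletion on `post_delete`. It is not the proposal's implementation: the `CacheCoherency` class and its method names are hypothetical, plain method calls stand in for Django's signal dispatch, and a dictionary stands in for the cache backend.

```python
# Hypothetical sketch: the class, its methods, and the key layout are
# assumptions for illustration; no real Django signal API is used here.

class CacheCoherency:
    def __init__(self, cache, prefix="qscache:"):
        self.cache = cache        # any dict-like store
        self.prefix = prefix
        self.registered = set()   # content types that already have listeners

    def key(self, content_type_id, pk):
        return "%s%s:%s" % (self.prefix, content_type_id, pk)

    def register(self, content_type_id):
        """Attach listeners the first time a model is cached.

        Returns True only on first registration, which is the moment the
        "smart" logic would fetch full objects instead of primary keys.
        """
        if content_type_id in self.registered:
            return False
        self.registered.add(content_type_id)
        return True

    def post_save(self, content_type_id, pk, instance):
        # Preferred policy: create or update the entry rather than delete it.
        self.cache[self.key(content_type_id, pk)] = instance

    def post_delete(self, content_type_id, pk):
        # Deletion is the trivial case: drop the key if it is present.
        self.cache.pop(self.key(content_type_id, pk), None)
```

Under the transaction concern raised in the Implementation Notes, calls such as `post_save` would presumably be queued and applied only after a successful commit; that part is left out of the sketch.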