Changes between Version 2 and Version 3 of ObjectLevelCaching
- Timestamp: May 28, 2007, 12:19:03 PM
ObjectLevelCaching
--- v2
+++ v3
- ''This is the original Django GSoC proposal. There have been quite a few
- revisions since, but I'm posting this first for reference.''
-
  = Abstract =

- This addition to Django's ORM adds **simple drop-in caching**, compatible with
+ This addition to Django's ORM adds simple drop-in caching, compatible with
  nearly all existing `QuerySet` methods. It emphasizes
  performance and compatibility, and providing configuration options with sane
  …
  = Proposed Design =

- The `QuerySet` class grows two new methods to add object caching:
+ The `QuerySet` class grows new methods to add object caching:

- == cache() ==
+ == .cache() ==
  {{{
  cache(timeout=None, prefix='qscache:', smart=False)
  }}}

+ This method causes model instances found in the returned
+ `QuerySet` to be cached individually; the cache key is
+ calculated using the contrib.contenttypes model id and the
+ instance's pk value. (This is all done lazily and the position
+ of `cache()` does not matter, to be consistent with other methods.)
+
  `timeout` defaults to the amount specified in `CACHE_BACKEND`.
  `prefix` is in addition to `CACHE_MIDDLEWARE_KEY_PREFIX`.

- Cache keys are calculated with the content-type id and instance id, to
- accommodate generic relations.
-
  Internally, `QuerySet` grows some new attributes that affect how SQL is
- generated. When in effect, they cause the query to only retrieve primary
+ generated. Use of `cache()` causes the query to retrieve only primary
  keys of selected objects. `in_bulk()` uses the cache directly, although
  cache misses will still require database hits, as usual. Methods such as
  …
  and creates the values dictionary from cache. If a list of fields is
  specified in `values()`, `cache()` will still perform the equivalent of a
- `SELECT *`.
- Perhaps another option could be added to allow retrieval
- of only the specified fields, which would break any regular cached lookup
- for that object.
+ `SELECT *`.

  `select_related()` is supported by the caching mechanism. The appropriate
  joins are still performed by the database; if joins were calculated with
- cached object foreign key values, cache misses could be very costly.
+ cached foreign key values, cache misses could become very costly.

- == cache_generic() ==
+ == .cache_related() ==
  {{{
- cache_generic(field, timeout=None, prefix='qscache:', smart=False)
+ cache_related(fields, timeout=None, prefix='qscache:', smart=False)
  }}}
+ `fields` is a name or list of foreign keys, many-to-many/one-to-one fields,
+ reverse-lookup fields, or generic foreign keys on the model. Model instances
+ pointed to by the given relation will be cached similarly to `cache()`.

- `field` is the name of the generic foreign key field.
+ I'm not sold on the signature of this method... *args would be nice
+ but then the other defaulted arguments would be replaced by **kwargs.

+ Also, the special string `'*'` could be accepted to cache all relations.
+ Either that or another method `cache_all_relations()`.
+
+ === Aside ===
  Without database-specific trickery it is non-trivial to perform SQL JOINs
  with generic relations. Currently, a database query is required for each
  …
  with `cache()`.

+ == .cache_set() ==
+ {{{
+ cache_set(cache_key, timeout=None, smart=False, depth=1)
+ }}}
+ Similar to taking the resulting QuerySet and storing it directly in the
+ cache. Overrides `cache()`, but does not cache relations.
+
+ If `select_related()` is used in the same `QuerySet`, `cache_set()` will
+ also cache the
+ If `cache_related()` is used in the same `QuerySet`, it overrides use of
+ `select_related()`.
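The per-object key scheme and the cache-first behaviour of `in_bulk()` described for `cache()` can be illustrated with a small sketch. This is not the proposal's implementation: a plain dict stands in for the cache backend, and `make_cache_key`, `in_bulk_cached`, and the `fetch_from_db` callable are hypothetical names chosen for illustration. The only details taken from the text are the `'qscache:'` default prefix, the use of the content-type id plus the instance pk as the key, and the fact that cache misses still require a database hit.

```python
# Illustrative sketch only: a plain dict stands in for the cache backend,
# and every name below is hypothetical rather than part of the proposal.


def make_cache_key(content_type_id, pk, prefix="qscache:"):
    """Build a per-object key from the content-type id and the instance pk."""
    return "%s%s:%s" % (prefix, content_type_id, pk)


def in_bulk_cached(cache, content_type_id, pks, fetch_from_db, prefix="qscache:"):
    """Return {pk: obj}, reading the cache first and the database only for misses.

    `fetch_from_db` is an assumed callable that takes a list of pks and
    returns a {pk: obj} dict, standing in for a real database query.
    """
    found = {}
    missing = []
    for pk in pks:
        key = make_cache_key(content_type_id, pk, prefix)
        if key in cache:
            found[pk] = cache[key]
        else:
            missing.append(pk)
    if missing:
        # Cache misses still cost a database hit; the results are stored
        # so that the next lookup for the same objects is served from cache.
        for pk, obj in fetch_from_db(missing).items():
            cache[make_cache_key(content_type_id, pk, prefix)] = obj
            found[pk] = obj
    return found
```

Including the content-type id in the key keeps objects of different models with the same pk from colliding, which is what makes the same scheme usable for generic relations.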
+
+ == Sample usage ==
+
+ {{{
+ >>> article.comment_set.cache_relation('author')
+ >>> my_city.restaurant_set.cache(smart=True)
+ >>> Article.objects.filter(created__gte=yesterday).cache_set('todaysarticles')
+ >>> tag = Tag.objects.cache_relation('content_object').get(slug='news')
+ }}}
+
  == Background logic ==
+
+ The implementation class contains a registry of models that have been requested
+ to cache (directly or via a relation).

  To achieve as much transparency as possible, the `QuerySet` methods quietly
  establish `post_save` and `post_delete` signal listeners the first time a
- model is cached. Object deletion is trivial. On object creation or
+ model is cached. Object deletion is handled trivially. On object creation or
  modification, the preferred behavior is to create or update the cached key
  rather than simply deleting the key and letting the cache regenerate it;
  the rationale is that the object is most likely to be viewed immediately after
- and caching it at `post_save` is cheap. However, specific cases may not be
- as accommodating. This is likely subject to debate or may need a global setting.
+ and caching it at `post_save` is cheap. However, this may not be desirable in
+ certain cases.

  To reduce the number of cache misses, additional "smart" logic can be added.
  …


- = Implementation Notes =
+ = Notes =

- * All caching code lives in a contrib app at first. A custom `QuerySet` class
+ == Code layout ==
+
+ * All caching code lives in a separate app at first. A custom `QuerySet` class
  derives from the official class, overriding where appropriate. A `Manager`
  class with an overridden `get_query_set()` is used for testing, and
- additional middleware, etc. are located in the same folder. Near or upon
- completion, the new code can be merged to trunk as Django proper. Hopefully
+ additional middleware, etc.
+ are located in the same folder. Perhaps
+ eventually, the new code can be merged to trunk as Django proper. Hopefully
  the code will not be too invasive, but quite a few `QuerySet` methods will
- have to be hijacked.
+ have to be hijacked. `QuerySet` refactoring would be an ideal merge time.

  * If the transaction middleware is enabled, it is desirable to have the cache
  …
  existing `CacheMiddleware`.

+ * I've been thinking quite a lot about the multitude of combinations of
+ methods I've got here... I'm going to implement the simplest things I
+ had in the original proposal first and branch out from there. I'll
+ likely post some sort of map of the combinations later once I get it
+ down on paper.
+
+ == Interface changes ==
+
+ * I'm considering just making "smart" behaviour standard, or at least default.
+
+ * Perhaps the default cache key prefix should be specifiable in settings?
+
+ * Should `cache_related()` lose the `depth` argument and merely steal it
+ from `select_related()` instead, if given?
+
+ * When `cache()` is used with `values()`, perhaps another option could be
+ added to allow retrieval of only the specified fields--however, this would
+ break any regular cached lookup for that object.
+
  = Timeline =

  …

  * Write preliminary tests. Initial implementation of `cache()` for single
- objects. Support almost all typical `QuerySet` methods.
+ objects. Support typical `QuerySet` methods.

- * Devise a generic idiom for testing cache-related code. Work on aggregates;
- implement `select_related()`, `values()`, `in_bulk()` cases, and
- `cache_generic()` method.
+ * Devise a generic idiom for testing cache-related code.
+
+ * Later in the month, work on `cache_related()`. Work on aggregates;
+ implement `select_related()`, `values()`, and `in_bulk()` cases.

  == Second Month ==
  …
  * Add transaction support. Design decision needed about extra middleware.

- * Implement extra features if possible (`distinct()`, `extra(select=...)`, ...)
+ * Implement extra features (`distinct()`, `extra(select=...)`, ...)
+ in conjunction with `cache_set()`.

  == Last Month ==

- * Write up documentation, extensive tests, and example code. Possibly move from
- contrib into the main cache module.
+ * Write up documentation, extensive tests, and example code.
+
+ * Edge cases, corner cases... there are going to be quite a few!

  * Refactor, especially if the new `QuerySet` has been released. Continue
  …

  * Allow for wiggle room, `QuerySet` refactoring work, cleanup, etc.
+
+ = class Meta: =
+
+ I'm definitely wide open for comments and criticisms! You can contact me at
+ [mailto:paul@paul-collier.com paul@paulcollier.com].