Changes between Version 2 and Version 3 of ObjectLevelCaching

05/28/2007 12:19:03 PM
Paul Collier <paul@…>

Some thoughts...


  • ObjectLevelCaching

    v2 v3  
    1 ''This is the original Django GSoC proposal. There have been quite a few
    2 revisions since, but I'm posting this first for reference.''
    41= Abstract =
    6 This addition to Django's ORM adds **simple drop-in caching**, compatible with
     3This addition to Django's ORM adds simple drop-in caching, compatible with
    74nearly all existing `QuerySet` methods. It emphasizes
    85performance and compatibility, and providing configuration options with sane
    1512= Proposed Design =
    17 The `QuerySet` class grows two new methods to add object caching:
     14The `QuerySet` class grows new methods to add object caching:
    19 == cache() ==
     16== .cache() ==
    2118cache(timeout=None, prefix='qscache:', smart=False)
     21    This method causes model instances found in the returned
     22    `QuerySet` to be cached individually; the cache key is
     23    calculated using the contrib.contenttypes model id and the
     24    instance's pk value. (This is all done lazily and the position
     25    of `cache()` does not matter, to be consistent with other methods.)
    2427    `timeout` defaults to the amount specified in `CACHE_BACKEND`.
    2528    `prefix` is in addition to `CACHE_MIDDLEWARE_KEY_PREFIX`.
    27     Cache keys are calculated with the content-type id and instance id, to
    28     accommodate generic relations.
    3030    Internally, `QuerySet` grows some new attributes that affect how SQL is
    31     generated. When in effect, they cause the query to only retrieve primary
     31    generated. Use of `cache()` causes the query to retrieve only primary
    3232    keys of selected objects. `in_bulk()` uses the cache directly, although
    3333    cache misses will still require database hits, as usual.  Methods such as
    4040    and creates the values dictionary from cache. If a list of fields is
    4141    specified in `values()`, `cache()` will still perform the equivalent of a
    42     `SELECT *`. Perhaps another option could be added to allow retrieval
    43     of only the specified fields, which would break any regular cached lookup
    44     for that object.
     42    `SELECT *`.
    4644    `select_related()` is supported by the caching mechanism. The appropriate
    4745    joins are still performed by the database; if joins were calculated with
    48     cached object foreign key values, cache misses could be very costly.
     46    cached foreign key values, cache misses could become very costly.
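
A minimal sketch of the lookup flow described for `cache()`: select primary keys only, read each object from the cache, and go back to the database for the misses. The key layout (`prefix` + content-type id + pk), the helper names `make_cache_key` and `fetch_objects`, and the dict standing in for the cache backend are all assumptions made for illustration; the proposal does not specify them.

```python
def make_cache_key(content_type_id, pk, prefix="qscache:"):
    # Hypothetical key layout: prefix, content-type id, then the instance pk.
    return "%s%s:%s" % (prefix, content_type_id, pk)


def fetch_objects(content_type_id, pks, cache, load_from_db):
    """Return objects for `pks` in order, filling cache misses from the database.

    `cache` is any dict-like store; `load_from_db` is a hypothetical callable
    that takes a list of pks and returns a {pk: object} mapping.
    """
    keys = dict((pk, make_cache_key(content_type_id, pk)) for pk in pks)
    found = dict((pk, cache[key]) for pk, key in keys.items() if key in cache)
    missing = [pk for pk in pks if pk not in found]
    if missing:
        # Cache misses still require a database hit, as the proposal notes.
        for pk, obj in load_from_db(missing).items():
            cache[keys[pk]] = obj
            found[pk] = obj
    return [found[pk] for pk in pks if pk in found]


# Usage: pk 1 is already cached, pk 2 has to come from the "database".
cache = {"qscache:7:1": "article one"}
database = {1: "article one", 2: "article two"}


def load_from_db(pks):
    return dict((pk, database[pk]) for pk in pks if pk in database)


articles = fetch_objects(7, [1, 2], cache, load_from_db)
```

After the call, the object for pk 2 has been written to the cache, so a second lookup for the same pks would not touch the database.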
    50 == cache_generic() ==
     48== .cache_related() ==
    52 cache_generic(field, timeout=None, prefix='qscache:', smart=False)
     50cache_related(fields, timeout=None, prefix='qscache:', smart=False)
     52    `fields` is a name or list of foreign keys, many-to-many/one-to-one fields,
     53    reverse-lookup fields, or generic foreign keys on the model. Model instances
     54    pointed to by the given relation will be cached similarly to `cache()`.
    55     `field` is the name of the generic foreign key field.
     56    I'm not sold on the signature of this method... *args would be nice
     57    but then the other defaulted arguments would be replaced by **kwargs.
     59    Also, the special string `'*'` could be accepted to cache all relations.
     60    Either that or another method `cache_all_relations()`.
     62=== Aside ===
    5763    Without database-specific trickery it is non-trivial to perform SQL JOINs
    5864    with generic relations. Currently, a database query is required for each
    6369    with `cache()`.
     71== .cache_set() ==
     73cache_set(cache_key, timeout=None, smart=False, depth=1)
     75    Similar to taking the resulting QuerySet and storing it directly in the
     76    cache. Overrides `cache()`, but does not cache relations.
     78    If `select_related()` is used in the same `QuerySet`, `cache_set()` will
     79    also cache the
     80    If `cache_related()` is used in the same `QuerySet`, it overrides use of
     81    `select_related()`.
     83== Sample usage ==
     86>>> article.comment_set.cache_related('author')
     87>>> my_city.restaurant_set.cache(smart=True)
     88>>> Article.objects.filter(created__gte=yesterday).cache_set('todaysarticles')
     89>>> tag = Tag.objects.cache_related('content_object').get(slug='news')
    6592== Background logic ==
     94The implementation class contains a registry of models that have been requested
     95to cache (directly or via a relation).
    6797To achieve as much transparency as possible, the `QuerySet` methods quietly
    6898establish `post_save` and `post_delete` signal listeners the first time a
    69 model is cached. Object deletion is trivial. On object creation or
     99model is cached. Object deletion is handled trivially. On object creation or
    70100modification, the preferred behavior is to create or update the cached key
    71101rather than simply deleting the key and letting the cache regenerate it;
    72102the rationale is that the object is most likely to be viewed immediately after
    73 and caching it at `post_save` is cheap. However, specific cases may not be
    74 as accommodating. This is likely subject to debate or may need a global setting.
     103and caching it at `post_save` is cheap. However, this may not be desirable in
     104certain cases.
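
The registry and signal behavior above could be sketched roughly as follows. This is an illustration under assumptions, not the proposal's implementation: `ObjectCacheRegistry` and its method names are hypothetical, a dict stands in for the cache backend, model instances are plain dicts with a `"pk"` entry, and in Django the two handlers would be connected to the `post_save` and `post_delete` signals instead of being called by hand.

```python
class ObjectCacheRegistry(object):
    """Tracks which models have been asked to cache and keeps their keys fresh."""

    def __init__(self, cache, prefix="qscache:"):
        self.cache = cache          # dict-like stand-in for the cache backend
        self.prefix = prefix
        self.registered = set()     # models requested to cache so far

    def key_for(self, model_name, pk):
        # Hypothetical key layout; the proposal uses the content-type id here.
        return "%s%s:%s" % (self.prefix, model_name, pk)

    def register(self, model_name):
        # Called the first time a model is cached, directly or via a relation.
        self.registered.add(model_name)

    def on_post_save(self, model_name, instance):
        # Preferred behavior: create or update the key instead of deleting it,
        # since the object is likely to be viewed right after being saved.
        if model_name in self.registered:
            self.cache[self.key_for(model_name, instance["pk"])] = instance

    def on_post_delete(self, model_name, instance):
        # Deletion only needs to drop the key if it exists.
        if model_name in self.registered:
            self.cache.pop(self.key_for(model_name, instance["pk"]), None)


registry = ObjectCacheRegistry(cache={})
registry.register("article")
registry.on_post_save("article", {"pk": 1, "title": "Hello"})
registry.on_post_save("comment", {"pk": 5, "text": "not registered"})
```

Only the registered model ends up in the cache; the comment is ignored because nothing has asked for it to be cached.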
    76106To reduce the number of cache misses, additional "smart" logic can be added.
    86 = Implementation Notes =
     116= Notes =
    88  * All caching code lives in a contrib app at first. A custom `QuerySet` class
     118== Code layout ==
     120 * All caching code lives in a separate app at first. A custom `QuerySet` class
    89121   derives from the official class, overriding where appropriate. A `Manager`
    90122   class with an overridden `get_query_set()` is used for testing, and
    91    additional middleware, etc. are located in the same folder. Near or upon
    92    completion, the new code can be merged to trunk as Django proper. Hopefully
     123   additional middleware, etc. are located in the same folder. Perhaps
     124   eventually, the new code can be merged to trunk as Django proper. Hopefully
    93125   the code will not be too invasive, but quite a few `QuerySet` methods will
    94    have to be hijacked.
     126   have to be hijacked. `QuerySet` refactoring would be an ideal merge time.
    96128 * If the transaction middleware is enabled, it is desirable to have the cache
    102134   existing `CacheMiddleware`.
     136 * I've been thinking quite a lot about the multitude of combinations of
     137   methods I've got here... I'm going to implement the simplest things I
     138   had in the original proposal first and branch out from there. I'll
     139   likely post some sort of map of the combinations later once I get it
     140   down on paper.
     142== Interface changes ==
     144 * I'm considering just making "smart" behaviour standard, or at least default.
     146 * Perhaps the default cache key prefix should be specifiable in settings?
     148 * Should `cache_related()` lose the `depth` argument and merely steal it
     149   from `select_related()` instead, if given?
     151 * When `cache()` is used with `values()`, perhaps another option could be
     152   added to allow retrieval of only the specified fields--however, this would
     153   break any regular cached lookup for that object.
    104155= Timeline =
    108159 * Write preliminary tests. Initial implementation of `cache()` for single
    109    objects. Support almost all typical `QuerySet` methods.
     160   objects. Support typical `QuerySet` methods.
    111  * Devise a generic idiom for testing cache-related code. Work on aggregates;
    112    implement `select_related()`, `values()`, `in_bulk()` cases, and
    113    `cache_generic()` method.
     162 * Devise a generic idiom for testing cache-related code.
     164 * Later in the month, work on `cache_related()`. Work on aggregates;
     165   implement `select_related()`, `values()`, and `in_bulk()` cases.
    115167== Second Month ==
    122174 * Add transaction support. Design decision needed about extra middleware.
    124  * Implement extra features if possible (`distinct()`, `extra(select=...)`, ...)
     176 * Implement extra features (`distinct()`, `extra(select=...)`, ...)
     177   in conjunction with `cache_set()`.
    126179== Last Month ==
    128  * Write up documentation, extensive tests, and example code. Possibly move from
    129    contrib into the main cache module.
     181 * Write up documentation, extensive tests, and example code.
     183 * Edge cases, corner cases... there are going to be quite a few!
    131185 * Refactor, especially if the new `QuerySet` has been released. Continue
    134188 * Allow for wiggle room, `QuerySet` refactoring work, cleanup, etc.
     190= class Meta: =
     192I'm definitely wide open for comments and criticisms! You can contact me at