Changes between Version 1 and Version 2 of ObjectLevelCaching


Timestamp:
05/27/2007 12:44:42 PM
Author:
Paul Collier
Comment:

Seems my copy-to-clipboard script is broken...

  • ObjectLevelCaching

    v1 v2  
    11''This is the original Django GSoC proposal. There have been quite a few
    2 
    32revisions since, but I'm posting this first for reference.''
    4 
    5 
    63
    74= Abstract =
    85
    9 
    10 
    116This addition to Django's ORM adds **simple drop-in caching**, compatible with
    12 
    137nearly all existing `QuerySet` methods. It emphasizes
    14 
    158performance and compatibility, and provides configuration options with sane
    16 
    179defaults. All that is required for basic functionality is a suitable
    18 
    1910`CACHE_BACKEND` setting and the addition of `.cache()` to the appropriate
    20 
    2111`QuerySet` chains. It also speeds up the lookup of related objects, and even
    22 
    2312that of [http://www.djangoproject.com/documentation/models/generic_relations generic relations].
    24 
    25 
    26 
    2713
    2814
    2915= Proposed Design =
    3016
    31 
    32 
    3317The `QuerySet` class grows two new methods to add object caching:
    3418
    35 
    36 
     19== cache() ==
    3720{{{
    38 
    39     cache(timeout=None, prefix='qscache:', smart=False)
    40 
     21cache(timeout=None, prefix='qscache:', smart=False)
    4122}}}
    4223
    4324    `timeout` defaults to the amount specified in `CACHE_BACKEND`.
    44 
    4525    `prefix` is in addition to `CACHE_MIDDLEWARE_KEY_PREFIX`.
    4626
    47 
    48 
    4927    Cache keys are calculated with the content-type id and instance id, to
    50 
    5128    accommodate generic relations.
    5229
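As an illustration of the key scheme described above, here is a minimal sketch. The function name `make_cache_key`, the `site_prefix` argument (standing in for `CACHE_MIDDLEWARE_KEY_PREFIX`), and the exact key layout are assumptions made for this example; the proposal only states that keys combine the content-type id and the instance id.

```python
def make_cache_key(content_type_id, object_id, prefix="qscache:", site_prefix=""):
    """Build a per-object cache key (hypothetical layout, not from the proposal).

    site_prefix stands in for CACHE_MIDDLEWARE_KEY_PREFIX; prefix is the
    per-call prefix accepted by cache().
    """
    # Including the content-type id keeps objects from different models apart
    # even when they share a primary key, which is what generic relations need.
    return "%s%s%d:%s" % (site_prefix, prefix, content_type_id, object_id)
```

For example, `make_cache_key(7, 42)` gives `qscache:7:42`, and two models with the same primary key never collide because their content-type ids differ.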
    53 
    54 
    5530    Internally, `QuerySet` grows some new attributes that affect how SQL is
    56 
    5731    generated. When in effect, they cause the query to only retrieve primary
    58 
    5932    keys of selected objects. `in_bulk()` uses the cache directly, although
    60 
    6133    cache misses will still require database hits, as usual.  Methods such as
    62 
    6334    `delete()` and `count()` are largely unaffected by `cache()`, but
    64 
    6535    methods such as `distinct()` are a more difficult case and will require
    66 
    6736    some design decisions. Using `extra(select=...)` is also a possibly
    68 
    6937    unsolvable case.
    7038
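The lookup flow described above (fetch primary keys only, read the objects from cache, fall back to the database on misses) might look roughly like the following sketch. `DictCache`, `fetch_objects`, `load_from_db` and `make_key` are hypothetical names for this example; `DictCache` is an in-memory stand-in that mirrors only the `get_many()` and `set()` calls of Django's cache API.

```python
class DictCache:
    """In-memory stand-in for a cache backend (assumption for this sketch)."""

    def __init__(self):
        self._data = {}

    def get_many(self, keys):
        # Return only the keys that are present, as a dict.
        return {k: self._data[k] for k in keys if k in self._data}

    def set(self, key, value, timeout=None):
        # The timeout is accepted but ignored by this stand-in.
        self._data[key] = value


def fetch_objects(pks, cache, load_from_db, make_key, timeout=None):
    """Resolve primary keys to objects, hitting the database only on misses.

    load_from_db is a hypothetical callable taking a list of primary keys
    and returning a {pk: object} dict.
    """
    keys = [make_key(pk) for pk in pks]
    found = cache.get_many(keys)
    missing = [pk for pk, key in zip(pks, keys) if key not in found]
    if missing:
        # One database query covers every cache miss, then the cache is filled.
        for pk, obj in load_from_db(missing).items():
            key = make_key(pk)
            cache.set(key, obj, timeout)
            found[key] = obj
    # Preserve the order of the primary keys returned by the query.
    return [found[key] for key in keys if key in found]
```

A second call with the same primary keys is served entirely from the cache in this sketch.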
    71 
    72 
    7339    If `values()` has been used in the query, `cache()` takes precedence
    74 
    7540    and creates the values dictionary from cache. If a list of fields is
    76 
    7741    specified in `values()`, `cache()` will still perform the equivalent of a
    78 
    7942    `SELECT *`. Perhaps another option could be added to allow retrieval
    80 
    8143    of only the specified fields, which would break any regular cached lookup
    82 
    8344    for that object.
    8445
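A rough sketch of building a `values()`-style dictionary from an already cached object follows. The helper `values_from_cached` and its `all_fields` argument are assumptions for illustration; the full object is assumed to be in the cache, so restricting the field list only narrows the resulting dictionary.

```python
def values_from_cached(obj, all_fields, fields=None):
    """Build a values()-style dict from a cached object (hypothetical helper).

    all_fields lists the model's field names; fields is the optional subset
    that was passed to values().
    """
    wanted = fields if fields else all_fields
    return {name: getattr(obj, name) for name in wanted}
```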
    85 
    86 
    8746    `select_related()` is supported by the caching mechanism. The appropriate
    88 
    8947    joins are still performed by the database; if joins were calculated with
    90 
    9148    cached object foreign key values, cache misses could be very costly.
    9249
    93 
    94 
     50== cache_generic() ==
    9551{{{
    96 
    97     cache_generic(field, timeout=None, prefix='qscache:', smart=False)
    98 
     52cache_generic(field, timeout=None, prefix='qscache:', smart=False)
    9953}}}
    100 
    101 
    10254
    10355    `field` is the name of the generic foreign key field.
    10456
    105 
    106 
    10757    Without database-specific trickery it is non-trivial to perform SQL JOINs
    108 
    10958    with generic relations. Currently, a database query is required for each
    110 
    11159    generic foreign key relationship. The cache framework, while unable to
    112 
    11360    reduce the initial number of database hits, greatly alleviates load when
    114 
    11561    lists of generic objects are required. Using this method still loads
    116 
    11762    generic foreign keys lazily, but more quickly, and also uses objects cached
    118 
    11963    with `cache()`.
    12064
    121 
     65== Background logic ==
    12266
    12367To achieve as much transparency as possible, the `QuerySet` methods quietly
    124 
    12568establish `post_save` and `post_delete` signal listeners the first time a
    126 
    12769model is cached. Object deletion is trivial. On object creation or
    128 
    129 modification, the preferred behaviour is to create or update the cached key
    130 
     70modification, the preferred behavior is to create or update the cached key
    13171rather than simply deleting the key and letting the cache regenerate it;
    132 
    13372the rationale is that the object is most likely to be viewed immediately after
    134 
    13573and caching it at `post_save` is cheap. However, specific cases may not be
    136 
    137 as accomodating. This is likely subject to debate or may need a global setting.
    138 
    139 
     74as accommodating. This is likely subject to debate or may need a global setting.
    14075
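The listener behaviour described above could be sketched as follows. This is a minimal illustration that does not import Django: `Signal`, `CacheSignalListener`, and the key layout are hypothetical stand-ins, and a plain dict plays the role of the cache.

```python
class Signal:
    """Tiny stand-in for a signal dispatcher (assumption for this sketch)."""

    def __init__(self):
        self._receivers = []

    def connect(self, receiver):
        if receiver not in self._receivers:
            self._receivers.append(receiver)

    def send(self, instance):
        for receiver in self._receivers:
            receiver(instance)


post_save = Signal()
post_delete = Signal()


class CacheSignalListener:
    """Keeps cached objects in step with saves and deletes."""

    def __init__(self, cache):
        self.cache = cache  # a plain dict in this sketch
        self.models = set()
        self._connected = False

    def register(self, model):
        # Listeners are attached quietly the first time any model is cached.
        if not self._connected:
            post_save.connect(self.on_save)
            post_delete.connect(self.on_delete)
            self._connected = True
        self.models.add(model)

    def key(self, instance):
        return "qscache:%s:%s" % (type(instance).__name__, instance.pk)

    def on_save(self, instance):
        # Create or update the key instead of deleting it.
        if type(instance) in self.models:
            self.cache[self.key(instance)] = instance

    def on_delete(self, instance):
        if type(instance) in self.models:
            self.cache.pop(self.key(instance), None)
```

Updating the key in `on_save` rather than deleting it follows the preference stated above; a deployment that prefers invalidation would replace the assignment with a `pop`.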
    14176To reduce the number of cache misses, additional "smart" logic can be added.
    142 
    14377For example, the first time a model is registered to the cache signal listener,
    144 
    14578its model instances are expected to be uncached. In this case, rather than
    146 
    14779fetching only primary keys, the objects are retrieved as normal (and cached).
    14880
    14981By storing the expiration time, this can also take effect whenever the
    150 
    15182cached objects have likely timed out. All "smart" functionality is enabled
    152 
    15383using the `smart` keyword argument.
    154 
    155 
    156 
    15784
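One way to picture the "smart" decision is a small policy object that remembers when each model's cached objects are expected to expire. The class `SmartFetchPolicy` and its method names are assumptions for this sketch, not part of the proposal.

```python
import time


class SmartFetchPolicy:
    """Decides between a primary-key-only query and a full fetch (sketch)."""

    def __init__(self, clock=time.time):
        self._expires_at = {}  # model name -> expected expiry timestamp
        self._clock = clock

    def should_fetch_full_rows(self, model_name):
        # Never cached, or the cached objects have likely timed out:
        # fetching full rows avoids a round of guaranteed cache misses.
        expires = self._expires_at.get(model_name)
        return expires is None or self._clock() >= expires

    def mark_cached(self, model_name, timeout):
        # Store the expiration time after objects were written to the cache.
        self._expires_at[model_name] = self._clock() + timeout
```

The injectable `clock` argument exists only to make the policy easy to test without waiting for real timeouts.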
    15885
    15986= Implementation Notes =
    16087
     88 * All caching code lives in a contrib app at first. A custom `QuerySet` class
     89   derives from the official class, overriding where appropriate. A `Manager`
     90   class with an overridden `get_query_set()` is used for testing, and
     91   additional middleware, etc. are located in the same folder. Near or upon
     92   completion, the new code can be merged to trunk as Django proper. Hopefully
     93   the code will not be too invasive, but quite a few `QuerySet` methods will
     94   have to be hijacked.
    16195
    162 
    163 * All caching code lives in a contrib app at first. A custom `QuerySet` class
    164 
    165   derives from the official class, overriding where appropriate. A `Manager`
    166 
    167   class with an overriden `get_query_set()` is used for testing, and
    168 
    169   additional middleware, etc. are located in the same folder. Near or upon
    170 
    171   completion, the new code can be merged to trunk as Django proper. Hopefully
    172 
    173   the code will not be too invasive, but quite a few `QuerySet` methods will
    174 
    175   have to be hijacked.
    176 
    177 
    178 
    179 * If the transaction middleware is enabled, it is desirable to have the cache
    180 
    181   only update when the transaction succeeds. This is simple in implementation
    182 
    183   but will couple the transaction middleware to the cache if not designed
    184 
    185   properly. An additional middleware class can be created to handle this
    186 
    187   case; however, it will have to stipulate placement immediately after the
    188 
    189   `TransactionMiddleware` in settings.py, and might be confused with the
    190 
    191   existing `CacheMiddleware`.
    192 
    193 
    194 
    195 
     96 * If the transaction middleware is enabled, it is desirable to have the cache
     97   only update when the transaction succeeds. This is simple in implementation
     98   but will couple the transaction middleware to the cache if not designed
     99   properly. An additional middleware class can be created to handle this
     100   case; however, it will have to stipulate placement immediately after the
     101   `TransactionMiddleware` in settings.py, and might be confused with the
     102   existing `CacheMiddleware`.
    196103
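The transaction concern in the second note can be sketched as a buffer that holds cache writes until the transaction's outcome is known. `DeferredCacheWrites` is a hypothetical name, and a plain dict stands in for the cache; how such a buffer would be wired to `TransactionMiddleware` is left open here, as it is in the note.

```python
class DeferredCacheWrites:
    """Buffers cache updates until the surrounding transaction commits."""

    def __init__(self, cache):
        self.cache = cache  # a plain dict in this sketch
        self._pending = []

    def set(self, key, value):
        self._pending.append(("set", key, value))

    def delete(self, key):
        self._pending.append(("delete", key, None))

    def commit(self):
        # Apply buffered operations in order once the transaction succeeds.
        for op, key, value in self._pending:
            if op == "set":
                self.cache[key] = value
            else:
                self.cache.pop(key, None)
        self._pending = []

    def rollback(self):
        # The transaction failed, so the cache must stay as it was.
        self._pending = []
```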
    197104= Timeline =
    198105
    199 
    200 
    201106== First Month ==
    202107
     108 * Write preliminary tests. Initial implementation of `cache()` for single
     109   objects. Support almost all typical `QuerySet` methods.
    203110
    204 
    205 * Write preliminary tests. Initial implementation of `cache()` for single
    206 
    207   objects. Support almost all typical `QuerySet` methods.
    208 
    209 
    210 
    211 * Devise a generic idiom for testing cache-related code. Work on agregates;
    212 
    213   implement `select_related()`, `values()`, `in_bulk()` cases, and
    214 
    215   `cache_generic()` method.
    216 
    217 
     111 * Devise a generic idiom for testing cache-related code. Work on aggregates;
     112   implement `select_related()`, `values()`, `in_bulk()` cases, and
     113   `cache_generic()` method.
    218114
    219115== Second Month ==
    220116
     117 * Work on signal dispatching, cache coherency. Write more tests and preliminary
     118   documentation.
    221119
     120 * Write "smart" cache logic. Explore other possible optimizations.
     121 
     122 * Add transaction support. Design decision needed about extra middleware.
    222123
    223 * Work on signal dispatching, cache coherency. Write more tests and preliminary
    224 
    225   documentation.
    226 
    227 
    228 
    229 * Write "smart" cache logic. Explore other possible optimizations.
    230 
    231 
    232 
    233 * Add transaction support. Design decision needed about extra middleware.
    234 
    235 
    236 
    237 * Implement extra features if possible (`distinct()`, `extra(select=...)`, ...)
    238 
    239 
     124 * Implement extra features if possible (`distinct()`, `extra(select=...)`, ...)
    240125
    241126== Last Month ==
    242127
     128 * Write up documentation, extensive tests, and example code. Possibly move from
     129   contrib into the main cache module.
    243130
     131 * Refactor, especially if the new `QuerySet` has been released. Continue
     132   merging with changes to trunk and testing.
    244133
    245 * Write up documentation, extensive tests, and example code. Possibly move from
    246 
    247   contrib into the main cache module.
    248 
    249 
    250 
    251 * Refactor, especially if the new `QuerySet` has been released. Continue
    252 
    253   merging with changes to trunk and testing.
    254 
    255 
    256 
    257 * Allow for wiggle room, `QuerySet` refactoring work, cleanup, etc.
    258 
    259 
    260 
     134 * Allow for wiggle room, `QuerySet` refactoring work, cleanup, etc.