Changes between Version 1 and Version 2 of ObjectLevelCaching


Timestamp: 05/27/07 10:44:42 (7 years ago)
Author: Paul Collier
Comment: Seems my copy-to-clipboard script is broken...

''This is the original Django GSoC proposal. There have been quite a few
revisions since, but I'm posting this first for reference.''

= Abstract =

This addition to Django's ORM adds **simple drop-in caching**, compatible with
nearly all existing `QuerySet` methods. It emphasizes performance and
compatibility, and provides configuration options with sane defaults. All that
is required for basic functionality is a suitable `CACHE_BACKEND` setting and
the addition of `.cache()` to the appropriate `QuerySet` chains. It also speeds
up the lookup of related objects, and even that of
[http://www.djangoproject.com/documentation/models/generic_relations generic relations].


= Proposed Design =

The `QuerySet` class grows two new methods to add object caching:

== cache() ==
{{{
cache(timeout=None, prefix='qscache:', smart=False)
}}}

    `timeout` defaults to the amount specified in `CACHE_BACKEND`.
    `prefix` is in addition to `CACHE_MIDDLEWARE_KEY_PREFIX`.

    Cache keys are calculated with the content-type id and instance id, to
    accommodate generic relations.

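As a rough illustration of the key scheme described above, the sketch below joins the prefix, the content-type id and the instance id into one string. The function name `make_cache_key`, the `site_prefix` parameter (standing in for `CACHE_MIDDLEWARE_KEY_PREFIX`) and the exact key layout are assumptions made for illustration; the proposal does not fix a format.

```python
def make_cache_key(content_type_id, instance_id, prefix="qscache:", site_prefix=""):
    """Return a cache key that is unique per (content type, instance) pair.

    Hypothetical layout: <site_prefix><prefix><content_type_id>:<instance_id>
    """
    return "%s%s%s:%s" % (site_prefix, prefix, content_type_id, instance_id)


# Two models sharing the same primary key still get distinct keys,
# which is what lets a generic relation look up its target directly.
article_key = make_cache_key(content_type_id=12, instance_id=345)
comment_key = make_cache_key(content_type_id=13, instance_id=345)
```

Including the content-type id means a generic foreign key, which stores exactly that pair, can compute the key of its target without knowing the model class in advance.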
    Internally, `QuerySet` grows some new attributes that affect how SQL is
    generated. When in effect, they cause the query to retrieve only the primary
    keys of the selected objects. `in_bulk()` uses the cache directly, although
    cache misses will still require database hits, as usual. Methods such as
    `delete()` and `count()` are largely unaffected by `cache()`, but
    methods such as `distinct()` are a more difficult case and will require
    some design decisions. Using `extra(select=...)` may also be an
    unsolvable case.

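The lookup flow described above (fetch primary keys, consult the cache, go to the database only for misses) might look roughly like the following. This is a plain-Python sketch: a dict stands in for the cache backend, and `fetch_objects`, `load_from_db` and `key_for` are hypothetical names, not part of Django.

```python
def fetch_objects(pks, cache, load_from_db, key_for):
    """Return objects for ``pks`` in order, reading the cache first.

    Misses are collected and loaded with a single ``load_from_db`` call,
    then written back to the cache for later lookups.
    """
    keys = {pk: key_for(pk) for pk in pks}
    found = {pk: cache[key] for pk, key in keys.items() if key in cache}
    missing = [pk for pk in pks if pk not in found]
    if missing:
        for pk, obj in load_from_db(missing).items():
            cache[keys[pk]] = obj
            found[pk] = obj
    return [found[pk] for pk in pks if pk in found]


# Toy "database" and a record of which primary keys were actually queried.
database = {1: "alice", 2: "bob", 3: "carol"}
db_calls = []

def load_from_db(pks):
    db_calls.append(list(pks))
    return {pk: database[pk] for pk in pks if pk in database}

cache = {"qscache:7:1": "alice"}  # object 1 is already cached
result = fetch_objects([1, 2, 3], cache, load_from_db,
                       lambda pk: "qscache:7:%s" % pk)
```

Here only primary keys 2 and 3 reach the database, in one batched call, and both are cached afterwards.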
    If `values()` has been used in the query, `cache()` takes precedence
    and creates the values dictionary from cache. If a list of fields is
    specified in `values()`, `cache()` will still perform the equivalent of a
    `SELECT *`. Perhaps another option could be added to allow retrieval
    of only the specified fields, which would break any regular cached lookup
    for that object.

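To make the `values()` behaviour concrete, here is a small sketch of building the values dictionary from a fully cached object. `CachedArticle` and `values_from_cached` are hypothetical stand-ins; the point is only that the whole object is cached and the field list is applied afterwards.

```python
class CachedArticle:
    """Stand-in for a model instance that was cached in full (hypothetical)."""

    def __init__(self, pk, title, body):
        self.pk = pk
        self.title = title
        self.body = body


def values_from_cached(obj, all_fields, fields=None):
    """Build a values()-style dict from a cached object.

    With no ``fields`` given, every field is returned; otherwise only the
    requested ones, even though the cached object holds all of them.
    """
    names = fields if fields else all_fields
    return {name: getattr(obj, name) for name in names}


article = CachedArticle(1, "Hello", "Long text")
all_values = values_from_cached(article, ["pk", "title", "body"])
some_values = values_from_cached(article, ["pk", "title", "body"], fields=["title"])
```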
    `select_related()` is supported by the caching mechanism. The appropriate
    joins are still performed by the database; if joins were calculated with
    cached object foreign key values, cache misses could be very costly.

== cache_generic() ==
{{{
cache_generic(field, timeout=None, prefix='qscache:', smart=False)
}}}

    `field` is the name of the generic foreign key field.

    Without database-specific trickery it is non-trivial to perform SQL JOINs
    with generic relations. Currently, a database query is required for each
    generic foreign key relationship. The cache framework, while unable to
    reduce the initial number of database hits, greatly alleviates load when
    lists of generic objects are required. Using this method still loads
    generic foreign keys lazily, but more quickly, and also uses objects cached
    with `cache()`.

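One way to picture the lazy, cache-backed resolution of a generic foreign key is the sketch below. It is not Django code: `LazyGenericRef`, the `loaders` mapping and the key layout are assumptions, and a dict stands in for the cache backend.

```python
class LazyGenericRef:
    """Resolves a (content_type_id, object_id) pair on first access only."""

    def __init__(self, content_type_id, object_id, cache, loaders):
        self.content_type_id = content_type_id
        self.object_id = object_id
        self.cache = cache      # dict standing in for the cache backend
        self.loaders = loaders  # content_type_id -> function(object_id) -> object
        self._resolved = False
        self._target = None

    def get(self):
        if not self._resolved:
            key = "qscache:%s:%s" % (self.content_type_id, self.object_id)
            if key in self.cache:
                self._target = self.cache[key]
            else:
                # Cache miss: one database query, then remember the result.
                self._target = self.loaders[self.content_type_id](self.object_id)
                self.cache[key] = self._target
            self._resolved = True
        return self._target


db_queries = []

def load_photo(object_id):
    db_queries.append(object_id)
    return {"id": object_id, "kind": "photo"}

cache = {}
loaders = {5: load_photo}
# Three tagged items that all point at the same photo.
tags = [LazyGenericRef(5, 10, cache, loaders) for _ in range(3)]
targets = [tag.get() for tag in tags]
```

The first access still costs a query, matching the remark that the initial number of database hits is not reduced; the other two references are served from the cache.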
== Background logic ==

To achieve as much transparency as possible, the `QuerySet` methods quietly
establish `post_save` and `post_delete` signal listeners the first time a
model is cached. Object deletion is trivial. On object creation or
modification, the preferred behavior is to create or update the cached key
rather than simply deleting the key and letting the cache regenerate it;
the rationale is that the object is most likely to be viewed immediately after
and caching it at `post_save` is cheap. However, specific cases may not be
as accommodating. This is likely subject to debate or may need a global setting.

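The listener registration described above can be sketched without Django using a toy signal class. Everything here is illustrative: `Signal` only mimics the connect/send idea of Django's dispatcher, the model class name stands in for the content-type id, and `register_model` is a hypothetical helper.

```python
class Signal:
    """Tiny stand-in for a signal dispatcher (not Django's implementation)."""

    def __init__(self):
        self.receivers = []  # list of (receiver, sender) pairs

    def connect(self, receiver, sender):
        if (receiver, sender) not in self.receivers:
            self.receivers.append((receiver, sender))

    def send(self, sender, instance):
        for receiver, wanted in self.receivers:
            if wanted is sender:
                receiver(sender, instance)


post_save = Signal()
post_delete = Signal()
cache = {}
registered_models = set()

def cache_key(instance):
    # Assumption: the class name replaces the content-type id in this sketch.
    return "qscache:%s:%s" % (type(instance).__name__, instance.pk)

def update_cache(sender, instance):
    cache[cache_key(instance)] = instance  # create or overwrite, never just drop

def evict_from_cache(sender, instance):
    cache.pop(cache_key(instance), None)

def register_model(model):
    """Connect the listeners once per model; return True on first registration."""
    if model in registered_models:
        return False
    registered_models.add(model)
    post_save.connect(update_cache, sender=model)
    post_delete.connect(evict_from_cache, sender=model)
    return True


class Article:
    def __init__(self, pk, title):
        self.pk = pk
        self.title = title

first_time = register_model(Article)
second_time = register_model(Article)
article = Article(1, "draft")
post_save.send(Article, article)
cached_after_save = cache.get("qscache:Article:1")
post_delete.send(Article, article)
```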
To reduce the number of cache misses, additional "smart" logic can be added.
For example, the first time a model is registered to the cache signal listener,
its model instances are expected to be uncached. In this case, rather than
fetching only primary keys, the objects are retrieved as normal (and cached).

By storing the expiration time, this can also take effect whenever the
cached objects have likely timed out. All "smart" functionality is enabled
using the `smart` keyword argument.


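A minimal sketch of the "smart" decision, assuming one expiry timestamp is tracked per model: fetch full rows when the model has never been cached or its entries have probably expired, otherwise fetch primary keys only. `SmartTracker` and its method names are hypothetical, and the injectable `clock` exists only to keep the example deterministic.

```python
import time


class SmartTracker:
    """Remembers when each model's cached objects are expected to expire."""

    def __init__(self, timeout, clock=time.time):
        self.timeout = timeout
        self.clock = clock
        self.expires_at = {}  # model name -> expected expiry timestamp

    def should_fetch_full_rows(self, model):
        # Never cached, or cached long enough ago that entries likely timed out.
        expiry = self.expires_at.get(model)
        return expiry is None or self.clock() >= expiry

    def mark_cached(self, model):
        self.expires_at[model] = self.clock() + self.timeout


now = [1000.0]  # fake clock so the example does not depend on real time
tracker = SmartTracker(timeout=300, clock=lambda: now[0])

before_caching = tracker.should_fetch_full_rows("Article")
tracker.mark_cached("Article")
while_fresh = tracker.should_fetch_full_rows("Article")
now[0] += 301
after_timeout = tracker.should_fetch_full_rows("Article")
```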
= Implementation Notes =

 * All caching code lives in a contrib app at first. A custom `QuerySet` class
   derives from the official class, overriding where appropriate. A `Manager`
   class with an overridden `get_query_set()` is used for testing, and
   additional middleware, etc. are located in the same folder. Near or upon
   completion, the new code can be merged to trunk as Django proper. Hopefully
   the code will not be too invasive, but quite a few `QuerySet` methods will
   have to be hijacked.

 * If the transaction middleware is enabled, it is desirable to have the cache
   only update when the transaction succeeds. This is simple in implementation
   but will couple the transaction middleware to the cache if not designed
   properly. An additional middleware class can be created to handle this
   case; however, it will have to stipulate placement immediately after the
   `TransactionMiddleware` in settings.py, and might be confused with the
   existing `CacheMiddleware`.

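The transaction point in the notes above can be illustrated by buffering cache writes until commit. This is only a sketch of the idea, with a dict as the backend; `DeferredCache` and its methods are hypothetical and say nothing about how the middleware would actually be wired up.

```python
class DeferredCache:
    """Queues cache writes and applies them only when the transaction commits."""

    def __init__(self, backend):
        self.backend = backend  # dict standing in for the real cache
        self.pending = []       # queued (operation, key, value) tuples

    def set(self, key, value):
        self.pending.append(("set", key, value))

    def delete(self, key):
        self.pending.append(("delete", key, None))

    def commit(self):
        # Replay the queued operations in order, then clear the queue.
        for operation, key, value in self.pending:
            if operation == "set":
                self.backend[key] = value
            else:
                self.backend.pop(key, None)
        self.pending = []

    def rollback(self):
        # The transaction failed: the cache must not see any of the writes.
        self.pending = []


backend = {"qscache:7:1": "old"}
deferred = DeferredCache(backend)

deferred.set("qscache:7:1", "new")
deferred.rollback()
after_rollback = dict(backend)

deferred.set("qscache:7:1", "new")
deferred.delete("qscache:7:2")
deferred.commit()
after_commit = dict(backend)
```

A middleware placed right after `TransactionMiddleware` would call something like `commit()` or `rollback()` depending on the outcome of the request, which is where the coupling concern arises.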
= Timeline =

== First Month ==

 * Write preliminary tests. Initial implementation of `cache()` for single
   objects. Support almost all typical `QuerySet` methods.

 * Devise a generic idiom for testing cache-related code. Work on aggregates;
   implement `select_related()`, `values()`, `in_bulk()` cases, and
   `cache_generic()` method.

== Second Month ==

 * Work on signal dispatching, cache coherency. Write more tests and preliminary
   documentation.

 * Write "smart" cache logic. Explore other possible optimizations.

 * Add transaction support. Design decision needed about extra middleware.

 * Implement extra features if possible (`distinct()`, `extra(select=...)`, ...)

== Last Month ==

 * Write up documentation, extensive tests, and example code. Possibly move from
   contrib into the main cache module.

 * Refactor, especially if the new `QuerySet` has been released. Continue
   merging with changes to trunk and testing.

 * Allow for wiggle room, `QuerySet` refactoring work, cleanup, etc.