Changes between Version 4 and Version 5 of DjangoSpecifications/Core/SingleInstance


Timestamp:
Mar 23, 2008, 11:16:44 PM (16 years ago)
Author:
Philippe Raoult
Comment:

--

  • DjangoSpecifications/Core/SingleInstance

    v4 v5  
    11
    2 = Single Instance =
     2= Singleton Instances =
    33
    4 This page describes the issues and proposals pertaining to the fact that django's ORM will create as many instances of the same DB object as there are request.
    5 
     4This page describes the issues and proposals pertaining to the fact that django's ORM will create as many instances of the same DB object as there are requests.
    65
    76== Issues ==
    8  * #5514 describes what I think is the main issue with the current behavior of the ORM. Having multiple instances can result in silent data losses. Even worse the actual result would depend on whether one used select_related or not:
     7
     8=== Inconsistency ===
     9#5514 describes what I think is the main issue with the current behavior of the ORM. Having multiple instances can result in silent data losses. Even worse the actual result would depend on whether one used select_related or not:
    910{{{
    10 i'll add this example later!
     11class Operand(models.Model):
     12    value = models.IntegerField()
     13
     14class Operation(models.Model):
     15    arg = models.ForeignKey(Operand)
     16
     17for operation in Operation.objects.all():
     18    operation.arg.value += 1
     19    operation.arg.save()
     20
     21# so far so good, but I want to make it tighter ! Let's use select_related to save a DB query per iteration
     22for operation in Operation.objects.all().select_related():
     23    operation.arg.value += 1
     24    operation.arg.save()
    1125}}}
    12  * The original goal of #17 was reducing memory use and query count by reusing the same instance instead of having a new one. Let's see a typical example of that:
     26Here we have a problem if two operations point to the same operand. The first version works because every access to operation.arg reloads the operand from the DB. The second preloads all operands, so the same DB operand ends up as several separate operand instances in memory, and each iteration modifies and saves its own stale copy instead of accumulating the changes. Here's a band-aid for your foot.
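The lost update described above can be reproduced without Django. The following is a minimal sketch in plain Python, assuming a dict stands in for the DB table; `FakeOperand`, `run_lazy` and `run_preloaded` are hypothetical names made up for this illustration and are not part of the ORM.

```python
class FakeOperand:
    """One in-memory copy of a DB row (hypothetical stand-in for Operand)."""

    def __init__(self, pk, value):
        self.pk = pk
        self.value = value

    def save(self, db):
        # Write this copy's value back to the fake table.
        db[self.pk] = self.value


def run_lazy(db, operand_pks):
    """Mimics the loop without select_related: one fresh load per iteration."""
    for pk in operand_pks:
        operand = FakeOperand(pk, db[pk])
        operand.value += 1
        operand.save(db)


def run_preloaded(db, operand_pks):
    """Mimics select_related: every copy is built from the same snapshot."""
    operands = [FakeOperand(pk, db[pk]) for pk in operand_pks]
    for operand in operands:
        operand.value += 1
        operand.save(db)


# Three operations that all point to operand 1.
db_lazy = {1: 0}
run_lazy(db_lazy, [1, 1, 1])

db_preloaded = {1: 0}
run_preloaded(db_preloaded, [1, 1, 1])

print(db_lazy[1], db_preloaded[1])  # prints "3 1"
```

The preloaded run ends at 1 because each of the three copies starts from the original value 0 and overwrites the previous save.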
     27
     28Models with self FKs will exhibit exactly the same issues:
     29{{{
     30
     31class Directory(models.Model):
     32   name = models.CharField()
     33   parent = models.ForeignKey('self')
     34
     35for dir in Directory.objects.all():
     36    if condition(dir):
     37        dir.modify()
     38        dir.parent.modify()
     39        dir.parent.save()
     40}}}
     41Now what if condition returns True for both a directory and its parent? If the parent comes first in the main query, it is modified and saved, then reloaded when its child comes up later. If the child comes first, the parent is loaded, modified and saved through the child; later on, the stale copy from the outer query is modified and saved, thus erasing the first change.
     42
     43=== Performance ===
     44The original goal of #17 was reducing memory use and query count by reusing the same instance instead of having a new one. Let's see a typical example of that:
    1345{{{
    1446class ArticleType(models.Model):
     
    2860}}}
    2961
    30 If you have a great number of Articles and a smaller number of ArticleTypes the performance/memory hit is staggering because:
    31  * you generate a DB query per Article to get the type
    32  * you have as many ArticleType instances in memory as there are articles
     62If you have a great number of articles and a smaller number of articletypes the performance/memory hit is staggering because:
     63 * you generate a DB query per article to get the type
     64 * you instantiate the type once per article instead of once per type
     65 * you have as many articletype instances in memory as there are articles
    3366
    3467== Proposals ==
    35 The basic idea of #17 is to simply reuse existing instances. This works by keeping track of instantiated objects on a per model basis. Whenever we try to instantiate a model, we check if there is already an instance in memory and reuse it if so. This would solve both issues mentioned above. Please note that the proposal is absolutely NOT a caching system. Whenever an instance goes out of scope it is still discarded. Also, it would most likely be turned off by default, and only work for Models which have this feature explicitly enabled.
     68=== Overview ===
     69The basic idea is to simply reuse existing instances. This works by keeping track of instantiated objects on a per model basis. Whenever we try to instantiate a model, we check if there is already an instance in memory and reuse it if so. This would solve both issues mentioned above.
     70
     71Please note that the proposal is absolutely NOT a caching system: whenever an instance goes out of scope it is discarded and will have to be reloaded from the DB next time.
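One way to picture "reuse, but do not cache" is a registry that only holds weak references. This is a sketch under assumptions, not the patch attached to #17: `InstanceRegistry` and `get_or_create` are hypothetical names, the `ArticleType` here is a plain class rather than a Django model, and the immediate cleanup shown at the end relies on CPython's reference counting.

```python
import gc
import weakref


class InstanceRegistry:
    """Tracks live instances per (model class, primary key)."""

    def __init__(self):
        # Weak values: an entry disappears once nobody else holds the instance.
        self._instances = weakref.WeakValueDictionary()

    def get_or_create(self, model_cls, pk, **field_values):
        key = (model_cls, pk)
        instance = self._instances.get(key)
        if instance is None:
            instance = model_cls(pk, **field_values)
            self._instances[key] = instance
        return instance

    def __len__(self):
        return len(self._instances)


class ArticleType:
    """Plain stand-in for a model; not a real Django model."""

    def __init__(self, pk, name=""):
        self.pk = pk
        self.name = name


registry = InstanceRegistry()
first = registry.get_or_create(ArticleType, 1, name="news")
second = registry.get_or_create(ArticleType, 1, name="news")
same_object = first is second  # True: the live instance is reused

# Once the last strong reference is gone, the registry forgets the instance,
# so the next lookup would have to go back to the DB.
del first, second
gc.collect()
remaining = len(registry)  # 0 on CPython
```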
     72
     73This feature would be turned off by default, and only operate for Models which have this feature explicitly enabled. Good candidates for this feature are Models like the articletype above, or models with self references.
    3674
    3775=== Threads ===
    3876No sharing would occur between threads, as is the case currently. Instances will be unique within each thread.
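Per-thread uniqueness can be sketched with `threading.local`, so that each thread sees its own registry. The names `get_registry`, `get_instance` and `Thing` are hypothetical, and a plain dict is used instead of weak references to keep the example short.

```python
import threading

_local = threading.local()


def get_registry():
    """Return the calling thread's own registry, creating it on first use."""
    registry = getattr(_local, "registry", None)
    if registry is None:
        registry = {}
        _local.registry = registry
    return registry


class Thing:
    """Plain stand-in for a model instance."""

    def __init__(self, pk):
        self.pk = pk


def get_instance(pk):
    # Reuse the instance already known to this thread, if any.
    registry = get_registry()
    if pk not in registry:
        registry[pk] = Thing(pk)
    return registry[pk]


def worker(results, name):
    results[name] = get_instance(1)


results = {}
threads = [threading.Thread(target=worker, args=(results, n)) for n in ("a", "b")]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Within one thread the instance is shared; across threads it is not.
main_first = get_instance(1)
main_second = get_instance(1)
```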
     77
     78=== API ===
     79The public API for this feature will be very simple. It will be a single optional parameter in the Meta class:
     80
     81{{{
     82class ArticleType(models.Model):
     83    name = models.CharField(max_length=200)
     84    categories = models.CharField(max_length=200)
     85
     86    class Meta:
     87        singleton_instances = True
     88}}}
     89The default value for the parameter is False, which means that the feature has to be explicitly enabled.
    3990
    4091== Implementation ==
     
    51102 * there is no doc as to how to enable/disable this feature and what to expect from it.
    52103 * the API for enabling/disabling the feature is very crude and consists only of two class methods. It should at the very least be possible to set the value as a member of the Meta class.
    53  * the Model dict should be set threadlocally
    54 
    55 
    56 
    57 
     104 * the Model dict should be set threadlocally.
     105 * it should be possible to force instantiation, because the serializers want that (solved by next item).
     106 * the internals of the patch should be changed to be less magic: overriding __call__ is a neat trick, but I'd rather have something like Model.get_singleton(pk, kwargs = None) and use it where needed (the places are few!). A user who writes Model(id = something, kwargs) should get a fresh instance; to get the DB one they would have used get or something like that.
     107 * for the sake of completeness, it should be possible to do Model.objects.get_fresh_instance(id=something) to bypass the singleton.
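The last two items could look roughly like the sketch below: an explicit `get_singleton` classmethod does the lookup, while the ordinary constructor always returns a fresh instance. This assumes a plain Python base class, here called `SingletonModel` (a hypothetical name), in place of Django's `Model`; only the `get_singleton(pk, kwargs = None)` signature comes from the item above.

```python
import weakref


class SingletonModel:
    """Hypothetical base class; stands in for Django's Model in this sketch."""

    def __init__(self, pk, **kwargs):
        self.pk = pk
        for field_name, value in kwargs.items():
            setattr(self, field_name, value)

    @classmethod
    def get_singleton(cls, pk, kwargs=None):
        # One registry per concrete class, created on first use.
        registry = cls.__dict__.get("_registry")
        if registry is None:
            registry = weakref.WeakValueDictionary()
            cls._registry = registry
        instance = registry.get(pk)
        if instance is None:
            instance = cls(pk, **(kwargs or {}))
            registry[pk] = instance
        return instance


class ArticleType(SingletonModel):
    pass


shared_a = ArticleType.get_singleton(1, {"name": "news"})
shared_b = ArticleType.get_singleton(1)

# Calling the constructor directly bypasses the registry, which is what
# the serializers need.
fresh = ArticleType(1, name="news")
```

Keeping the lookup in a named method means nothing happens behind the user's back when they call the constructor.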