Version 2 (modified by PhiR, 8 years ago)
This page describes the issues and proposals arising from the fact that Django's ORM creates as many instances of the same DB object as there are requests for it.
- #5514 describes what I think is the main issue with the current behavior of the ORM. Having multiple instances can result in silent data loss. Even worse, the actual result depends on whether select_related was used or not.
- The original goal of #17 was reducing memory use by reusing the same instance instead of having a new one.
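As a rough illustration of how two instances of the same row can silently lose data, here is a minimal sketch in plain Python. The `TABLE` dict and the `Article` class are hypothetical stand-ins, not Django code, and this is not necessarily the exact scenario reported in #5514; it only assumes that `save()` writes every field back.

```python
# Hypothetical one-row "table", keyed by primary key.
TABLE = {1: {"title": "Draft", "views": 0}}


class Article:
    """Stand-in for a model instance; each one holds its own copy of the row."""

    def __init__(self, pk):
        row = TABLE[pk]
        self.pk = pk
        self.title = row["title"]
        self.views = row["views"]

    def save(self):
        # Writes every field back, including ones this copy never changed.
        TABLE[self.pk] = {"title": self.title, "views": self.views}


first = Article(1)
second = Article(1)      # a second, independent copy of the same row

first.title = "Final"
first.save()

second.views = 10
second.save()            # also writes back the stale title "Draft"

final_row = TABLE[1]
```

After both saves the title change made through `first` is gone, and nothing signalled the conflict. If both names referred to one instance, the second save would have carried the new title as well.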
The basic idea of #17 is simply to reuse existing instances. This works by keeping track of instantiated objects on a per-model basis. Whenever we try to instantiate a model, we check whether there is already an instance in memory and reuse it if so. This would solve both issues mentioned above. Please note that the proposal is absolutely NOT a caching system. Whenever an instance goes out of scope it is still discarded.
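The reuse-but-do-not-cache behaviour can be sketched with a weak-value mapping. This is a simplified illustration in plain Python, not the patch itself: the `Record` class and its `instantiate` classmethod are hypothetical names chosen for the example.

```python
import gc
import weakref


class Record:
    """Stand-in for a model class; not Django's actual Model."""

    # pk -> live instance. An entry disappears once nothing else
    # references the instance, so this is not a cache.
    _instances = weakref.WeakValueDictionary()

    def __init__(self, pk, name):
        self.pk = pk
        self.name = name

    @classmethod
    def instantiate(cls, pk, name):
        """Return the live instance for pk if there is one, else build one."""
        existing = cls._instances.get(pk)
        if existing is not None:
            return existing
        obj = cls(pk, name)
        cls._instances[pk] = obj
        return obj


a = Record.instantiate(1, "news")
b = Record.instantiate(1, "news")
same_object = a is b            # both names refer to one instance

del a, b
gc.collect()                    # force collection on non-refcounting runtimes
still_tracked = 1 in Record._instances
```

While `a` and `b` are alive they are the same object; once both go out of scope the entry is dropped and the next instantiation builds a fresh instance.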
To understand the benefits of the proposed patch, let's see a small example:
    class ArticleType(models.Model):
        name = models.CharField(maxlength=200)
        categories = models.CharField(maxlength=200)

    class Article(models.Model):
        title = models.CharField(maxlength=200)
        type_of_article = models.ForeignKey(ArticleType)

    for article in Article.objects.all():
        print "%s (%s)" % (article.title, article.type_of_article.name)
If you have a great number of Articles and a smaller number of ArticleTypes, the performance and memory hit is staggering:
- you issue one query per Article to get its type
- you have as many ArticleType instances in memory as there are Articles
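To make the two costs above concrete, the following sketch counts simulated queries and distinct instances with and without instance reuse. It is plain Python with made-up data: `ARTICLE_ROWS`, `ARTICLE_TYPE_ROWS` and `load_types` are hypothetical and only mimic the loop in the example, they do not touch a database.

```python
import weakref

# Hypothetical in-memory "tables": 6 articles sharing 2 article types.
ARTICLE_TYPE_ROWS = {1: "news", 2: "opinion"}
ARTICLE_ROWS = [("a%d" % i, 1 + i % 2) for i in range(6)]


class ArticleType:
    def __init__(self, pk, name):
        self.pk = pk
        self.name = name


def load_types(reuse):
    """Resolve every article's type; return (queries, distinct instances)."""
    queries = 0
    registry = weakref.WeakValueDictionary()
    resolved = []  # keeps every resolved instance alive for counting
    for _title, type_pk in ARTICLE_ROWS:
        obj = registry.get(type_pk) if reuse else None
        if obj is None:
            queries += 1  # stands in for one SELECT on the type table
            obj = ArticleType(type_pk, ARTICLE_TYPE_ROWS[type_pk])
            registry[type_pk] = obj
        resolved.append(obj)
    return queries, len({id(o) for o in resolved})


without_reuse = load_types(reuse=False)
with_reuse = load_types(reuse=True)
```

With six articles and two types, the run without reuse performs six lookups and holds six ArticleType objects, while the run with reuse performs two lookups and holds two.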
No sharing would occur between threads, as is the case currently. Instances will be unique within each thread.
#17 currently has a working patch with the following:
- Per-model WeakValueDictionary with dict[pk_val] = instance.
- a `__call__` classmethod which either gets an instance from the dict or returns a newly created one.
- a simple interface for enabling or disabling the feature for a given Model (default is disabled).
- a slightly modified related manager which will first try to get the instance from the Model dict instead of blindly querying the DB.
- slightly modified serializers to force the instantiation when reading from an outside source.
- a slightly modified query to force flushing instances when they are deleted from the DB.
but some things are missing:
- there is no documentation on how to enable or disable this feature and what to expect from it.
- the API for enabling/disabling the feature is very crude and consists only of two class methods. It should at the very least be possible to set the value as a member of the Meta class.
- the per-model dict should be thread-local.
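One possible shape for the missing thread-local registry is sketched below. It is an assumption about how this could be done, not part of the existing patch: `registry_for` and `get_instance` are hypothetical helper names, and `ArticleType` is a plain class rather than a Django model.

```python
import threading
import weakref

# Each thread sees its own attributes on this object.
_local = threading.local()


def registry_for(model_cls):
    """Return the current thread's pk -> instance map for model_cls."""
    registries = getattr(_local, "registries", None)
    if registries is None:
        registries = _local.registries = {}
    return registries.setdefault(model_cls, weakref.WeakValueDictionary())


class ArticleType:
    def __init__(self, pk):
        self.pk = pk


def get_instance(model_cls, pk):
    """Reuse the calling thread's live instance for pk, or create one."""
    registry = registry_for(model_cls)
    obj = registry.get(pk)
    if obj is None:
        obj = registry[pk] = model_cls(pk)
    return obj


main_a = get_instance(ArticleType, 1)
main_b = get_instance(ArticleType, 1)

seen = {}


def worker():
    # Runs in another thread, so it consults a separate registry.
    seen["obj"] = get_instance(ArticleType, 1)


t = threading.Thread(target=worker)
t.start()
t.join()

shared_within_thread = main_a is main_b
shared_across_threads = seen["obj"] is main_a
```

Within the main thread both lookups return the same object, while the worker thread gets its own instance for the same primary key, which matches the stated goal that instances are unique within each thread and never shared between threads.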