

Version 1 (modified by PhiR, 6 years ago)

--

Single Instance

This page describes the issues and proposals related to the fact that Django's ORM creates as many instances of the same DB object as there are requests for it.

Issues

  • #5514 describes what I think is the main issue with the current behavior of the ORM: having multiple instances of the same object can result in silent data loss. Even worse, the actual result depends on whether or not select_related was used.
  • The original goal of #17 was to reduce memory use by reusing an existing instance instead of creating a new one.
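As an illustration of the first issue, here is a minimal sketch of one way two instances of the same row can lose data silently. It is not necessarily the exact scenario reported in #5514. It does not use Django: `TABLE` and `Row` are hypothetical stand-ins, and the sketch assumes that `save()` writes back every field of the instance.

```python
# In-memory stand-in for a database table: {pk: row}. Not Django code.
TABLE = {1: {"title": "Draft", "body": "old text"}}


class Row:
    """Hypothetical model-like object that saves all of its fields at once."""

    def __init__(self, pk):
        self.pk = pk
        self.__dict__.update(TABLE[pk])   # copy the row as it is right now

    def save(self):
        # Writes every field back, including ones this instance never changed.
        TABLE[self.pk] = {"title": self.title, "body": self.body}


a = Row(1)          # first instance of row 1
b = Row(1)          # second, independent instance of the same row

a.title = "Final"
a.save()            # the table now holds title="Final"

b.body = "new text"
b.save()            # b still carries title="Draft" and writes it back

final_row = TABLE[1]   # a's title change has been overwritten, with no error
```

If `a` and `b` were the same object, the second `save()` would have written both changes.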

Proposals

The basic idea of #17 is simply to reuse existing instances. This works by keeping track of instantiated objects on a per-model basis. Whenever we try to instantiate a model, we check whether an instance is already in memory and reuse it if so. This would solve both issues mentioned above. Please note that the proposal is absolutely NOT a caching system: whenever an instance goes out of scope, it is still discarded.
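The mechanism described above can be sketched in plain Python with a weak-value registry per model class. This is only an illustration under stated assumptions, not the code of the #17 patch: `SingleInstanceModel`, `from_row` and `_instances` are hypothetical names, and the use of `weakref.WeakValueDictionary` is one possible way to get "reuse while alive, discard when out of scope".

```python
import gc
import weakref


class SingleInstanceModel:
    """Hypothetical base class: at most one live instance per (class, pk)."""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # One registry per model class. Weak values: an entry disappears
        # once nothing else references the instance, so this is not a cache.
        cls._instances = weakref.WeakValueDictionary()

    @classmethod
    def from_row(cls, pk, **fields):
        """Return the live instance for pk if there is one, else build it."""
        instance = cls._instances.get(pk)
        if instance is None:
            instance = cls()
            instance.pk = pk
            instance.__dict__.update(fields)
            cls._instances[pk] = instance
        # An existing instance is returned as is; it is not refreshed.
        return instance


class ArticleType(SingleInstanceModel):
    pass


first = ArticleType.from_row(1, name="News")
second = ArticleType.from_row(1, name="News")
same_object = first is second             # both names refer to one instance
live_before = len(ArticleType._instances)

del first, second                         # last strong references go away
gc.collect()                              # for interpreters without refcounting
live_after = len(ArticleType._instances)  # the registry entry is gone too
```

Because the registry holds only weak references, it never keeps an object alive on its own, which matches the "not a caching system" requirement.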

To understand the benefits of the proposed patch, let's see a small example:

class ArticleType(models.Model):
    name = models.CharField(maxlength=200)
    categories = models.CharField(maxlength=200)

class Article(models.Model):
    title = models.CharField(maxlength=200)
    type_of_article = models.ForeignKey(ArticleType)

for article in Article.objects.all():
    print "%s (%s)" % (article.title, article.type_of_article.name)

If you have a large number of Articles and a smaller number of ArticleTypes, the performance/memory hit is staggering:

  • you issue one request per Article to fetch its ArticleType
  • you have as many ArticleType instances in memory as there are Articles
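The two costs listed above can be made concrete with a small simulation. It does not touch a database: `fetch_type` is a hypothetical stand-in for the query that loads an ArticleType, the numbers (1000 articles, 5 types) are made up, and it assumes that the reuse check happens before the query is sent.

```python
import weakref

QUERIES = 0  # counts simulated database round trips


class ArticleType:
    def __init__(self, pk):
        self.pk = pk


def fetch_type(pk):
    """Simulated query: always builds a new ArticleType."""
    global QUERIES
    QUERIES += 1
    return ArticleType(pk)


_live = weakref.WeakValueDictionary()  # live instances only, keyed by pk


def fetch_type_reusing(pk):
    """Check for a live instance first; query only on a miss."""
    obj = _live.get(pk)
    if obj is None:
        obj = fetch_type(pk)
        _live[pk] = obj
    return obj


# 1000 articles spread over 5 types (made-up numbers).
type_ids = [i % 5 for i in range(1000)]

without_reuse = [fetch_type(pk) for pk in type_ids]
queries_without = QUERIES
instances_without = len({id(t) for t in without_reuse})

QUERIES = 0
with_reuse = [fetch_type_reusing(pk) for pk in type_ids]
queries_with = QUERIES
instances_with = len({id(t) for t in with_reuse})
```

The saving only holds while something still references the instances, as the lists do here. Once they go out of scope, the next lookup queries again.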

Implementation

#17 currently has a working patch doing the following:

but some things are missing:

  • more detailed docs
  • API cleanup

Threads

The current #17 patch does not handle threads, but the design calls for per-thread instance uniqueness only. No instances would be shared between threads, which matches the current behavior.
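Per-thread uniqueness could be obtained by giving each thread its own registry. The sketch below uses `threading.local` from the standard library for that. It is an assumption about how the design might be realized, not part of the current patch, and `registry`, `Record` and `get_instance` are hypothetical names. A real version would key on the model as well as the primary key.

```python
import threading
import weakref

# Each thread sees its own attributes on this object.
_local = threading.local()


def registry():
    """Return the calling thread's registry, creating it on first use."""
    if not hasattr(_local, "instances"):
        _local.instances = weakref.WeakValueDictionary()
    return _local.instances


class Record:
    """Hypothetical stand-in for a model instance."""

    def __init__(self, pk):
        self.pk = pk


def get_instance(pk):
    """Reuse the instance for pk within the current thread only."""
    instance = registry().get(pk)
    if instance is None:
        instance = Record(pk)
        registry()[pk] = instance
    return instance


main_obj = get_instance(1)
results = {}


def worker():
    obj = get_instance(1)
    results["reused_in_thread"] = obj is get_instance(1)
    results["shared_with_main"] = obj is main_obj


t = threading.Thread(target=worker)
t.start()
t.join()
```

Within a thread the same object comes back every time, while the worker thread and the main thread each get their own instance of row 1, so no locking is needed around the registries.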