The previous version of this page is obsolete due to the model syntax change. Additionally, there's now a better reference for this topic: http://www.djangoproject.com/documentation/models/subclassing/ = magic-removal: Model Inheritance = This is a proposal for how subclassing should work in Django. There a lot of details to get right, so this proposal should be very specific and detailed. Most of the ideas here come from the thread linked below: http://groups.google.com/group/django-developers/browse_frm/thread/ea5e0bf903058fac/9a68ac0d99cb6d7d?q=semantics&rnum=1#9a68ac0d99cb6d7d Here are some additional notes from Robert Wittams on allowing different storage models. http://groups.google.com/group/django-developers/browse_frm/thread/9a9a2cfacce690d/dfcd5a71ba523534?q=Just+a+few+vague+notes&rnum=1#dfcd5a71ba523534 == Subclassing == For subclassing, there are 3 main issues: 1. How do we model the relations in SQL? 2. How do joins work? 3. How does the API work? Note that I have only provided examples for single inheritance here. Is multiple inheritance worth supporting? The examples will use the following models: {{{ #!python class Place(models.Model): name = models.CharField(maxlength=50) class Restaurant(Place): description = models.TextField() class ItalianRestaurant(Restaurant): has_decent_gnocchi = models.BooleanField() }}} === 1. Modeling parent relations in SQL? === The general consesus seems to be this: {{{ #!sql CREATE TABLE "myapp_place" ( "id" integer NOT NULL PRIMARY KEY, "name" varchar(50) NOT NULL ); CREATE TABLE "myapp_restaurant" ( /* PRIMARY KEY REFERENCES "myapp_places" ("id") works for postgres, what about others? */ "id" integer NOT NULL PRIMARY KEY REFERENCES "myapp_places" ("id"), "description" text NOT NULL ); CREATE TABLE "myapp_italianrestaurant" ( "id" integer NOT NULL PRIMARY KEY REFERENCES "myapp_restaurant" ("id"), "has_decent_gnocchi" bool NOT NULL ); }}} === 2. Modeling joins in SQL === When we want a list of {{{ItalianRestaurant}}}s, we obviously need all the fields from myapp_restaurant and myapp_place as well. This could be accomplished by inner joins. It would look something like this: {{{ #!sql SELECT ... FROM myapp_italianrestaurant as ir INNER JOIN myapp_restaurant as r ON ir.restaurant_id=r.id INNER JOIN myapp_place as p ON r.place_id=p.id }}} But what if we want a list of {{{Place}}}s, what should we do? We can either ''just'' get the places: {{{ #!sql SELECT ... FROM myapp_place }}} Or we can get ''everything'' with left joins (this allows the iterator to return objects of the appropriate type, rather than just a bunch of {{{Places}}}): {{{ #!sql SELECT ... FROM myapp_place as p LEFT JOIN myapp_restaurant as r ON r.place_id=p.id LEFT JOIN myapp_italianrestaurant as ir ON ir.restaurant_id=r.id }}} Imagine we have more than one subclass of {{{Place}}} though. The join clause and the column list would get pretty hefty. This could obviously get unmanageable pretty quickly. '' I think some dbs have a maximum number of joins (something like 16), and even within the maximum, the query optimizer will either spend a while deciding which way to best join the tables or it will give up and choose the wrong way quickly. This wording is FUD-- I'll try to find specifics. --jdunck '' Another option is to lazily load objects like {{{Restaurant}}} and {{{ItalianRestaurant}}} while we're iterating over {{{Place.objects.all()}}}, but that requires a ''lot'' of database queries. Either way, doing this will be expensive, and api should reflect that. You're much better off ''just'' using {{{Place}}}s fields if you are going to iterate over {{{Place.objects.all()}}} === 3. API === The following API examples assume we have created these objects: {{{ #!python p = Place(name='My House') r = Restaurant(name='Road Kill Cafe', description='Yuck!') ir = ItalianRestaurant(name="Ristorante Miron", description="Italian's best mushrooms restaurant", has_decent_gnocchi=True) }}} For the following examples, assume {{{Place.objects.get(2)}}} returns {{{r}}} and {{{Place.objects.get(3)}}} returns {{{ir}}}. || || python || result || || A. || {{{Place.objects.count()}}} || 3 || || B. || {{{Restaurant.objects.count()}}} || 2 || || C. || {{{ItalianRestaurant.objects.count()}}} || 1 || || D. || {{{Place.objects.get(2).description}}} || 'Yuck!' or {{{AttributeError}}}? || || E. || {{{Restaurant.objects.get(2).description}}} || 'Yuck!' || == Change the current usage of subclassing == {{{ #!python class MyArticle(Article): ...fields... class META: module_name = 'my_articles' remove_fields = ...some fields... }}} would change to: {{{ #!python class MyArticle(meta.Model): ...fields... class META: copy_from = Article remove_fields = ...some fields... }}} == Ramblings on Magic Removal Subclassing from the Pycon Sprint== Consider the following variation on the above restaurant example: {{{ class Place(models.Model): name = models.CharField(maxlength=50) foo = 1 def bar(self): return 2 }}} If Restaurant were to inherit from this, it would not automatically have a 'name' CharField. This is because Django uses a metaclass to modify the default class creation behavior. The ModelBase metaclass creates a new class from scratch, and then selectively pulls items from the Place class as defined above and adds them to this new class, which allows it to handle Field objects specially. For each of the class's attributes, add_to_class() is called. If add_to_class finds a 'contribute_to_class' attribute, ModelBase knows it is dealing with a Field object, and calls contribute_to_class. Otherwise, it just adds it to the new class via setattr(). Thus, by the time the Restaurant class is created, the Place class which it inherits from actually looks more like this: {{{ class Place(models.Model): _meta = ... foo = 1 def bar(self): return 2 }}} Thus, there is simply no 'name' field for it to inherit. So, we need to have ModelBase walk through the parent classes and call contribute_to_class on each of the fields found in _meta.fields. As we walk the inheritance tree, we look for a '_meta' attribute to determine if our current node is a Model. Otherwise, it is either a mixin class or the Model class iteself. We can keep track of the parent hierarchy by creating _meta.parents, and having each ancestor of Model add to it in a recursive fasion, by adding the following to ModelBase.__new__(): {{{ new_class._meta.parents = bases for base in bases: if '_meta' in dir(base): new_class._meta.parents.extend(base._meta.parents) }}}