The previous version of this page is obsolete due to the model syntax change. Additionally, there's now a better reference for this topic:
magic-removal: Model Inheritance
This is a proposal for how subclassing should work in Django. There are a lot of details to get right, so this proposal should be very specific and detailed. Most of the ideas here come from the thread linked below:
Here are some additional notes from Robert Wittams on allowing different storage models.
For subclassing, there are 3 main issues:
- How do we model the relations in SQL?
- How do joins work?
- How does the API work?
Note that I have only provided examples for single inheritance here. Is multiple inheritance worth supporting?
The examples will use the following models:
class Place(models.Model):
    name = models.CharField(maxlength=50)

class Restaurant(Place):
    description = models.TextField()

class ItalianRestaurant(Restaurant):
    has_decent_gnocchi = models.BooleanField()
1. Modeling parent relations in SQL
The general consensus seems to be this:
CREATE TABLE "myapp_place" (
    "id" integer NOT NULL PRIMARY KEY,
    "name" varchar(50) NOT NULL
);
CREATE TABLE "myapp_restaurant" (
    /* PRIMARY KEY REFERENCES "myapp_place" ("id") works for postgres, what about others? */
    "id" integer NOT NULL PRIMARY KEY REFERENCES "myapp_place" ("id"),
    "description" text NOT NULL
);
CREATE TABLE "myapp_italianrestaurant" (
    "id" integer NOT NULL PRIMARY KEY REFERENCES "myapp_restaurant" ("id"),
    "has_decent_gnocchi" bool NOT NULL
);
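As a partial answer to the "what about others?" question, here is a minimal sketch that runs the same DDL against SQLite through Python's standard `sqlite3` module. The helper `insert_italian_restaurant` is hypothetical (it is not part of Django); it only illustrates that saving one ItalianRestaurant means writing one row per level of the hierarchy, all sharing a single id. Note that SQLite only enforces `REFERENCES` when the `foreign_keys` pragma is turned on.

```python
import sqlite3

# In-memory database; SQLite ignores REFERENCES unless this pragma is on.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE "myapp_place" (
    "id" integer NOT NULL PRIMARY KEY,
    "name" varchar(50) NOT NULL
);
CREATE TABLE "myapp_restaurant" (
    "id" integer NOT NULL PRIMARY KEY REFERENCES "myapp_place" ("id"),
    "description" text NOT NULL
);
CREATE TABLE "myapp_italianrestaurant" (
    "id" integer NOT NULL PRIMARY KEY REFERENCES "myapp_restaurant" ("id"),
    "has_decent_gnocchi" bool NOT NULL
);
""")


def insert_italian_restaurant(conn, name, description, has_decent_gnocchi):
    """Hypothetical helper: write one row per level, all sharing one id."""
    cur = conn.execute('INSERT INTO "myapp_place" ("name") VALUES (?)', (name,))
    pk = cur.lastrowid  # id generated for the Place row, reused below
    conn.execute(
        'INSERT INTO "myapp_restaurant" ("id", "description") VALUES (?, ?)',
        (pk, description),
    )
    conn.execute(
        'INSERT INTO "myapp_italianrestaurant" ("id", "has_decent_gnocchi") '
        "VALUES (?, ?)",
        (pk, int(has_decent_gnocchi)),
    )
    return pk


pk = insert_italian_restaurant(
    conn, "Ristorante Miron", "Italian's best mushrooms restaurant", True
)

# A Restaurant row with no matching Place row should be rejected.
try:
    conn.execute(
        'INSERT INTO "myapp_restaurant" ("id", "description") VALUES (?, ?)',
        (999, "orphan"),
    )
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

This says nothing about MySQL or other backends; it only suggests the combined `PRIMARY KEY REFERENCES` column constraint is not PostgreSQL-specific.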
2. Modeling joins in SQL
When we want a list of ItalianRestaurants, we obviously need all the fields from myapp_restaurant and myapp_place as well. This could be accomplished by inner joins. It would look something like this:
SELECT ...
FROM myapp_italianrestaurant as ir
INNER JOIN myapp_restaurant as r ON ir.id=r.id
INNER JOIN myapp_place as p ON r.id=p.id
But what if we want a list of Places, what should we do? We can either just get the places:
SELECT ... FROM myapp_place
Or we can get everything with left joins (this allows the iterator to return objects of the appropriate type, rather than just a bunch of Places):
SELECT ...
FROM myapp_place as p
LEFT JOIN myapp_restaurant as r ON r.id=p.id
LEFT JOIN myapp_italianrestaurant as ir ON ir.id=r.id
Imagine we have more than one subclass of Place though. The join clause and the column list would get pretty hefty. This could obviously get unmanageable pretty quickly.
I think some dbs have a maximum number of joins (something like 16), and even within the maximum, the query optimizer will either spend a while deciding which way to best join the tables or it will give up and choose the wrong way quickly. This wording is FUD-- I'll try to find specifics. --jdunck
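To make the left-join option concrete, here is a rough sketch using Python's standard `sqlite3` module. It assumes the child tables reuse the parent's id as their primary key (as in the DDL in section 1), so the joins are on `id`. The function `most_specific_type` is a hypothetical name, not a proposed API; it shows how an iterator could pick the class for each row from which joined ids came back non-NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE myapp_place (
    id integer NOT NULL PRIMARY KEY,
    name varchar(50) NOT NULL
);
CREATE TABLE myapp_restaurant (
    id integer NOT NULL PRIMARY KEY REFERENCES myapp_place (id),
    description text NOT NULL
);
CREATE TABLE myapp_italianrestaurant (
    id integer NOT NULL PRIMARY KEY REFERENCES myapp_restaurant (id),
    has_decent_gnocchi bool NOT NULL
);
INSERT INTO myapp_place VALUES (1, 'My House');
INSERT INTO myapp_place VALUES (2, 'Road Kill Cafe');
INSERT INTO myapp_place VALUES (3, 'Ristorante Miron');
INSERT INTO myapp_restaurant VALUES (2, 'Yuck!');
INSERT INTO myapp_restaurant VALUES (3, 'Italian''s best mushrooms restaurant');
INSERT INTO myapp_italianrestaurant VALUES (3, 1);
""")

# One query returns every Place plus whatever subclass columns exist for it.
rows = conn.execute("""
    SELECT p.id, p.name, r.id, r.description, ir.id, ir.has_decent_gnocchi
    FROM myapp_place AS p
    LEFT JOIN myapp_restaurant AS r ON r.id = p.id
    LEFT JOIN myapp_italianrestaurant AS ir ON ir.id = r.id
    ORDER BY p.id
""").fetchall()


def most_specific_type(row):
    """Hypothetical dispatch: the deepest table with a matching row wins."""
    _, _, restaurant_id, _, italian_id, _ = row
    if italian_id is not None:
        return "ItalianRestaurant"
    if restaurant_id is not None:
        return "Restaurant"
    return "Place"


typed = [(row[1], most_specific_type(row)) for row in rows]
```

Every additional subclass of Place adds another LEFT JOIN, another group of columns to the SELECT list, and another branch to the dispatch, which is the growth problem described above.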
Another option is to lazily load objects like ItalianRestaurant while we're iterating over Place.objects.all(), but that requires a lot of database queries. Either way, doing this will be expensive, and the API should reflect that. You're much better off just using Place's fields if you are going to iterate over all Place objects.
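To give a feel for the cost of the lazy option, here is a small sketch that counts queries, again using Python's standard `sqlite3` module and assuming child tables share the parent's id. The `run` wrapper and `lazy_places` function are hypothetical names used only for this illustration; a real implementation would look different.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE myapp_place (id integer NOT NULL PRIMARY KEY, name varchar(50) NOT NULL);
CREATE TABLE myapp_restaurant (id integer NOT NULL PRIMARY KEY, description text NOT NULL);
CREATE TABLE myapp_italianrestaurant (id integer NOT NULL PRIMARY KEY, has_decent_gnocchi bool NOT NULL);
INSERT INTO myapp_place VALUES (1, 'My House');
INSERT INTO myapp_place VALUES (2, 'Road Kill Cafe');
INSERT INTO myapp_place VALUES (3, 'Ristorante Miron');
INSERT INTO myapp_restaurant VALUES (2, 'Yuck!');
INSERT INTO myapp_restaurant VALUES (3, 'Italian''s best mushrooms restaurant');
INSERT INTO myapp_italianrestaurant VALUES (3, 1);
""")

stats = {"queries": 0}


def run(sql, params=()):
    """Execute a query and count it, so the total cost is visible."""
    stats["queries"] += 1
    return conn.execute(sql, params).fetchall()


def lazy_places():
    """Fetch plain places first, then probe each subclass table per row."""
    results = []
    for pk, name in run("SELECT id, name FROM myapp_place ORDER BY id"):
        kind = "Place"
        if run("SELECT 1 FROM myapp_restaurant WHERE id = ?", (pk,)):
            kind = "Restaurant"
            # Only probe the deeper table when the parent level matched.
            if run("SELECT 1 FROM myapp_italianrestaurant WHERE id = ?", (pk,)):
                kind = "ItalianRestaurant"
        results.append((name, kind))
    return results


places = lazy_places()
```

With three places this already issues six queries (one for the list, one probe for the plain place, two for each restaurant), and the count grows with both the number of rows and the depth of the hierarchy.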
The following API examples assume we have created these objects:
p = Place(name='My House')
r = Restaurant(name='Road Kill Cafe', description='Yuck!')
ir = ItalianRestaurant(name="Ristorante Miron", description="Italian's best mushrooms restaurant", has_decent_gnocchi=True)
Change the current usage of subclassing
class MyArticle(Article):
    ...fields...
    class META:
        module_name = 'my_articles'
        remove_fields = ...some fields...
would change to:
class MyArticle(meta.Model):
    ...fields...
    class META:
        copy_from = Article
        remove_fields = ...some fields...
Ramblings on Magic Removal Subclassing
For the above Restaurant example:
class Place(models.Model):
    name = models.CharField(maxlength=50)

class Restaurant(Place):
    description = models.TextField()
we want Restaurant to have a 'name' CharField. Looking at our above example, it would seem that 'name' should automatically be inherited by Restaurant. However, this is not the case, as Django is using metaclasses to modify the default class creation behavior. 'Place' is not created strictly as defined above. The ModelBase metaclass instead creates a new class from scratch (see dm/models/base.py), and each of the field attributes is added to the class in such a way that it is not inherited by subclasses. Note how 'name' would not show up under a call to dir(Place).
Each of the fields is added to the class via a call to add_to_class(). This in turn calls contribute_to_class in the case of field objects, rather than calling setattr(), which is why the fields of a parent class are not available to the child class for inheriting. In other words, by the time Restaurant is created, the definition of its parent would change from this:
class Place(models.Model):
    foo = 1
    def bar(self):
        return 2
    name = models.CharField(maxlength=50)
to something more like this:
class Place(models.Model):
    foo = 1
    def bar(self):
        return 2
    _meta = ...
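The effect can be reproduced without Django. The following is a heavily simplified stand-in, not Django's actual ModelBase: the `Field`, `Options`, `ModelBase` and `Model` classes below are toy versions written for this illustration, and only the names add_to_class and contribute_to_class are taken from the description above. It shows why ordinary attributes like foo and bar are inherited while fields are not.

```python
class Field:
    """Toy stand-in for a Django field (an assumption, not the real class)."""

    def contribute_to_class(self, cls, name):
        self.name = name
        # The field is recorded in _meta only; it never becomes a class
        # attribute, so Python's normal inheritance cannot see it.
        cls._meta.fields.append(self)


class Options:
    """Toy stand-in for the object stored as _meta."""

    def __init__(self):
        self.fields = []


class ModelBase(type):
    def __new__(mcs, name, bases, attrs):
        # Start from a namespace holding only the dunder entries, then add
        # everything else one attribute at a time.
        dunders = {k: v for k, v in attrs.items() if k.startswith("__")}
        new_class = super().__new__(mcs, name, bases, dunders)
        new_class._meta = Options()  # fresh per class, nothing copied over
        for attr_name, value in attrs.items():
            if not attr_name.startswith("__"):
                new_class.add_to_class(attr_name, value)
        return new_class

    def add_to_class(cls, name, value):
        if hasattr(value, "contribute_to_class"):
            value.contribute_to_class(cls, name)
        else:
            setattr(cls, name, value)


class Model(metaclass=ModelBase):
    pass


class Place(Model):
    foo = 1

    def bar(self):
        return 2

    name = Field()


class Restaurant(Place):
    description = Field()


place_fields = [f.name for f in Place._meta.fields]
restaurant_fields = [f.name for f in Restaurant._meta.fields]
```

In this sketch `Restaurant.foo` and `Restaurant().bar()` work through normal inheritance, but `restaurant_fields` contains only 'description'. Any inheritance scheme therefore has to copy or link the parent's fields explicitly when the child class is built.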