Opened 10 years ago
Last modified 2 years ago
#23319 new Cleanup/optimization
Django uses unnecessary join in concrete inheritance
Reported by: | avidi | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | |
Severity: | Normal | Keywords: | |
Cc: | Triage Stage: | Accepted | |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
In some situations django use JOIN to get data from parent model when it's unnecessary.
First case
Using select to get only fields of children.
class Parent(models.Model): field_of_parent = models.TextField() class Child(Parent): field_of_child = models.TextField()
>>> str(Child.objects.only("field_of_child").all().query) 'SELECT "homettt_child"."parent_ptr_id", "homettt_child"."field_of_child" FROM "homettt_child" INNER JOIN "homettt_parent" ON ( "homettt_child"."parent_ptr_id" = "homettt_parent"."id" )'
Second case
Get all fields but parent hasn't any field
class EmptyParent(models.Model): pass class ChildB(EmptyParent): field_of_child = models.TextField()
>>> str(ChildB.objects.all().query) 'SELECT "homettt_emptyparent"."id", "homettt_childb"."emptyparent_ptr_id", "homettt_childb"."field_of_child" FROM "homettt_childb" INNER JOIN "homettt_emptyparent" ON ( "homettt_childb"."emptyparent_ptr_id" = "homettt_emptyparent"."id" )'
Third case
In similar way through relations:
For example, ManyToMany relations django could use _ptr_id column to join but uses id column of parent. If you don't need any fields (or they don't exist) the join is unnecessary.
class Parent(models.Model): field_of_parent = models.TextField() class Child(Parent): field_of_child = models.TextField() class OtherModel(models.Model): m2mrelation = models.ManyToManyField(Child)
other=OtherModel.objects.create() >>> other.m2mrelation.create() <Child: Child object> str(other.m2mrelation.only("field_of_child").all().query) >>> str(other.m2mrelation.only("field_of_child").all().query) 'SELECT "homettt_child"."parent_ptr_id", "homettt_child"."field_of_child" FROM "homettt_child" INNER JOIN "homettt_othermodel_m2mrelation" ON ( "homettt_child"."parent_ptr_id" = "homettt_othermodel_m2mrelation"."child_id" ) INNER JOIN "homettt_parent" ON ( "homettt_child"."parent_ptr_id" = "homettt_parent"."id" ) WHERE "homettt_othermodel_m2mrelation"."othermodel_id" = 1'
Maybe these could be solved using _ptr_id as "default pk" in child models and only use id of parent if is necessary information of table's parent model.
Tested in 1.7rc2 but maybe this appears in previous versions.
Change History (3)
comment:1 by , 10 years ago
comment:2 by , 10 years ago
Triage Stage: | Unreviewed → Accepted |
---|---|
Type: | Uncategorized → Cleanup/optimization |
I guess all of these cases could be optimized. The reasoning is that if foreign key constraints are in effect (and Django's ORM assumes that they always are), then the joins are non-necessary.
I expect fixing these require a lot of work for marginal gain. In other words this isn't a high priority item to fix.
comment:3 by , 2 years ago
Small note that the first case is no longer an issue but the other ones remain.
First and third cases might be valid, but if there's a possibility you could end up with a
Child
that doesn't have aParent
, then the join is necessary to filter out those objects and changing this would be backwards incompatible.Second case doesn't seem like a bug as
EmptyParent
isn't really empty (it has the auto-generatedid
field).The suggestion of "Maybe these could be solved using _ptr_id as "default pk"" is definitely backwards incompatible and would be unexpected behavior IMO. I'll leave this open so that someone with more knowledge of the ORM can point out if/why these joins are necessary (or accept the ticket if they aren't).