Opened 10 years ago

Last modified 19 months ago

#23319 new Cleanup/optimization

Django uses unnecessary join in concrete inheritance

Reported by: avidi Owned by: nobody
Component: Database layer (models, ORM) Version:
Severity: Normal Keywords:
Cc: Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

In some situations django use JOIN to get data from parent model when it's unnecessary.

First case

Using select to get only fields of children.

class Parent(models.Model):
    field_of_parent = models.TextField()

class Child(Parent):
    field_of_child = models.TextField()
>>> str(Child.objects.only("field_of_child").all().query)
'SELECT "homettt_child"."parent_ptr_id", "homettt_child"."field_of_child" FROM "homettt_child" INNER JOIN "homettt_parent" ON ( "homettt_child"."parent_ptr_id" = "homettt_parent"."id" )'

Second case

Get all fields but parent hasn't any field

class EmptyParent(models.Model):
    pass

class ChildB(EmptyParent):
    field_of_child = models.TextField()
>>> str(ChildB.objects.all().query)
'SELECT "homettt_emptyparent"."id", "homettt_childb"."emptyparent_ptr_id", "homettt_childb"."field_of_child" FROM "homettt_childb" INNER JOIN "homettt_emptyparent" ON ( "homettt_childb"."emptyparent_ptr_id" = "homettt_emptyparent"."id" )'

Third case

In similar way through relations:
For example, ManyToMany relations django could use _ptr_id column to join but uses id column of parent. If you don't need any fields (or they don't exist) the join is unnecessary.

class Parent(models.Model):
    field_of_parent = models.TextField()

class Child(Parent):
    field_of_child = models.TextField()

class OtherModel(models.Model):
    m2mrelation = models.ManyToManyField(Child)
other=OtherModel.objects.create()
>>> other.m2mrelation.create()
<Child: Child object>
str(other.m2mrelation.only("field_of_child").all().query)
>>> str(other.m2mrelation.only("field_of_child").all().query)

'SELECT "homettt_child"."parent_ptr_id", "homettt_child"."field_of_child" FROM "homettt_child" INNER JOIN "homettt_othermodel_m2mrelation" ON ( "homettt_child"."parent_ptr_id" = "homettt_othermodel_m2mrelation"."child_id" ) INNER JOIN "homettt_parent" ON ( "homettt_child"."parent_ptr_id" = "homettt_parent"."id" ) WHERE "homettt_othermodel_m2mrelation"."othermodel_id" = 1'

Maybe these could be solved using _ptr_id as "default pk" in child models and only use id of parent if is necessary information of table's parent model.

Tested in 1.7rc2 but maybe this appears in previous versions.

Change History (3)

comment:1 by Tim Graham, 10 years ago

First and third cases might be valid, but if there's a possibility you could end up with a Child that doesn't have a Parent, then the join is necessary to filter out those objects and changing this would be backwards incompatible.

Second case doesn't seem like a bug as EmptyParent isn't really empty (it has the auto-generated id field).

The suggestion of "Maybe these could be solved using _ptr_id as "default pk"" is definitely backwards incompatible and would be unexpected behavior IMO. I'll leave this open so that someone with more knowledge of the ORM can point out if/why these joins are necessary (or accept the ticket if they aren't).

comment:2 by Anssi Kääriäinen, 10 years ago

Triage Stage: UnreviewedAccepted
Type: UncategorizedCleanup/optimization

I guess all of these cases could be optimized. The reasoning is that if foreign key constraints are in effect (and Django's ORM assumes that they always are), then the joins are non-necessary.

I expect fixing these require a lot of work for marginal gain. In other words this isn't a high priority item to fix.

comment:3 by Simon Charette, 19 months ago

Small note that the first case is no longer an issue but the other ones remain.

Note: See TracTickets for help on using tickets.
Back to Top