Opened 16 years ago

Closed 8 years ago

#9394 closed Cleanup/optimization (duplicate)

Reverse relation lookups with a multi-table inherited model produces extraneous queries

Reported by: Erin Kelly Owned by:
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords:
Cc: Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description (last modified by Ramiro Morales)

With the following models:

class Place(models.Model):
    name = models.CharField(max_length=50)

class Business(Place):
    owner = models.CharField(max_length=50)

class Restaurant(Business):
    rating = models.IntegerField()

class Chef(models.Model):
    name = models.CharField(max_length=50)
    restaurants = models.ManyToManyField(Restaurant, through='Employee')

class Employee(models.Model):
    restaurant = models.ForeignKey(Restaurant)
    chef = models.ForeignKey(Chef)
    years_of_service = models.IntegerField()

we can do some_restaurant.employee_set.all(), which results in three queries. The first two queries are just retrieving the attributes of the inherited Business and Place models, which is unnecessary since all that information already exists on the some_restaurant object. Only one query should be needed.

Attachments (1)

pk-trace.diff (1.6 KB ) - added by Alex Gaynor 16 years ago.
optimization, only tested on forward relations, but it's a 2 query optimization for this case, all tests pass

Download all attachments as: .zip

Change History (21)

comment:1 by Erin Kelly, 16 years ago

The occurrences of ChineseRestaurant in the models should just be Restaurant. Sorry about that.

comment:2 by Ramiro Morales, 16 years ago

Description: modified (diff)

(edited code example in the description as per Ian's note above, replacing ChineseRestaurant with Restaurant.)

comment:3 by Malcolm Tredinnick, 16 years ago

Owner: changed from nobody to Malcolm Tredinnick
Status: newassigned
Triage Stage: UnreviewedAccepted

Turns out this doesn't have anything in particular to do with many-to-many and "through". It's actually a general wart in the way reverse relations work with model inheritance. From a bit of poking around, it might be a little fiddly to get it working efficiently, too. However, I've got a test case now that demonstrates the problem, so I'll poke at it some more.

comment:4 by Malcolm Tredinnick, 16 years ago

Summary: Querying a many-to-many intermediate model from a manager on a multi-table inherited model produces extraneous queriesReverse relation lookups with a multi-table inherited model produces extraneous queries

comment:5 by JasonCreighton, 16 years ago

I'm running into what I think is the same bug. I'm using 1.0.2. Given these models:

class Base(models.Model):
    base_info = models.CharField(max_length=50)

class Extra(Base):
    extra_info = models.CharField(max_length=50)

class ExtraRelatedData(models.Model):
    extra = models.ForeignKey(Extra)
    related_data = models.CharField(max_length=50)

And some pre-populated data:

>>> e = Extra(base_info="foo", extra_info="bar")
>>> e.save()
>>> ed = ExtraRelatedData(extra=e, related_data="quux")
>>> ed.save()

If I later run:

>>> e = Extra.objects.get(pk=1)
>>> ExtraRelatedData.objects.filter(extra=e)
[<ExtraRelatedData: ExtraRelatedData object>]
>>> for q in connection.queries: print repr(q)
... 
{'time': '0.001', 'sql': u'SELECT
"inheritance_base"."id", "inheritance_base"."base_info", "inheritance_extra"."base_ptr_id", "inheritance_extra"."extra_info"
FROM "inheritance_extra"
INNER JOIN "inheritance_base" ON ("inheritance_extra"."base_ptr_id" = "inheritance_base"."id")
WHERE "inheritance_extra"."base_ptr_id" = 1 '}
{'time': '0.000', 'sql': u'SELECT
"inheritance_base"."id", "inheritance_base"."base_info"
FROM "inheritance_base" WHERE "inheritance_base"."id" = 1 '}
{'time': '0.000', 'sql': u'SELECT
"inheritance_extrarelateddata"."id", "inheritance_extrarelateddata"."extra_id", "inheritance_extrarelateddata"."related_data"
FROM "inheritance_extrarelateddata"
WHERE "inheritance_extrarelateddata"."extra_id" = 1  LIMIT 21'}

Even though "e" is already loaded, Django loads it again. (To get the pk?) I can workaround it like so:

>>> e = Extra.objects.get(pk=1)
>>> ExtraRelatedData.objects.filter(extra__pk=e.pk)
[<ExtraRelatedData: ExtraRelatedData object>]
>>> for q in connection.queries: print repr(q)
... 
{'time': '0.001', 'sql': u'SELECT
"inheritance_base"."id", "inheritance_base"."base_info", "inheritance_extra"."base_ptr_id", "inheritance_extra"."extra_info"
FROM "inheritance_extra"
INNER JOIN "inheritance_base" ON ("inheritance_extra"."base_ptr_id" = "inheritance_base"."id")
WHERE "inheritance_extra"."base_ptr_id" = 1 '}
{'time': '0.000', 'sql': u'SELECT
"inheritance_extrarelateddata"."id", "inheritance_extrarelateddata"."extra_id", "inheritance_extrarelateddata"."related_data"
FROM "inheritance_extrarelateddata"
WHERE "inheritance_extrarelateddata"."extra_id" = 1  LIMIT 21'}

[ SQL reformatted slightly ]

It seems completely wrong to me that there's a case where filter(foo=foo) is not that same as filter(foo__pk=foo.pk).

So is this the same bug, or should I open a new ticket?

by Alex Gaynor, 16 years ago

Attachment: pk-trace.diff added

optimization, only tested on forward relations, but it's a 2 query optimization for this case, all tests pass

comment:6 by Alex Gaynor, 16 years ago

On further review this patch would break when you have an subclass that has a primary key that isn't the parent link, is that allowed?

comment:7 by JasonCreighton, 16 years ago

See #10251 in relation to having a separate primary key for the child class.

comment:8 by Alex Gaynor, 16 years ago

If the decision on 10251 is that it's disallowed(which I personally favor) then the patch I supplied should work fine and remedy the issue.

comment:9 by Malcolm Tredinnick, 16 years ago

milestone: 1.1

comment:10 by JasonCreighton, 16 years ago

It's worth noting that in some circumstances, this bug can drastically increase the number of queries run. For instance, each time an instance of a model that uses inheritance is displayed in a form using a ModelChoiceField, an extra query is run. So if you display 50 items in select box, 51 queries will be run. (1 for the initial list, and an extra query for each item in the list.)

It becomes even worse if you have multiple select boxes, as you might with an inline formset, since the extra query will be run for each item in each select box, which can easily lead to hundreds of unnecessary queries being run.

comment:11 by Florian Apolloner, 16 years ago

Has patch: set

comment:12 by Jacob, 16 years ago

milestone: 1.11.2

Pushing to 1.2: this is an optimization, not a bug.

comment:13 by Alex Gaynor, 15 years ago

milestone: 1.21.3

Likewise, pushing to 1.3: this is an optimization, not a bug.

comment:14 by Julien Phalip, 14 years ago

Patch needs improvement: set
Severity: Normal
Type: Cleanup/optimization

The tests would need to be rewritten using unittests since this is now Django's preferred way.

comment:15 by Jacob, 13 years ago

milestone: 1.3

Milestone 1.3 deleted

comment:11 by Aymeric Augustin, 13 years ago

UI/UX: unset

Change UI/UX from NULL to False.

comment:12 by Aymeric Augustin, 13 years ago

Easy pickings: unset

Change Easy pickings from NULL to False.

comment:13 by Anssi Kääriäinen, 12 years ago

I just verified that this we still generate extra queries in this case.

comment:14 by Claude Paroz, 12 years ago

Owner: Malcolm Tredinnick removed
Status: assignednew

comment:15 by Tim Graham, 8 years ago

Resolution: duplicate
Status: newclosed

Looks like a duplicate of #15844.

(Fixed in Django 1.6 with 97774429aeb54df4c09895c07cd1b09e70201f7d and a test added in fa2e1e633ac2073906ed3f1f32107d02331107aa.)

Note: See TracTickets for help on using tickets.
Back to Top