Opened 11 years ago

Closed 11 years ago

Last modified 8 years ago

#7246 closed (fixed)

select_related doesn't join parent chain for ForeignKeys on inherited models

Reported by: Ivan Sagalaev <Maniac@…> Owned by: Jacob
Component: Database layer (models, ORM) Version: master
Severity: Keywords: qsrf-cleanup select_related inheritance
Cc: Malcolm Tredinnick Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no


Given this model where Pizza refers to ItalianRestaurant which is inherited from Restaurant:

class Restaurant(models.Model):
    # ...

class ItalianRestaurant(Restaurant):
    # ...

class Pizza(models.Model):
    restaurant = models.ForeignKey(ItalianRestaurant)

Queries to Pizza with select_related don't join Restaurant table and try to attach its fields to ItalianRestaurant table instead. Which results in wrong SQL.

Patch with tests follows. Though I believe that the patch is correct, logic of fill_related_selections seems to play on coincidental similarities between parent links and ordinary links and it may be better to refactor those into separate code paths. But I didn't penetrate it deep enough to do it.

Attachments (3)

7246.diff (2.7 KB) - added by Ivan Sagalaev <Maniac@…> 11 years ago.
7246-model-inheritance-select-related.patch (3.3 KB) - added by Jacob 11 years ago.
7426_inherited_models_select_related_r7722.patch (4.4 KB) - added by George Vilches 11 years ago.
Patch against r7722 to make inherited models .select_related() safe, reference implementation.

Download all attachments as: .zip

Change History (11)

Changed 11 years ago by Ivan Sagalaev <Maniac@…>

Attachment: 7246.diff added


comment:1 Changed 11 years ago by George Vilches

Keywords: qsrf-cleanup added

comment:2 Changed 11 years ago by Jacob

milestone: 1.0

comment:3 Changed 11 years ago by Jacob

Owner: changed from nobody to Jacob
Status: newassigned
Triage Stage: UnreviewedAccepted

comment:4 Changed 11 years ago by Jacob

Status: assignednew

I've refreshed this patch against trunk and took the opportunity to move the test into its own regression test module. Unfortunately, the patch doesn't seem to fix the problem:

FAIL: Doctest: regressiontests.model_inheritance_select_related.models.__test__.API_TESTS
Traceback (most recent call last):
  File "/Users/jacob/Projects/Django/upstream/django/test/", line 2180, in runTest
    raise self.failureException(self.format_failure(new.getvalue()))
AssertionError: Failed doctest test for regressiontests.model_inheritance_select_related.models.__test__.API_TESTS
  File "/Users/jacob/Projects/Django/upstream/tests/regressiontests/model_inheritance_select_related/", line unknown line number, in API_TESTS

File "/Users/jacob/Projects/Django/upstream/tests/regressiontests/model_inheritance_select_related/", line ?, in regressiontests.model_inheritance_select_related.models.__test__.API_TESTS
Failed example:

As you can see, something's wonky with the list of selected fields; attributes are getting the wrong data.

Changed 11 years ago by George Vilches

Patch against r7722 to make inherited models .select_related() safe, reference implementation.

comment:5 Changed 11 years ago by George Vilches

The patch I have just provided is a reference implementation. Although it does successfully pass all the existing Django unit tests, I believe it exposes some larger flaws in the way model inheritance is treated, select_related() and otherwise, that might need to be reviewed at a much higher level. That's not to say this isn't safe to check in (I would suggest removing the comments about hacks first, but as a reference implementation I left them in to trigger discourse), in fact I would recommend it if for no other reason than to get more eyes on using inherited models, as this is somewhat of a blocker for anyone doing more than very basic tasks with them, but that separate from checking in we may want to review how inherited models are handled in the ORM.

Here's the things that I found touchy:

1) A few code paths are currently blocked by checks for Field.rel.parent_link, and it appears this is done to avoid double-inclusion of some fields. However, it seems that if more of the SQL internals were to switch to using _meta.local_fields more regularly, this would be a safe activity; however...

2) By using _meta.local_fields, the way the fields are pulled out of the database generally ends up in an unintuitive order in any place where recursion plays a role (primarily fill_related_selections()). Instead of getting the fields the way they are ordered in the parent model, they get them as they inherit down the tree. This is rather tedious to work with when we get to get_cached_row().

3) In fill_related_selections, the bit that does {{ self.related_select_cols.extend(...) }} is a tremendous cheat that causes us problems down the road with the field ordering above. If we put all fields in there ({{ _meta.fields }}), they have to share an alias, which causes an invalid table choice for the inherited fields. If we only put the {{ ._meta.local_fields }} in there, then we have the ordering problem. One solution to fix this might be that we do the {{ self.related_select_* }} on the recursion against the .parent_link, but I have not yet explored the implications of changing it in that manner (for instance, this population would have to occur before the depth checks). Another solution would be to replace the .extend with a loop that does all the expected alias generation if we detect that we are in an inherited model case, to make sure the right table prefixes are attached, but this has performance implications at the very least.

I'll be giving the inherited models and relationship population a more thorough going over separate from this ticket, but this patch at least does successfully handle all the unit tests currently available, and with a bit of touchup should probably be a good stopgap measure.

comment:6 Changed 11 years ago by Malcolm Tredinnick

Resolution: fixed
Status: newclosed

(In [7781]) Fixed #7246 -- Pull in the all the necessary data when using select_related() with multi-table inheritance.

comment:7 Changed 11 years ago by Malcolm Tredinnick

A couple of comments on the final fix for this:

@jacob : Dude, your test was wrong. Please try harder next time. :-)

@gav: I think you're missing the point for using _meta.fields in the (very few) places we do. The query construction code goes to lengths to make sure that what it passes back to django.db.models.query.QuerySet can be used as positional arguments for creating a model class. This means all the parent fields have to be in order from oldest to youngest. We only use _meta.fields in three places in the query construction code, so it's not really a problem: once in the iterator, when things are already organised, once as a sanity check in setup_joins() and once in fill_related_selections() to populate the field names (we were previously using it in four places, which was the bogus column creation in fill_related_selections(), but that was what this bug was about.

If creating things via keyword arguments wasn't so much slower than using positional args, we could treat parent classes as compulsory select_related() columns, but the performance difference is measurable and what we've got now should be fairly robust.

comment:8 Changed 8 years ago by Jacob

milestone: 1.0

Milestone 1.0 deleted

Note: See TracTickets for help on using tickets.
Back to Top