Opened 9 years ago

Closed 8 years ago

#3275 closed enhancement (duplicate)

[patch] select_related() additions (depth=N, fields=[])

Reported by: David Cramer <dcramer@…> Owned by: nobody
Component: Database layer (models, ORM) Version: master
Severity: normal Keywords: select_related()
Cc: matthias@…, gary.wilson@…, gabor@… Triage Stage: Accepted
Has patch: yes Needs documentation: yes
Needs tests: yes Patch needs improvement: yes
Easy pickings: UI/UX:

Description

While I was happy with django's db api, to an extent, it did not have everything needed for the basics. So here is a quick solution to some related fields using select_related().

It adds to parameters to select_related(), depth, and fields.

depth: A numerical field and represents the recursion depth for keys, by default, django recurses infinitely on any keys that are not blank=True
fields: A list of field names in the base model to join with. It does not support children, ie relatedfieldfieldname, but I'd like to add this later, this will also set depth to 1.

I've done several unit tests (several being all of curse-gaming.com) and we're pushing the changes live on the site now. It's helping performance out quite a bit in areas where it was too difficult to manually join with just one other table, or areas where we just wanted top level results.

Attachments (3)

query.py.diff (5.9 KB) - added by David Cramer <dcramer@…> 9 years ago.
diffs for django/db/models/query.py
test_depth_bug.diff (622 bytes) - added by marcin@… 8 years ago.
Test case for the bug with non-zero depth
fix_depth_bug.diff (443 bytes) - added by marcin@… 8 years ago.
Fix for the bug with non-zero depth

Download all attachments as: .zip

Change History (27)

Changed 9 years ago by David Cramer <dcramer@…>

diffs for django/db/models/query.py

comment:1 Changed 9 years ago by Jeremy Dunck <jdunck@…>

  • Summary changed from select_related() additions (depth=N, fields=[]) to [patch] select_related() additions (depth=N, fields=[])

comment:2 Changed 9 years ago by Gary Wilson <gary.wilson@…>

  • Cc gary.wilson@… added

comment:3 Changed 9 years ago by David Cramer <dcramer@…>

It still seems to have a bug when just doing .select_related(depth=1), sometimes its filling the field w/ the wrong data, looking into it.

comment:4 Changed 9 years ago by David Cramer <dcramer@…>

Ok the bug is not with my code, but select_related() in general. If your field order doesn't match django's field order it will bug. Code should be good :)

comment:5 Changed 9 years ago by SmileyChris

  • Needs documentation set
  • Needs tests set
  • Patch needs improvement set
  • Triage Stage changed from Unreviewed to Accepted

If it can "bug" because of this new feature (supplying fields), then it needs improvement.

Something this big also needs tests and probably some documentation. But thanks for the first cut, David!

comment:6 Changed 9 years ago by anonymous

Well the bug wasn't select_related, it was my code (I believe). As I was pulling fields in the wrong order and creating an object off it

comment:7 Changed 9 years ago by Michael Radziej <mir@…>

#2765 is a related ticket that suggests to handle select_related() with ValuesQuerySet.

comment:8 Changed 9 years ago by Michael Radziej <mir@…>

Args, it was #3358 (I had too many tickets open ...) Sorry for the noise!

comment:9 Changed 9 years ago by David Cramer <dcramer@…>

I've come to the conclusion I don't have enough time to please everyone, if someone wants to do the docs and unit tests for this it's fine, otherwise we'll just continue to merge our branch with the live branch. It works 100% for us as we are running 300k+ visits per hour using this solution to optimize SQL queries and have been for over two weeks now.

comment:10 follow-up: Changed 8 years ago by jacob

(In [4645]) Added a "depth" argument to select_related() to control how many "levels" of relations select_related() is willing to follow (refs #3275).

Also added unit tests for select_related().

comment:11 Changed 8 years ago by boris.erdmann@…

select_related (sometimes?) produces too broad queries for *depth=1*.
In these cases the resulting data set confuses django to the point
where the according QuerySet has only empty attributes.

Changing the declaration of fill_table_cache (db/models/query.py)
from

def fill_table_cache(opts, select, tables, where, old_prefix, cache_tables_seen, max_depth=0, cur_depth=0):

to

def fill_table_cache(opts, select, tables, where, old_prefix, cache_tables_seen, max_depth=0, cur_depth=1):

Fixes this for me. So it MIGHT solve the problem generally.

Boris

comment:12 in reply to: ↑ 10 Changed 8 years ago by wolfram.kriesing@…

Replying to jacob:

(In [4645]) Added a "depth" argument to select_related() to control how many "levels" of relations select_related() is willing to follow (refs #3275).

Also added unit tests for select_related().

Thanks for adding this. But why not add the entire patch? The fields parameter is the one that really makes sense. Imagine:

MyModel(models.Model):
    field1 = ForeignKey()
    field2 = ForeignKey()
    field3 = ForeignKey()

But I only want field1 to be considered in select_related(), so i would call

     ...select_related(fields=["field1"]).all()

That would leave out two joins in the query, that might have a huge effect, and might reduce load by lots!
Please add the fields parameter too!
Thanks

comment:13 Changed 8 years ago by David Cramer <dcramer@…>

I believe the reason they didn't add it is because there could be a better way to implement it, and it's not a full-featured solution (as you can't select child tables)

comment:14 Changed 8 years ago by wolfram.kriesing@…

mmmh, Django is in development, so why not this feature? :-)
I see it as very useful, so get it started, please!

comment:15 Changed 8 years ago by marcin@…

There is a bug in the way depth is handled in fill_table_cache, see test_depth_bug.diff for a test case. When you specify a non-zero depth fill_table_cache goes one relationship too deep which causes get_cached_row not to handle all the columns from select_related.

This bug shows up when you use extra on that QuerySet: the extra fields to select should immediately follow columns handled by get_cached_row, but instead they get data from the relationship that fill_table_cache added and get_cached_row skipped.

See fix_depth_bug.diff for a solution.

Changed 8 years ago by marcin@…

Test case for the bug with non-zero depth

Changed 8 years ago by marcin@…

Fix for the bug with non-zero depth

comment:16 Changed 8 years ago by PhiR

#3623 and #4789 are duplicates

comment:17 Changed 8 years ago by Gábor Farkas <gabor@…>

  • Cc gabor@… added

comment:18 Changed 8 years ago by Matthias Urlichs <smurf@…>

  • Cc matthias@… added

The fix_depth_bug.diff patch is not yet in mainline. What gives?

comment:19 Changed 8 years ago by russellm

What gives? The same thing that 'gives' for every other contributed patch that hasn't been committed.

This is that this is a volunteer project, and the core contributors are all very busy people. Unfortunately, our priorities don't correspond with yours in this case.

It also doesn't help that it is currently triaged as 'requiring docs, needing tests and patch needs improvement'. A patch with all 3 of those labels isn't going to get much attention from anyone in a position to commit.

Further complication comes from the queryset refactor that is currently underway, which will potentially tread on the toes of this contribution.

You're free to use this patch in your own Django checkouts, and you're free to act as an advocate for including this patch in trunk by attempting to raise a productive discussion on django-developers. Complaining that your pet patch hasn't been included in trunk is neither helpful nor productive.

comment:20 follow-up: Changed 8 years ago by Gábor Farkas <gabor@…>

unfortunately in this case i think improving the patch won't help.
there was a patch in ticket 4879, that had tests and everything, and it was not committed.

as Malcom described here: http://groups.google.com/group/django-developers/browse_thread/thread/eb3e315abdba5ce9/d0401b4d31717e70 ,

it will be fixed when the queryset-refactor branch is merged.

btw. my opinion is that this bug should get a little higher priority, because without this patch select_related(depth) simply returns wrong data,
but probably different priorities for different people :-)

p.s: please note, that the "patch needs improvement" flag is probably meant for the other patches in this ticket. the fix_depth_bug.diff patch is a one-line-patch,
which adds exactly one character. it's hard to imagine how such a patch could be improved :)

comment:21 in reply to: ↑ 20 Changed 8 years ago by russellm

Replying to Gábor Farkas <gabor@nekomancer.net>:

btw. my opinion is that this bug should get a little higher priority, because without this patch select_related(depth) simply returns wrong data,
but probably different priorities for different people :-)

Caveat - I haven't looked into this particular issue or patch, and I probably won't have a chance to in the near future (the "what gives" comment just caught my ire).

However, you will find that the overwhelming philosophy of Django development is that once a serious problem has been identified, we put all of our efforts into fixing the underlying problem rather than patching a solution known to be broken. The core developers have limited resources, and every hour we spend identifying, inspecting and applying a partial fix is an hour we don't spend fixing the underlying problem. Edit-inline is one area where this is currently the case; anything touching querysets is another.

p.s: please note, that the "patch needs improvement" flag is probably meant for the other patches in this ticket. the fix_depth_bug.diff patch is a one-line-patch,
which adds exactly one character. it's hard to imagine how such a patch could be improved :)

It's also hard to know how the core developers are supposed to know this without tracking every single comment on every single ticket. If there is some subset of this ticket that constitutes a one character fix that needs to be made, it should be a separate ticket describing the single issue that will be resolved by applying the patch.

I should also point out that in reality, there isn't any such thing as a 'one character fix'. You need to convince me (or any other core developer) that the one character fix is (1) required and (2) correct. You can't do that in 1 character :-)

comment:22 Changed 8 years ago by Matthias Urlichs <smurf@…>

I'm sorry if you misunderstood my "what gives". It wasn't directed at you specifically, and I admittedly overlooked the fact that the last two patches (which I'm concerned about) address a problem that ends up being completely unrelated to the first one (to which the needs-* bits apply).

I'll re-file these patches into a new ticket.

comment:23 Changed 8 years ago by Matthias Urlichs <smurf@…>

See #6018 for a re-post of the select_related(depth...).extra() problem.

comment:24 Changed 8 years ago by mtredinnick

  • Resolution set to duplicate
  • Status changed from new to closed

Marking as a dupe of #6018, since the root cause has been fixed as part of that ticket.

Note: See TracTickets for help on using tickets.
Back to Top