Opened 18 years ago

Closed 17 years ago

#3275 closed enhancement (duplicate)

[patch] select_related() additions (depth=N, fields=[])

Reported by: David Cramer <dcramer@…> Owned by: nobody
Component: Database layer (models, ORM) Version: dev
Severity: normal Keywords: select_related()
Cc: matthias@…, gary.wilson@…, gabor@… Triage Stage: Accepted
Has patch: yes Needs documentation: yes
Needs tests: yes Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description

While I was happy with django's db api, to an extent, it did not have everything needed for the basics. So here is a quick solution to some related fields using select_related().

It adds to parameters to select_related(), depth, and fields.

depth: A numerical field and represents the recursion depth for keys, by default, django recurses infinitely on any keys that are not blank=True
fields: A list of field names in the base model to join with. It does not support children, ie relatedfieldfieldname, but I'd like to add this later, this will also set depth to 1.

I've done several unit tests (several being all of curse-gaming.com) and we're pushing the changes live on the site now. It's helping performance out quite a bit in areas where it was too difficult to manually join with just one other table, or areas where we just wanted top level results.

Attachments (3)

query.py.diff (5.9 KB ) - added by David Cramer <dcramer@…> 18 years ago.
diffs for django/db/models/query.py
test_depth_bug.diff (622 bytes ) - added by marcin@… 18 years ago.
Test case for the bug with non-zero depth
fix_depth_bug.diff (443 bytes ) - added by marcin@… 18 years ago.
Fix for the bug with non-zero depth

Download all attachments as: .zip

Change History (27)

by David Cramer <dcramer@…>, 18 years ago

Attachment: query.py.diff added

diffs for django/db/models/query.py

comment:1 by Jeremy Dunck <jdunck@…>, 18 years ago

Summary: select_related() additions (depth=N, fields=[])[patch] select_related() additions (depth=N, fields=[])

comment:2 by Gary Wilson <gary.wilson@…>, 18 years ago

Cc: gary.wilson@… added

comment:3 by David Cramer <dcramer@…>, 18 years ago

It still seems to have a bug when just doing .select_related(depth=1), sometimes its filling the field w/ the wrong data, looking into it.

comment:4 by David Cramer <dcramer@…>, 18 years ago

Ok the bug is not with my code, but select_related() in general. If your field order doesn't match django's field order it will bug. Code should be good :)

comment:5 by Chris Beaven, 18 years ago

Needs documentation: set
Needs tests: set
Patch needs improvement: set
Triage Stage: UnreviewedAccepted

If it can "bug" because of this new feature (supplying fields), then it needs improvement.

Something this big also needs tests and probably some documentation. But thanks for the first cut, David!

comment:6 by anonymous, 18 years ago

Well the bug wasn't select_related, it was my code (I believe). As I was pulling fields in the wrong order and creating an object off it

comment:7 by Michael Radziej <mir@…>, 18 years ago

#2765 is a related ticket that suggests to handle select_related() with ValuesQuerySet.

comment:8 by Michael Radziej <mir@…>, 18 years ago

Args, it was #3358 (I had too many tickets open ...) Sorry for the noise!

comment:9 by David Cramer <dcramer@…>, 18 years ago

I've come to the conclusion I don't have enough time to please everyone, if someone wants to do the docs and unit tests for this it's fine, otherwise we'll just continue to merge our branch with the live branch. It works 100% for us as we are running 300k+ visits per hour using this solution to optimize SQL queries and have been for over two weeks now.

comment:10 by Jacob, 18 years ago

(In [4645]) Added a "depth" argument to select_related() to control how many "levels" of relations select_related() is willing to follow (refs #3275).

Also added unit tests for select_related().

comment:11 by boris.erdmann@…, 18 years ago

select_related (sometimes?) produces too broad queries for *depth=1*.
In these cases the resulting data set confuses django to the point
where the according QuerySet has only empty attributes.

Changing the declaration of fill_table_cache (db/models/query.py)
from

def fill_table_cache(opts, select, tables, where, old_prefix, cache_tables_seen, max_depth=0, cur_depth=0):

to

def fill_table_cache(opts, select, tables, where, old_prefix, cache_tables_seen, max_depth=0, cur_depth=1):

Fixes this for me. So it MIGHT solve the problem generally.

Boris

in reply to:  10 comment:12 by wolfram.kriesing@…, 18 years ago

Replying to jacob:

(In [4645]) Added a "depth" argument to select_related() to control how many "levels" of relations select_related() is willing to follow (refs #3275).

Also added unit tests for select_related().

Thanks for adding this. But why not add the entire patch? The fields parameter is the one that really makes sense. Imagine:

MyModel(models.Model):
    field1 = ForeignKey()
    field2 = ForeignKey()
    field3 = ForeignKey()

But I only want field1 to be considered in select_related(), so i would call

     ...select_related(fields=["field1"]).all()

That would leave out two joins in the query, that might have a huge effect, and might reduce load by lots!
Please add the fields parameter too!
Thanks

comment:13 by David Cramer <dcramer@…>, 18 years ago

I believe the reason they didn't add it is because there could be a better way to implement it, and it's not a full-featured solution (as you can't select child tables)

comment:14 by wolfram.kriesing@…, 18 years ago

mmmh, Django is in development, so why not this feature? :-)
I see it as very useful, so get it started, please!

comment:15 by marcin@…, 18 years ago

There is a bug in the way depth is handled in fill_table_cache, see test_depth_bug.diff for a test case. When you specify a non-zero depth fill_table_cache goes one relationship too deep which causes get_cached_row not to handle all the columns from select_related.

This bug shows up when you use extra on that QuerySet: the extra fields to select should immediately follow columns handled by get_cached_row, but instead they get data from the relationship that fill_table_cache added and get_cached_row skipped.

See fix_depth_bug.diff for a solution.

by marcin@…, 18 years ago

Attachment: test_depth_bug.diff added

Test case for the bug with non-zero depth

by marcin@…, 18 years ago

Attachment: fix_depth_bug.diff added

Fix for the bug with non-zero depth

comment:16 by Philippe Raoult, 17 years ago

#3623 and #4789 are duplicates

comment:17 by Gábor Farkas <gabor@…>, 17 years ago

Cc: gabor@… added

comment:18 by Matthias Urlichs <smurf@…>, 17 years ago

Cc: matthias@… added

The fix_depth_bug.diff patch is not yet in mainline. What gives?

comment:19 by Russell Keith-Magee, 17 years ago

What gives? The same thing that 'gives' for every other contributed patch that hasn't been committed.

This is that this is a volunteer project, and the core contributors are all very busy people. Unfortunately, our priorities don't correspond with yours in this case.

It also doesn't help that it is currently triaged as 'requiring docs, needing tests and patch needs improvement'. A patch with all 3 of those labels isn't going to get much attention from anyone in a position to commit.

Further complication comes from the queryset refactor that is currently underway, which will potentially tread on the toes of this contribution.

You're free to use this patch in your own Django checkouts, and you're free to act as an advocate for including this patch in trunk by attempting to raise a productive discussion on django-developers. Complaining that your pet patch hasn't been included in trunk is neither helpful nor productive.

comment:20 by Gábor Farkas <gabor@…>, 17 years ago

unfortunately in this case i think improving the patch won't help.
there was a patch in ticket 4879, that had tests and everything, and it was not committed.

as Malcom described here: http://groups.google.com/group/django-developers/browse_thread/thread/eb3e315abdba5ce9/d0401b4d31717e70 ,

it will be fixed when the queryset-refactor branch is merged.

btw. my opinion is that this bug should get a little higher priority, because without this patch select_related(depth) simply returns wrong data,
but probably different priorities for different people :-)

p.s: please note, that the "patch needs improvement" flag is probably meant for the other patches in this ticket. the fix_depth_bug.diff patch is a one-line-patch,
which adds exactly one character. it's hard to imagine how such a patch could be improved :)

in reply to:  20 comment:21 by Russell Keith-Magee, 17 years ago

Replying to Gábor Farkas <gabor@nekomancer.net>:

btw. my opinion is that this bug should get a little higher priority, because without this patch select_related(depth) simply returns wrong data,
but probably different priorities for different people :-)

Caveat - I haven't looked into this particular issue or patch, and I probably won't have a chance to in the near future (the "what gives" comment just caught my ire).

However, you will find that the overwhelming philosophy of Django development is that once a serious problem has been identified, we put all of our efforts into fixing the underlying problem rather than patching a solution known to be broken. The core developers have limited resources, and every hour we spend identifying, inspecting and applying a partial fix is an hour we don't spend fixing the underlying problem. Edit-inline is one area where this is currently the case; anything touching querysets is another.

p.s: please note, that the "patch needs improvement" flag is probably meant for the other patches in this ticket. the fix_depth_bug.diff patch is a one-line-patch,
which adds exactly one character. it's hard to imagine how such a patch could be improved :)

It's also hard to know how the core developers are supposed to know this without tracking every single comment on every single ticket. If there is some subset of this ticket that constitutes a one character fix that needs to be made, it should be a separate ticket describing the single issue that will be resolved by applying the patch.

I should also point out that in reality, there isn't any such thing as a 'one character fix'. You need to convince me (or any other core developer) that the one character fix is (1) required and (2) correct. You can't do that in 1 character :-)

comment:22 by Matthias Urlichs <smurf@…>, 17 years ago

I'm sorry if you misunderstood my "what gives". It wasn't directed at you specifically, and I admittedly overlooked the fact that the last two patches (which I'm concerned about) address a problem that ends up being completely unrelated to the first one (to which the needs-* bits apply).

I'll re-file these patches into a new ticket.

comment:23 by Matthias Urlichs <smurf@…>, 17 years ago

See #6018 for a re-post of the select_related(depth...).extra() problem.

comment:24 by Malcolm Tredinnick, 17 years ago

Resolution: duplicate
Status: newclosed

Marking as a dupe of #6018, since the root cause has been fixed as part of that ticket.

Note: See TracTickets for help on using tickets.
Back to Top