Django

Code

Ticket #3358 (closed: duplicate)

Opened 2 years ago

Last modified 9 months ago

select_related() should work with ValuesQuerySets

Reported by: dbr <daniel.brandt@gmail.com> Assigned to: nobody
Milestone: Component: Database wrapper
Version: Keywords:
Cc: Triage Stage: Accepted
Has patch: 0 Needs documentation: 0
Needs tests: 0 Patch needs improvement: 0

Description

(Some of you may be reading this for the second or third time since I brought it up both on the django-developers list and IRC)

There appears to be no way to make a query like this through the django models without using SQL or hitting the database twice:

SELECT
trackback.title, blog.name, blog.url
FROM trackback INNER JOIN blog 
ON trackback.blog_id = blog.id 

As far as I know, selecting with table fields is preferred above using a SELECT * any time possible.

I understand this would break the model object to table row mapping of the QuerySet (objects woulnd't be complete if some fields were missing), perhaps a new kind of QuerySet or ValueQuerySet is needed?

If this is something that would be a welcome addition to the codebase, I could probably help out.

Attachments

Change History

01/24/07 11:20:11 changed by mir@noris.de

  • needs_better_patch changed.
  • stage changed from Unreviewed to Design decision needed.
  • needs_tests changed.
  • needs_docs changed.

Hi Daniel, if you have a good idea how to do this, please post it here. It would help the ticket a lot to move forward. Also, can you please add links to the threads (in google's mail archive)?

01/25/07 14:39:55 changed by dbr <daniel.brandt@gmail.com>

The thread on django-developers contain of nothing more than my post, unfortunately.

My idea was using a ValueQuerySet, or some variant of it. The only real difference is that it gets properly filled through the join. If I try using the model with values() and select_related() this is what happens:

>>> Trackback.objects.select_related().values('blog', 'title')
[{'blog': 1L, 'title': 'Tracback entry'}, {'blog': 3L, 'title': 'Trackback entry 2'}]

.. so I'm getting the key and not the Blog model object I expected in the ValueQuerySet.

01/25/07 14:41:03 changed by dbr <daniel.brandt@gmail.com>

Hi, btw, and thanks for reading my ticket.. :-)

(follow-up: ↓ 6 ) 01/26/07 01:23:26 changed by Michael Radziej <mir@noris.de>

  • version deleted.
  • summary changed from select_related() and selecting by db table fields to select_related() should work with ValueQuerySets.

I've changed the summary to something that hopefully expresses your idea. So, what result *would* you expect? Something like the following?

>>> Trackback.objects.select_related().values('blog', 'title')
[{'blog': <Blog object>, 'title': 'Tracback entry'}, {'blog': <block object>, 'title': 'Trackback entry 2'}]

That would break compatibility, and we usually don't do this if we can avoid it. How about:

>>> Trackback.objects.select_related().values('blog__date', 'blog__author', 'title')
[{'blog': {'author': 'Mr. Sunshine', 'Date': ...}, 'title': 'Tracback entry'}}, ... }]

values('blog'} would still be allowed with the old result, but discouraged and can't be used together with the extension.

For one-to-many or many-to-many relationships, the value in the dict could be a list of dicts. Hey, what I like about this is that it would give a good way to define a limited select_related.

Can you please state your opinion and start a discussion on django-developers? This should really be discussed on the list. There's a related ticket about limiting select_related: #3275

01/26/07 01:25:19 changed by Michael Radziej <mir@noris.de>

  • summary changed from select_related() should work with ValueQuerySets to select_related() should work with ValuesQuerySets.

Oops, it's called a ValuesQuerySet. English is so obsessed with plurals ;-)

(in reply to: ↑ 4 ) 01/26/07 02:28:07 changed by dbr <daniel.brandt@gmail.com>

>>> Trackback.objects.select_related().values('blog__date', 'blog__author', 'title')
[{'blog': {'author': 'Mr. Sunshine', 'Date': ...}, 'title': 'Tracback entry'}}, ... }]

That would work.. looks like a good solution to me.

I'll try to start a new discussion on the list, now that I have this ticket to point to as well! Maybe I wasn't clear enough with my first post to the list..

12/02/07 14:13:00 changed by jacob

  • keywords set to qs-rf.
  • stage changed from Design decision needed to Accepted.

12/09/07 01:44:44 changed by mtredinnick

  • keywords deleted.
  • status changed from new to closed.
  • resolution set to duplicate.

This is essentially a duplicate of #5768, although the title of this one is a bit misleading. So I'll close this in favour of the latter.

We can probably work on this after queryset-refactor is completed, since it's not a blocker for that branch and not important enough to hold up the branch for.

By the way, the initial assumption in the description is a bit of a misnomer. We always select columns by name and, in practice, selecting all columns or just a few doesn't change query execution time much at all. Table joins, where-filtering, ordering and grouping are by far the biggest time sinks when conducting a complex query (and for simple queries, they're so fast it doesn't matter anyway). However, the overhead of constructing a Python object (and Django model) for each returned row isn't zero, so values() has some benefit and that's why it's worth implementing this eventually.


Add/Change #3358 (select_related() should work with ValuesQuerySets)




Change Properties
Action