Opened 8 years ago

Closed 7 years ago

#3358 closed (duplicate)

select_related() should work with ValuesQuerySets

Reported by: dbr <daniel.brandt@…> Owned by: nobody
Component: Database layer (models, ORM) Version:
Severity: Keywords:
Cc: Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: UI/UX:

Description

(Some of you may be reading this for the second or third time since I brought it up both on the django-developers list and IRC)

There appears to be no way to make a query like this through the django models without using SQL or hitting the database twice:

SELECT
trackback.title, blog.name, blog.url
FROM trackback INNER JOIN blog 
ON trackback.blog_id = blog.id 

As far as I know, selecting with table fields is preferred above using a SELECT * any time possible.

I understand this would break the model object to table row mapping of the QuerySet (objects woulnd't be complete if some fields
were missing), perhaps a new kind of QuerySet or ValueQuerySet is needed?

If this is something that would be a welcome addition to the codebase, I could probably help out.

Change History (8)

comment:1 Changed 8 years ago by mir@…

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Triage Stage changed from Unreviewed to Design decision needed

Hi Daniel, if you have a good idea how to do this, please post it here. It would help the ticket a lot to move forward. Also, can you please add links to the threads (in google's mail archive)?

comment:2 Changed 8 years ago by dbr <daniel.brandt@…>

The thread on django-developers contain of nothing more than my post, unfortunately.

My idea was using a ValueQuerySet, or some variant of it. The only real difference is that it gets properly filled through the join.
If I try using the model with values() and select_related() this is what happens:

>>> Trackback.objects.select_related().values('blog', 'title')
[{'blog': 1L, 'title': 'Tracback entry'}, {'blog': 3L, 'title': 'Trackback entry 2'}]

.. so I'm getting the key and not the Blog model object I expected in the ValueQuerySet.

comment:3 Changed 8 years ago by dbr <daniel.brandt@…>

Hi, btw, and thanks for reading my ticket.. :-)

comment:4 follow-up: Changed 8 years ago by Michael Radziej <mir@…>

  • Summary changed from select_related() and selecting by db table fields to select_related() should work with ValueQuerySets
  • Version 0.95 deleted

I've changed the summary to something that hopefully expresses your idea. So, what result *would* you expect? Something like the following?

>>> Trackback.objects.select_related().values('blog', 'title')
[{'blog': <Blog object>, 'title': 'Tracback entry'}, {'blog': <block object>, 'title': 'Trackback entry 2'}]

That would break compatibility, and we usually don't do this if we can avoid it. How about:

>>> Trackback.objects.select_related().values('blog__date', 'blog__author', 'title')
[{'blog': {'author': 'Mr. Sunshine', 'Date': ...}, 'title': 'Tracback entry'}}, ... }]

values('blog'} would still be allowed with the old result, but discouraged and can't be used together with the extension.

For one-to-many or many-to-many relationships, the value in the dict could be a list of dicts. Hey, what I like about this is that it would give a good way to define a limited select_related.

Can you please state your opinion and start a discussion on django-developers? This should really be discussed on the list.
There's a related ticket about limiting select_related: #3275

comment:5 Changed 8 years ago by Michael Radziej <mir@…>

  • Summary changed from select_related() should work with ValueQuerySets to select_related() should work with ValuesQuerySets

Oops, it's called a ValuesQuerySet. English is so obsessed with plurals ;-)

comment:6 in reply to: ↑ 4 Changed 8 years ago by dbr <daniel.brandt@…>

>>> Trackback.objects.select_related().values('blog__date', 'blog__author', 'title')
[{'blog': {'author': 'Mr. Sunshine', 'Date': ...}, 'title': 'Tracback entry'}}, ... }]


That would work.. looks like a good solution to me.

I'll try to start a new discussion on the list, now that I have this ticket to point to as well! Maybe I wasn't clear enough with my first post to the list..

comment:7 Changed 7 years ago by jacob

  • Keywords qs-rf added
  • Triage Stage changed from Design decision needed to Accepted

comment:8 Changed 7 years ago by mtredinnick

  • Keywords qs-rf removed
  • Resolution set to duplicate
  • Status changed from new to closed

This is essentially a duplicate of #5768, although the title of this one is a bit misleading. So I'll close this in favour of the latter.

We can probably work on this after queryset-refactor is completed, since it's not a blocker for that branch and not important enough to hold up the branch for.

By the way, the initial assumption in the description is a bit of a misnomer. We always select columns by name and, in practice, selecting all columns or just a few doesn't change query execution time much at all. Table joins, where-filtering, ordering and grouping are by far the biggest time sinks when conducting a complex query (and for simple queries, they're so fast it doesn't matter anyway). However, the overhead of constructing a Python object (and Django model) for each returned row isn't zero, so values() has some benefit and that's why it's worth implementing this eventually.

Note: See TracTickets for help on using tickets.
Back to Top