Opened 17 years ago

Closed 16 years ago

#3358 closed (duplicate)

select_related() should work with ValuesQuerySets

Reported by: dbr <daniel.brandt@…> Owned by: nobody
Component: Database layer (models, ORM) Version:
Severity: Keywords:
Cc: Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

(Some of you may be reading this for the second or third time since I brought it up both on the django-developers list and IRC)

There appears to be no way to make a query like this through the django models without using SQL or hitting the database twice:

SELECT
trackback.title, blog.name, blog.url
FROM trackback INNER JOIN blog 
ON trackback.blog_id = blog.id 

As far as I know, selecting with table fields is preferred above using a SELECT * any time possible.

I understand this would break the model object to table row mapping of the QuerySet (objects woulnd't be complete if some fields
were missing), perhaps a new kind of QuerySet or ValueQuerySet is needed?

If this is something that would be a welcome addition to the codebase, I could probably help out.

Change History (8)

comment:1 by mir@…, 17 years ago

Triage Stage: UnreviewedDesign decision needed

Hi Daniel, if you have a good idea how to do this, please post it here. It would help the ticket a lot to move forward. Also, can you please add links to the threads (in google's mail archive)?

comment:2 by dbr <daniel.brandt@…>, 17 years ago

The thread on django-developers contain of nothing more than my post, unfortunately.

My idea was using a ValueQuerySet, or some variant of it. The only real difference is that it gets properly filled through the join.
If I try using the model with values() and select_related() this is what happens:

>>> Trackback.objects.select_related().values('blog', 'title')
[{'blog': 1L, 'title': 'Tracback entry'}, {'blog': 3L, 'title': 'Trackback entry 2'}]

.. so I'm getting the key and not the Blog model object I expected in the ValueQuerySet.

comment:3 by dbr <daniel.brandt@…>, 17 years ago

Hi, btw, and thanks for reading my ticket.. :-)

comment:4 by Michael Radziej <mir@…>, 17 years ago

Summary: select_related() and selecting by db table fieldsselect_related() should work with ValueQuerySets
Version: 0.95

I've changed the summary to something that hopefully expresses your idea. So, what result *would* you expect? Something like the following?

>>> Trackback.objects.select_related().values('blog', 'title')
[{'blog': <Blog object>, 'title': 'Tracback entry'}, {'blog': <block object>, 'title': 'Trackback entry 2'}]

That would break compatibility, and we usually don't do this if we can avoid it. How about:

>>> Trackback.objects.select_related().values('blog__date', 'blog__author', 'title')
[{'blog': {'author': 'Mr. Sunshine', 'Date': ...}, 'title': 'Tracback entry'}}, ... }]

values('blog'} would still be allowed with the old result, but discouraged and can't be used together with the extension.

For one-to-many or many-to-many relationships, the value in the dict could be a list of dicts. Hey, what I like about this is that it would give a good way to define a limited select_related.

Can you please state your opinion and start a discussion on django-developers? This should really be discussed on the list.
There's a related ticket about limiting select_related: #3275

comment:5 by Michael Radziej <mir@…>, 17 years ago

Summary: select_related() should work with ValueQuerySetsselect_related() should work with ValuesQuerySets

Oops, it's called a ValuesQuerySet. English is so obsessed with plurals ;-)

in reply to:  4 comment:6 by dbr <daniel.brandt@…>, 17 years ago

>>> Trackback.objects.select_related().values('blog__date', 'blog__author', 'title')
[{'blog': {'author': 'Mr. Sunshine', 'Date': ...}, 'title': 'Tracback entry'}}, ... }]


That would work.. looks like a good solution to me.

I'll try to start a new discussion on the list, now that I have this ticket to point to as well! Maybe I wasn't clear enough with my first post to the list..

comment:7 by Jacob, 16 years ago

Keywords: qs-rf added
Triage Stage: Design decision neededAccepted

comment:8 by Malcolm Tredinnick, 16 years ago

Keywords: qs-rf removed
Resolution: duplicate
Status: newclosed

This is essentially a duplicate of #5768, although the title of this one is a bit misleading. So I'll close this in favour of the latter.

We can probably work on this after queryset-refactor is completed, since it's not a blocker for that branch and not important enough to hold up the branch for.

By the way, the initial assumption in the description is a bit of a misnomer. We always select columns by name and, in practice, selecting all columns or just a few doesn't change query execution time much at all. Table joins, where-filtering, ordering and grouping are by far the biggest time sinks when conducting a complex query (and for simple queries, they're so fast it doesn't matter anyway). However, the overhead of constructing a Python object (and Django model) for each returned row isn't zero, so values() has some benefit and that's why it's worth implementing this eventually.

Note: See TracTickets for help on using tickets.
Back to Top