Opened 16 years ago

Closed 13 years ago

#5768 closed (fixed)

Allow QuerySet.values() to return values spanning joins (for multi-valued relations)

Reported by: anonymous <tobutaz+bugs@…> Owned by:
Component: Database layer (models, ORM) Version: dev
Severity: Keywords: values, values_list, one-to-many, many-to-many reverse
Cc: trevor@…, miracle2k, Tai Lee Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

I'd like to write:

class Tag(models.Model):
    name = models.CharField(maxlength=50)

class Thing(models.Model):
    tags = models.ManyToManyKey(Tag)

Thing.objects.filter(tags__name).values('tags.name').distinct()

My use case is slightly more complicated (I use django-tagging and the GenericRelation does not make things simpler).

Of course I could do it manually with a loop, but it is necessary to have a queryset for some things, such as using a generic view.

Attachments (2)

5768-m2m-values-r14618.diff (9.6 KB ) - added by Tai Lee 13 years ago.
5768-m2m-values-r14639.diff (10.6 KB ) - added by Tai Lee 13 years ago.

Download all attachments as: .zip

Change History (27)

comment:1 by anonymous <tobutaz+bugs@…>, 16 years ago

Here is the use case with django-tagging:

    e_t = ContentType.objects.get_for_model(Entity)
    q_s = TaggedItem.objects.filter(content_type=e_t).extra(tables=['tag'], where=['tagged_item.tag_id=tag.id']).values('tag__name__exact').distinct()

comment:2 by Malcolm Tredinnick, 16 years ago

Keywords: qs-rf added
Triage Stage: UnreviewedAccepted

Probably not a completely crazy idea and not too hard to do in queryset-refactor now.

comment:3 by Malcolm Tredinnick, 16 years ago

Component: UncategorizedDatabase wrapper

comment:4 by Malcolm Tredinnick, 16 years ago

Keywords: qs-rf removed

I've marked #3358 as a duplicate of this. We can implement it eventually, but probably not as part of the queryset-refactor work, since it's not really a show-stopper at the moment.

A couple of thoughts on the implementation:

  1. we should use foo__bar as the format for specifying the names, so that it's consistent with filtering and the new cross-model order_by() syntax.
  2. If we allow many-to-many or reverse one-to-many relations to be specified, we should restrict it to a maximum of one such relation in each values() call. The problem here is one of efficiency: if relation 1 has n1 rows and relation 2 has n2 rows in their result sets, doing a single, non-union query will result in n1*n2 rows being returned. Not cool if n1 and n2 are something like 100 each. If we do a union query, it's not more efficient than doing two separate queries, except we'll have to pay some computation penalty inside Django every time to work out when this is needed. So we don't permit that and the few cases where it's needed can be done with separate queries and dictionary updates.

So I'm going to remove the qs-rf keyword, but any patch for this should wait until after that branch is merged, since it will touch stuff that bears no resemblance to the current trunk code.

comment:5 by Trevor Caira, 16 years ago

Cc: trevor@… added

comment:6 by dogada, 16 years ago

Owner: changed from nobody to dogada
Status: newassigned

comment:7 by dogada, 16 years ago

Has patch: set
Resolution: fixed
Status: assignedclosed

Support of related models in QuerySet.values() was implemented as part of ticket #5420. See following patch:
http://code.djangoproject.com/attachment/ticket/5420/queryset_fields_trunk.diff

Patch changes signature of this method from values(*fields) to values(*fields, **related_fields) but the change is backward compatible.

See more details about this patch at: http://www.mysoftparade.com/blog/django_orm_performance_patch/

comment:8 by Malcolm Tredinnick, 16 years ago

Resolution: fixed
Status: closedreopened

This isn't fixed. A ticket is fixed when code for it has been committed to the appropriate branch in subversion.

Also, this isn't related to #5420 (your approach there is mixing multiple things). Reopening.

comment:9 by Malcolm Tredinnick, 16 years ago

Also, for anybody working on this, please see the end of comment 4: patches should either be against queryset-refactor or wait until after that is merged into trunk. The current ValuesQuerySet implementation is quite different in the new code.

comment:10 by dogada, 16 years ago

Owner: dogada removed
Status: reopenednew

comment:11 by Malcolm Tredinnick, 16 years ago

(In [7230]) queryset-refactor: Refactored the way values() works so that it works properly
across inherited models.

Completely by accident, this also allows values() queries to include fields
from related models, providing it is crossing a single-valued relation
(one-to-one, many-to-one). Many-to-many values() fields still aren't supported,
since that requires actual thinking. So this refs #5768.

comment:12 by Malcolm Tredinnick, 15 years ago

Summary: Allow QuerySet.values() to return values spanning joins.Allow QuerySet.values() to return values spanning joins (for multi-valued relations)

comment:13 by mrts, 15 years ago

milestone: 1.2

Tentatively pushing to 1.2 as this lines up with defer() and only(), covering the missing ground.

comment:14 by Malcolm Tredinnick, 15 years ago

milestone: 1.2

Removing milestone, lest people become confused by thinking there's any commitment to this in any particular timeframe. It will be committed when it's done, whenever that might be.

comment:15 by miracle2k, 15 years ago

Cc: miracle2k added

comment:16 by Tai Lee, 13 years ago

Cc: Tai Lee added

Adding a milestone might also prompt someone to work on or review this ticket within the next 1.5 years or major release cycle.

by Tai Lee, 13 years ago

Attachment: 5768-m2m-values-r14618.diff added

comment:17 by Tai Lee, 13 years ago

Keywords: values values_list one-to-many many-to-many reverse added
Needs documentation: set
Version: 0.96SVN

The patch I've just uploaded doesn't add any new functionality, but simply exposes the existing functionality with an allow_m2m argument on values() and values_list(). It should be completely backwards compatible, and I've added tests for forward relations (many-to-one) which were already possibly (but missing tests, unless I just couldn't find them) as well as reverse relations (one-to-many and many-to-many).

I ran the entire test suite with allow_m2m=True by default without any breakage, but I've left it as allow_m2m=False by default in the patch just to make sure that users aren't accidentally surprised when their call to values() returns more results than there are in the original queryset due to the reverse relations, by requiring users to actively specify that they want to lookup values on these relations.

I still need to add docs to the patch, but would like to hear feedback before I do so about any concerns that anyone may have with the implementation.

comment:18 by Carl Meyer, 13 years ago

This looks pretty good to me, though I'm not entirely sold on allowing for combinatorial explosions.

It seems to me that if we implement Malcolm's suggestion of limiting it to a single m2m relation per query, we could probably remove the new allow_m2m keyword arg altogether; it's not backwards-incompatible to have some additional fieldnames work that didn't before. Without the limit, I think the allow_m2m arg is necessary, just to force people to look at the docs and see the giant red warning that "you might be in for a combinatorial explosion if you use this."

I'd like to hear from some others on this point -- could you bring it up on the -developers mailing list?

Re the patch: the copy-pasted extra-kwargs TypeError in .values() still says "values_list" in its text. Also, there are already several tests for forward relations in .values(): some incidental, the one specifically testing it is in modeltests/many_to_one. However, I'm still not opposed to adding tests for that here; it seems best if test_values() actually tests the full range of .values() functionality.

comment:19 by Carl Meyer, 13 years ago

Conclusion from IRC discussion with Russell and mrmachine is that combinatorial explosions are already possible in the ORM, and will only happen here under certain circumstances of dataset size and fields used in combination, so it's acceptable to eliminate the allow_m2m kwarg and just allow (any number of) m2ms in values/values_list, with a note in the docs about the possible explosion in returned dataset size.

comment:20 by Tai Lee, 13 years ago

Needs documentation: unset

Updated the patch to allow m2m by default, with docs.

comment:21 by Russell Keith-Magee, 13 years ago

Triage Stage: AcceptedReady for checkin

comment:22 by Carl Meyer, 13 years ago

Patch needs improvement: set
Triage Stage: Ready for checkinAccepted

I'm getting failures in the new tests on Postgres; appears to be a unicode/str issue. Shouldn't be too hard to fix, don't have time to dig into it at the moment.

by Tai Lee, 13 years ago

Attachment: 5768-m2m-values-r14639.diff added

comment:23 by Tai Lee, 13 years ago

Patch needs improvement: unset
Triage Stage: AcceptedReady for checkin

Apologies, I should have ran the tests for postgresql and mysql as well. The problem was that assertQuerysetEqual is dependant on the order of results, and with no explicit ordering applied the results were different in postgresql. I've updated the tests (and also fixed another typo I missed earlier) and rerun the lookup tests on sqlite, postgresql and mysql.

comment:24 by Tai Lee, 13 years ago

Full test suite for sqlite and postgresql just passed, too. I didn't do mysql after I found out it was going to take 10+ hours or have a bunch of other failures.

comment:25 by Carl Meyer, 13 years ago

Resolution: fixed
Status: newclosed

(In [14655]) Fixed #5768 -- Added support for ManyToManyFields and reverse relations in values() and values_list(). Thanks to mrmachine for the patch.

Note: See TracTickets for help on using tickets.
Back to Top