Code

Opened 7 years ago

Closed 3 years ago

#5768 closed (fixed)

Allow QuerySet.values() to return values spanning joins (for multi-valued relations)

Reported by: anonymous <tobutaz+bugs@…> Owned by:
Component: Database layer (models, ORM) Version: master
Severity: Keywords: values, values_list, one-to-many, many-to-many reverse
Cc: trevor@…, miracle2k, mrmachine Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: UI/UX:

Description

I'd like to write:

class Tag(models.Model):
    name = models.CharField(maxlength=50)

class Thing(models.Model):
    tags = models.ManyToManyKey(Tag)

Thing.objects.filter(tags__name).values('tags.name').distinct()

My use case is slightly more complicated (I use django-tagging and the GenericRelation does not make things simpler).

Of course I could do it manually with a loop, but it is necessary to have a queryset for some things, such as using a generic view.

Attachments (2)

5768-m2m-values-r14618.diff (9.6 KB) - added by mrmachine 3 years ago.
5768-m2m-values-r14639.diff (10.6 KB) - added by mrmachine 3 years ago.

Download all attachments as: .zip

Change History (27)

comment:1 Changed 7 years ago by anonymous <tobutaz+bugs@…>

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset

Here is the use case with django-tagging:

    e_t = ContentType.objects.get_for_model(Entity)
    q_s = TaggedItem.objects.filter(content_type=e_t).extra(tables=['tag'], where=['tagged_item.tag_id=tag.id']).values('tag__name__exact').distinct()

comment:2 Changed 7 years ago by mtredinnick

  • Keywords qs-rf added
  • Triage Stage changed from Unreviewed to Accepted

Probably not a completely crazy idea and not too hard to do in queryset-refactor now.

comment:3 Changed 7 years ago by mtredinnick

  • Component changed from Uncategorized to Database wrapper

comment:4 Changed 6 years ago by mtredinnick

  • Keywords qs-rf removed

I've marked #3358 as a duplicate of this. We can implement it eventually, but probably not as part of the queryset-refactor work, since it's not really a show-stopper at the moment.

A couple of thoughts on the implementation:

  1. we should use foo__bar as the format for specifying the names, so that it's consistent with filtering and the new cross-model order_by() syntax.
  2. If we allow many-to-many or reverse one-to-many relations to be specified, we should restrict it to a maximum of one such relation in each values() call. The problem here is one of efficiency: if relation 1 has n1 rows and relation 2 has n2 rows in their result sets, doing a single, non-union query will result in n1*n2 rows being returned. Not cool if n1 and n2 are something like 100 each. If we do a union query, it's not more efficient than doing two separate queries, except we'll have to pay some computation penalty inside Django every time to work out when this is needed. So we don't permit that and the few cases where it's needed can be done with separate queries and dictionary updates.

So I'm going to remove the qs-rf keyword, but any patch for this should wait until after that branch is merged, since it will touch stuff that bears no resemblance to the current trunk code.

comment:5 Changed 6 years ago by trevor

  • Cc trevor@… added

comment:6 Changed 6 years ago by dogada

  • Owner changed from nobody to dogada
  • Status changed from new to assigned

comment:7 Changed 6 years ago by dogada

  • Has patch set
  • Resolution set to fixed
  • Status changed from assigned to closed

Support of related models in QuerySet.values() was implemented as part of ticket #5420. See following patch:
http://code.djangoproject.com/attachment/ticket/5420/queryset_fields_trunk.diff

Patch changes signature of this method from values(*fields) to values(*fields, **related_fields) but the change is backward compatible.

See more details about this patch at: http://www.mysoftparade.com/blog/django_orm_performance_patch/

comment:8 Changed 6 years ago by mtredinnick

  • Resolution fixed deleted
  • Status changed from closed to reopened

This isn't fixed. A ticket is fixed when code for it has been committed to the appropriate branch in subversion.

Also, this isn't related to #5420 (your approach there is mixing multiple things). Reopening.

comment:9 Changed 6 years ago by mtredinnick

Also, for anybody working on this, please see the end of comment 4: patches should either be against queryset-refactor or wait until after that is merged into trunk. The current ValuesQuerySet implementation is quite different in the new code.

comment:10 Changed 6 years ago by dogada

  • Owner dogada deleted
  • Status changed from reopened to new

comment:11 Changed 6 years ago by mtredinnick

(In [7230]) queryset-refactor: Refactored the way values() works so that it works properly
across inherited models.

Completely by accident, this also allows values() queries to include fields
from related models, providing it is crossing a single-valued relation
(one-to-one, many-to-one). Many-to-many values() fields still aren't supported,
since that requires actual thinking. So this refs #5768.

comment:12 Changed 6 years ago by mtredinnick

  • Summary changed from Allow QuerySet.values() to return values spanning joins. to Allow QuerySet.values() to return values spanning joins (for multi-valued relations)

comment:13 Changed 5 years ago by mrts

  • milestone set to 1.2

Tentatively pushing to 1.2 as this lines up with defer() and only(), covering the missing ground.

comment:14 Changed 5 years ago by mtredinnick

  • milestone 1.2 deleted

Removing milestone, lest people become confused by thinking there's any commitment to this in any particular timeframe. It will be committed when it's done, whenever that might be.

comment:15 Changed 5 years ago by miracle2k

  • Cc miracle2k added

comment:16 Changed 3 years ago by mrmachine

  • Cc mrmachine added

Adding a milestone might also prompt someone to work on or review this ticket within the next 1.5 years or major release cycle.

Changed 3 years ago by mrmachine

comment:17 Changed 3 years ago by mrmachine

  • Keywords values, values_list, one-to-many, many-to-many reverse added
  • Needs documentation set
  • Version changed from 0.96 to SVN

The patch I've just uploaded doesn't add any new functionality, but simply exposes the existing functionality with an allow_m2m argument on values() and values_list(). It should be completely backwards compatible, and I've added tests for forward relations (many-to-one) which were already possibly (but missing tests, unless I just couldn't find them) as well as reverse relations (one-to-many and many-to-many).

I ran the entire test suite with allow_m2m=True by default without any breakage, but I've left it as allow_m2m=False by default in the patch just to make sure that users aren't accidentally surprised when their call to values() returns more results than there are in the original queryset due to the reverse relations, by requiring users to actively specify that they want to lookup values on these relations.

I still need to add docs to the patch, but would like to hear feedback before I do so about any concerns that anyone may have with the implementation.

comment:18 Changed 3 years ago by carljm

This looks pretty good to me, though I'm not entirely sold on allowing for combinatorial explosions.

It seems to me that if we implement Malcolm's suggestion of limiting it to a single m2m relation per query, we could probably remove the new allow_m2m keyword arg altogether; it's not backwards-incompatible to have some additional fieldnames work that didn't before. Without the limit, I think the allow_m2m arg is necessary, just to force people to look at the docs and see the giant red warning that "you might be in for a combinatorial explosion if you use this."

I'd like to hear from some others on this point -- could you bring it up on the -developers mailing list?

Re the patch: the copy-pasted extra-kwargs TypeError in .values() still says "values_list" in its text. Also, there are already several tests for forward relations in .values(): some incidental, the one specifically testing it is in modeltests/many_to_one. However, I'm still not opposed to adding tests for that here; it seems best if test_values() actually tests the full range of .values() functionality.

comment:19 Changed 3 years ago by carljm

Conclusion from IRC discussion with Russell and mrmachine is that combinatorial explosions are already possible in the ORM, and will only happen here under certain circumstances of dataset size and fields used in combination, so it's acceptable to eliminate the allow_m2m kwarg and just allow (any number of) m2ms in values/values_list, with a note in the docs about the possible explosion in returned dataset size.

comment:20 Changed 3 years ago by mrmachine

  • Needs documentation unset

Updated the patch to allow m2m by default, with docs.

comment:21 Changed 3 years ago by russellm

  • Triage Stage changed from Accepted to Ready for checkin

comment:22 Changed 3 years ago by carljm

  • Patch needs improvement set
  • Triage Stage changed from Ready for checkin to Accepted

I'm getting failures in the new tests on Postgres; appears to be a unicode/str issue. Shouldn't be too hard to fix, don't have time to dig into it at the moment.

Changed 3 years ago by mrmachine

comment:23 Changed 3 years ago by mrmachine

  • Patch needs improvement unset
  • Triage Stage changed from Accepted to Ready for checkin

Apologies, I should have ran the tests for postgresql and mysql as well. The problem was that assertQuerysetEqual is dependant on the order of results, and with no explicit ordering applied the results were different in postgresql. I've updated the tests (and also fixed another typo I missed earlier) and rerun the lookup tests on sqlite, postgresql and mysql.

comment:24 Changed 3 years ago by mrmachine

Full test suite for sqlite and postgresql just passed, too. I didn't do mysql after I found out it was going to take 10+ hours or have a bunch of other failures.

comment:25 Changed 3 years ago by carljm

  • Resolution set to fixed
  • Status changed from new to closed

(In [14655]) Fixed #5768 -- Added support for ManyToManyFields and reverse relations in values() and values_list(). Thanks to mrmachine for the patch.

Add Comment

Modify Ticket

Change Properties
<Author field>
Action
as closed
as The resolution will be set. Next status will be 'closed'
The resolution will be deleted. Next status will be 'new'
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.