Opened 5 years ago

Closed 5 weeks ago

#16735 closed New feature (wontfix)

QuerySet.values() should be aliasable

Reported by: alex.latchford@… Owned by: Ian Foote
Component: Database layer (models, ORM) Version: master
Severity: Normal Keywords: queryset, alias, values
Cc: django@…, Ben Davis Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description

Student.objects.all().values('name', 'mother__name', 'class__teacher__name')
> QD {'name': 'Freddie', 'mother__name': 'Helen', 'class__teacher__name': 'Mr Williams'}

This sort of query doesn't quite demonstrate the problem fully, but it's the best example I can think of for now. Essentially I'd like to be able to alias these values in such a fashion that they are easily distinguishable or able to be shortened..

I envisage something like this..

Student.objects.all().values(student_name='name', mother_name='mother__name', teacher_name='class__teacher__name')
> QD {'student_name': 'Freddie', 'mother_name': 'Helen', 'teacher_name': 'Mr Williams'}

You should also be able to leave arguments if the standard name will suffice..

Many thanks,
Alex

Attachments (2)

column_alias.diff (33.0 KB) - added by Nate Bragg 5 years ago.
16735-2.patch (33.0 KB) - added by Nate Bragg 5 years ago.
Corrected an oversight when aliasing columns - previous patch would alias columns generically when it didn't need to.

Download all attachments as: .zip

Change History (28)

comment:1 Changed 5 years ago by Aymeric Augustin

Needs documentation: unset
Needs tests: unset
Patch needs improvement: unset
Triage Stage: UnreviewedAccepted

Interesting idea. From a quick inspection of the source (ValuesQuerySet) this looks doable.

Changed 5 years ago by Nate Bragg

Attachment: column_alias.diff added

comment:2 Changed 5 years ago by Nate Bragg

Has patch: set
Owner: changed from nobody to Nate Bragg
Status: newassigned

This patch is the most direct way I could see of adding this feature.

It passes all run tests in ./runtests.py --settings=test_sqlite

I tried to be careful, but if it is missing anything, I'd be happy to take another look at it.

comment:3 Changed 5 years ago by Łukasz Rekucki

Patch needs improvement: set

While adding this to values() is perfectly fine, doing the same with values_list() feels weird. Also, it's buggy:

# from ValuesListQuerySet in the patch
fields = list(self._fields) + self._aliased_fields.keys()

This makes the order of values in returned tuples depend on order of dictionary keys, which is undefined. Just try:

print Model.objects.values_list(a2="field", a3="other_field")
# on Python 2.7 and 3.2 values should be reversed.
Last edited 5 years ago by Łukasz Rekucki (previous) (diff)

Changed 5 years ago by Nate Bragg

Attachment: 16735-2.patch added

Corrected an oversight when aliasing columns - previous patch would alias columns generically when it didn't need to.

comment:4 Changed 5 years ago by Nate Bragg

Triage Stage: AcceptedDesign decision needed

Ostensibly, I agree with you - it's a bit odd, and I don't presume to know why one would use it. Initially, when I added that to values_list(), it was to "harmonize" it with the features added to values(). However, similar behavior results when you annotate or aggregate on a values_list() call - the aliased names are added in the order of the keys.

So, I would propose one of three solutions:

  1. Remove aliased names in the values_list() call entirely, replaced with a dummy empty dictionary.
  2. Force the sorting of the aliases, presumably alphabetically; this would also suggest repairing this defect with annotations and aggregations, which may not be quite so simple.
  3. Leave as is in my updated patch, with indeterminate order returned.

I will be happy to implement which ever one seems best to a core developer. As such, I'm setting this to DDN.

comment:5 Changed 5 years ago by Anssi Kääriäinen

What is the rationale of doing this for values_list()? If I am not mistaken, you will get a list back anyways. So the aliases are used for what?

The .annotate() + values_list() seems to be a bug. Basically, multiple annotates in one .annotate() call have indeterminate order, which results in indeterminate order in the values_list(). I bet some users will be hit if/when Python randomizes the hashing algorithm. So, indeterminate order is not good. Unfortunately I don't see a fix other than disallowing multiple annotates in one call. Which is backwards incompatible.

In addition, some benchmark that this doesn't slow down fetching large sets of objects from the DB is needed. .values() is mostly an optimization, so it should remain as fast as possible. Maybe django-bench contains some benchmark already?

comment:6 Changed 4 years ago by Ulrich Petri

Cc: django@… added

I would disagree with the observation that .values() is just an optimization. In combination with .annotate() it is vital if you need to annotate / group by multiple fields. This is also one of the use cases where the missing aliases are most painful.

Example:

>>> Region.objects.values(
    "name", 
    "orders__items__product__category__parent__name"
).annotate(quantity = Sum("orders__items__quantity"))
[
    {'quantity': 10, 'name': u'...', 'orders__items__product__category__parent__name': u'Something'},
    {'quantity': 20, 'name': u'...', 'orders__items__product__category__parent__name': u'Something else'},
]

comment:7 Changed 4 years ago by jrs_66@…

I'm guessing that the seemingly trivial, and widely used, concept of aliasing values() lists is still not possible? Ever try a union of self joined tables? Is there any hope of this being put in place in the future?

comment:8 in reply to:  6 ; Changed 4 years ago by Karthik

Replying to UloPe:

I would disagree with the observation that .values() is just an optimization. In combination with .annotate() it is vital if you need to annotate / group by multiple fields. This is also one of the use cases where the missing aliases are most painful.

Example:

>>> Region.objects.values(
    "name", 
    "orders__items__product__category__parent__name"
).annotate(quantity = Sum("orders__items__quantity"))
[
    {'quantity': 10, 'name': u'...', 'orders__items__product__category__parent__name': u'Something'},
    {'quantity': 20, 'name': u'...', 'orders__items__product__category__parent__name': u'Something else'},
]

I agree. I'm reading this thread because I have this exact same problem at the moment.

comment:9 Changed 4 years ago by Aymeric Augustin

Triage Stage: Design decision neededAccepted

Unless I missed something, there isn't any objection to adding this feature to values.

comment:10 in reply to:  8 Changed 3 years ago by erik.telepovsky

Replying to Karthik:

Replying to UloPe:

I would disagree with the observation that .values() is just an optimization. In combination with .annotate() it is vital if you need to annotate / group by multiple fields. This is also one of the use cases where the missing aliases are most painful.

Example:

>>> Region.objects.values(
    "name", 
    "orders__items__product__category__parent__name"
).annotate(quantity = Sum("orders__items__quantity"))
[
    {'quantity': 10, 'name': u'...', 'orders__items__product__category__parent__name': u'Something'},
    {'quantity': 20, 'name': u'...', 'orders__items__product__category__parent__name': u'Something else'},
]

I agree. I'm reading this thread because I have this exact same problem at the moment.

I agree as well. It is important to have alias functionality in values() method.

comment:11 Changed 3 years ago by Anssi Kääriäinen

Quick observation: it might be better to have the aliasing feature as separate queryset operation:

Region.objects.alias(
    orders__items__product__category__parent__name="parentname"
).values(
    "name", "parentname"
).annotate(quantity=Sum("orders__items__quantity"))

Why this way? There are a couple other places where aliasing could be useful. For example in multijoin situations you could explicitly define if you want to filter the same join or different join by using the same alias or two different aliases (currently the way is "if filtered in same .filter() condition, then same joins, else different joins). Also, this way you might be able to inject extra SQL directly into the query:

Region.objects.alias(
    somesql=RawSQL("case when somecol > 0 then 1 else -1 end")
).order_by('somesql')

Of course, this ticket should not try to do more than just the bare minimum to get the aliasing to work for .values(). The point is aliasing could be useful in other operations, too, so lets be prepared for that.

comment:12 in reply to:  9 Changed 3 years ago by Wraithan

Replying to aaugustin:

Unless I missed something, there isn't any objection to adding this feature to values.

So my understanding is that this ticket is being held back by touching values_list as well as values. So a patch only affecting values (as well as docs and tests of course) is what is needed to get accepted?

comment:13 Changed 3 years ago by Russell Keith-Magee

@Wraithan As far as I can make out, yes. I can't see any benefit to having API consistency between values() and values_list() - if only because values_list() already accepts a keyword argument, which introduces a whole world of pain in the API (what if you want an alias called flat?).

The "values() is an optimisation" argument doesn't hold water for me. Yes, it's an optimisation. The source of the optimisation is passing less information on the wire during the database query (i.e., only returning two of 15 fields). An alias is already being used for this operation - it just isn't user specified. Making it user specified is a single dictionary lookup in a couple of key locations is a minor change in implementation with huge usability benefits.

So - clean up the patch and we can get this into trunk.

comment:14 Changed 3 years ago by Marc Tamlyn

Agreed this should be possible.

As a side note Russ - I've found the main optimisation for using values was in fact avoiding calling Model.__init__ a few thousand times, not the lack of data on the wire.

comment:15 Changed 3 years ago by django@…

I've been wanting something like this for a while. Just another use case (when sending emails with pre-defined, admin-editable text.

text = Settings.objects.get(name='email_text').value  # text is something like "Hi {name}, you are {age} years old. You work at {job}"

email_texts = [text.format(**kw) for kw in Person.objects.all().values(name='name', age='age', job="job__name")]

comment:16 Changed 3 years ago by Ben Davis

Cc: Ben Davis added
Version: 1.3master

I've created a branch on github for a fix against the latest version of django: https://github.com/bendavis78/django/tree/issues/16735

The patch can be found here:
https://github.com/bendavis78/django/commit/cdff83bed850a631e7c6d3cb12359b9b1d3e9bc4

This patch only changes values() and not values_list().

comment:17 Changed 2 years ago by paveluc.alexandr@…

Will this patch be included in next Django release?

comment:18 Changed 2 years ago by Simon Charette

Unfortunately this feature didn't make it to the 1.7.x branch before it was feature frozen.

comment:19 in reply to:  18 Changed 21 months ago by Abdulhaq Emhemmed

Replying to charettes:

Unfortunately this feature didn't make it to the 1.7.x branch before it was feature frozen.

Since the feature freeze deadline for v1.8 is approaching, will this be added to 1.8??

I tried @bendavis78 patch and it works fine!

comment:20 Changed 21 months ago by Tim Graham

The patch likely needs to be updated to apply cleanly and then we need a pull request so the patch can be reviewed.

comment:21 Changed 21 months ago by Josh Smeaton

Just quickly, with the changes to annotate that have landed, it is now possible to create aliases yourself, and reference them from the values call:

Model.objects.annotate(my_alias=F('some__long__name__to__alias')).values('my_alias')

This is fairly similar to the suggestion Anssi made upthread of using a new method alias(). Not sure if this would be preferable though, as it is more verbose and requires an extra round of queryset copying.

comment:22 Changed 10 months ago by Tim Graham

Owner: Nate Bragg deleted
Status: assignednew
Summary: Queryset values should be aliasableQuerySet.values() should be aliasable

comment:23 Changed 10 months ago by Tim Graham

Related to "Allow expressions in values() queryset method (#25871)" and django-developers discussion.

comment:24 Changed 6 weeks ago by Ian Foote

Owner: set to Ian Foote
Status: newassigned

comment:25 Changed 6 weeks ago by Tim Graham

PR with some comments for improvement.

comment:26 Changed 5 weeks ago by Ian Foote

Resolution: wontfix
Status: assignedclosed

Following the further discussion on django-developers, I think the consensus is not to support aliasing beyond an explicit .values(alias=F('field')) as supported since #25871 was fixed. I'm therefore updating this to wontfix.

Note: See TracTickets for help on using tickets.
Back to Top