Opened 4 years ago

Last modified 4 months ago

#16735 assigned New feature

Queryset values should be aliasable

Reported by: alex.latchford@… Owned by: nate_b
Component: Database layer (models, ORM) Version: master
Severity: Normal Keywords: queryset, alias, values
Cc: django@…, bendavis78 Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description

Student.objects.all().values('name', 'mother__name', 'class__teacher__name')
> QD {'name': 'Freddie', 'mother__name': 'Helen', 'class__teacher__name': 'Mr Williams'}

This sort of query doesn't quite demonstrate the problem fully, but it's the best example I can think of for now. Essentially I'd like to be able to alias these values in such a fashion that they are easily distinguishable or able to be shortened..

I envisage something like this..

Student.objects.all().values(student_name='name', mother_name='mother__name', teacher_name='class__teacher__name')
> QD {'student_name': 'Freddie', 'mother_name': 'Helen', 'teacher_name': 'Mr Williams'}

You should also be able to leave arguments if the standard name will suffice..

Many thanks,
Alex

Attachments (2)

column_alias.diff (33.0 KB) - added by nate_b 3 years ago.
16735-2.patch (33.0 KB) - added by nate_b 3 years ago.
Corrected an oversight when aliasing columns - previous patch would alias columns generically when it didn't need to.

Download all attachments as: .zip

Change History (23)

comment:1 Changed 4 years ago by aaugustin

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Triage Stage changed from Unreviewed to Accepted

Interesting idea. From a quick inspection of the source (ValuesQuerySet) this looks doable.

Changed 3 years ago by nate_b

comment:2 Changed 3 years ago by nate_b

  • Has patch set
  • Owner changed from nobody to nate_b
  • Status changed from new to assigned

This patch is the most direct way I could see of adding this feature.

It passes all run tests in ./runtests.py --settings=test_sqlite

I tried to be careful, but if it is missing anything, I'd be happy to take another look at it.

comment:3 Changed 3 years ago by lrekucki

  • Patch needs improvement set

While adding this to values() is perfectly fine, doing the same with values_list() feels weird. Also, it's buggy:

# from ValuesListQuerySet in the patch
fields = list(self._fields) + self._aliased_fields.keys()

This makes the order of values in returned tuples depend on order of dictionary keys, which is undefined. Just try:

print Model.objects.values_list(a2="field", a3="other_field")
# on Python 2.7 and 3.2 values should be reversed.
Last edited 3 years ago by lrekucki (previous) (diff)

Changed 3 years ago by nate_b

Corrected an oversight when aliasing columns - previous patch would alias columns generically when it didn't need to.

comment:4 Changed 3 years ago by nate_b

  • Triage Stage changed from Accepted to Design decision needed

Ostensibly, I agree with you - it's a bit odd, and I don't presume to know why one would use it. Initially, when I added that to values_list(), it was to "harmonize" it with the features added to values(). However, similar behavior results when you annotate or aggregate on a values_list() call - the aliased names are added in the order of the keys.

So, I would propose one of three solutions:

  1. Remove aliased names in the values_list() call entirely, replaced with a dummy empty dictionary.
  2. Force the sorting of the aliases, presumably alphabetically; this would also suggest repairing this defect with annotations and aggregations, which may not be quite so simple.
  3. Leave as is in my updated patch, with indeterminate order returned.

I will be happy to implement which ever one seems best to a core developer. As such, I'm setting this to DDN.

comment:5 Changed 3 years ago by akaariai

What is the rationale of doing this for values_list()? If I am not mistaken, you will get a list back anyways. So the aliases are used for what?

The .annotate() + values_list() seems to be a bug. Basically, multiple annotates in one .annotate() call have indeterminate order, which results in indeterminate order in the values_list(). I bet some users will be hit if/when Python randomizes the hashing algorithm. So, indeterminate order is not good. Unfortunately I don't see a fix other than disallowing multiple annotates in one call. Which is backwards incompatible.

In addition, some benchmark that this doesn't slow down fetching large sets of objects from the DB is needed. .values() is mostly an optimization, so it should remain as fast as possible. Maybe django-bench contains some benchmark already?

comment:6 follow-up: Changed 3 years ago by UloPe

  • Cc django@… added

I would disagree with the observation that .values() is just an optimization. In combination with .annotate() it is vital if you need to annotate / group by multiple fields. This is also one of the use cases where the missing aliases are most painful.

Example:

>>> Region.objects.values(
    "name", 
    "orders__items__product__category__parent__name"
).annotate(quantity = Sum("orders__items__quantity"))
[
    {'quantity': 10, 'name': u'...', 'orders__items__product__category__parent__name': u'Something'},
    {'quantity': 20, 'name': u'...', 'orders__items__product__category__parent__name': u'Something else'},
]

comment:7 Changed 3 years ago by jrs_66@…

I'm guessing that the seemingly trivial, and widely used, concept of aliasing values() lists is still not possible? Ever try a union of self joined tables? Is there any hope of this being put in place in the future?

comment:8 in reply to: ↑ 6 ; follow-up: Changed 2 years ago by Karthik

Replying to UloPe:

I would disagree with the observation that .values() is just an optimization. In combination with .annotate() it is vital if you need to annotate / group by multiple fields. This is also one of the use cases where the missing aliases are most painful.

Example:

>>> Region.objects.values(
    "name", 
    "orders__items__product__category__parent__name"
).annotate(quantity = Sum("orders__items__quantity"))
[
    {'quantity': 10, 'name': u'...', 'orders__items__product__category__parent__name': u'Something'},
    {'quantity': 20, 'name': u'...', 'orders__items__product__category__parent__name': u'Something else'},
]

I agree. I'm reading this thread because I have this exact same problem at the moment.

comment:9 follow-up: Changed 2 years ago by aaugustin

  • Triage Stage changed from Design decision needed to Accepted

Unless I missed something, there isn't any objection to adding this feature to values.

comment:10 in reply to: ↑ 8 Changed 2 years ago by erik.telepovsky

Replying to Karthik:

Replying to UloPe:

I would disagree with the observation that .values() is just an optimization. In combination with .annotate() it is vital if you need to annotate / group by multiple fields. This is also one of the use cases where the missing aliases are most painful.

Example:

>>> Region.objects.values(
    "name", 
    "orders__items__product__category__parent__name"
).annotate(quantity = Sum("orders__items__quantity"))
[
    {'quantity': 10, 'name': u'...', 'orders__items__product__category__parent__name': u'Something'},
    {'quantity': 20, 'name': u'...', 'orders__items__product__category__parent__name': u'Something else'},
]

I agree. I'm reading this thread because I have this exact same problem at the moment.

I agree as well. It is important to have alias functionality in values() method.

comment:11 Changed 2 years ago by akaariai

Quick observation: it might be better to have the aliasing feature as separate queryset operation:

Region.objects.alias(
    orders__items__product__category__parent__name="parentname"
).values(
    "name", "parentname"
).annotate(quantity=Sum("orders__items__quantity"))

Why this way? There are a couple other places where aliasing could be useful. For example in multijoin situations you could explicitly define if you want to filter the same join or different join by using the same alias or two different aliases (currently the way is "if filtered in same .filter() condition, then same joins, else different joins). Also, this way you might be able to inject extra SQL directly into the query:

Region.objects.alias(
    somesql=RawSQL("case when somecol > 0 then 1 else -1 end")
).order_by('somesql')

Of course, this ticket should not try to do more than just the bare minimum to get the aliasing to work for .values(). The point is aliasing could be useful in other operations, too, so lets be prepared for that.

comment:12 in reply to: ↑ 9 Changed 16 months ago by Wraithan

Replying to aaugustin:

Unless I missed something, there isn't any objection to adding this feature to values.

So my understanding is that this ticket is being held back by touching values_list as well as values. So a patch only affecting values (as well as docs and tests of course) is what is needed to get accepted?

comment:13 Changed 16 months ago by russellm

@Wraithan As far as I can make out, yes. I can't see any benefit to having API consistency between values() and values_list() - if only because values_list() already accepts a keyword argument, which introduces a whole world of pain in the API (what if you want an alias called flat?).

The "values() is an optimisation" argument doesn't hold water for me. Yes, it's an optimisation. The source of the optimisation is passing less information on the wire during the database query (i.e., only returning two of 15 fields). An alias is already being used for this operation - it just isn't user specified. Making it user specified is a single dictionary lookup in a couple of key locations is a minor change in implementation with huge usability benefits.

So - clean up the patch and we can get this into trunk.

comment:14 Changed 16 months ago by mjtamlyn

Agreed this should be possible.

As a side note Russ - I've found the main optimisation for using values was in fact avoiding calling Model.__init__ a few thousand times, not the lack of data on the wire.

comment:15 Changed 14 months ago by django@…

I've been wanting something like this for a while. Just another use case (when sending emails with pre-defined, admin-editable text.

text = Settings.objects.get(name='email_text').value  # text is something like "Hi {name}, you are {age} years old. You work at {job}"

email_texts = [text.format(**kw) for kw in Person.objects.all().values(name='name', age='age', job="job__name")]

comment:16 Changed 13 months ago by bendavis78

  • Cc bendavis78 added
  • Version changed from 1.3 to master

I've created a branch on github for a fix against the latest version of django: https://github.com/bendavis78/django/tree/issues/16735

The patch can be found here:
https://github.com/bendavis78/django/commit/cdff83bed850a631e7c6d3cb12359b9b1d3e9bc4

This patch only changes values() and not values_list().

comment:17 Changed 12 months ago by paveluc.alexandr@…

Will this patch be included in next Django release?

comment:18 follow-up: Changed 12 months ago by charettes

Unfortunately this feature didn't make it to the 1.7.x branch before it was feature frozen.

comment:19 in reply to: ↑ 18 Changed 4 months ago by abdulhaq-e

Replying to charettes:

Unfortunately this feature didn't make it to the 1.7.x branch before it was feature frozen.

Since the feature freeze deadline for v1.8 is approaching, will this be added to 1.8??

I tried @bendavis78 patch and it works fine!

comment:20 Changed 4 months ago by timgraham

The patch likely needs to be updated to apply cleanly and then we need a pull request so the patch can be reviewed.

comment:21 Changed 4 months ago by jarshwah

Just quickly, with the changes to annotate that have landed, it is now possible to create aliases yourself, and reference them from the values call:

Model.objects.annotate(my_alias=F('some__long__name__to__alias')).values('my_alias')

This is fairly similar to the suggestion Anssi made upthread of using a new method alias(). Not sure if this would be preferable though, as it is more verbose and requires an extra round of queryset copying.

Note: See TracTickets for help on using tickets.
Back to Top