Code

Opened 6 years ago

Closed 6 years ago

#6811 closed (duplicate)

Add functions, expressions and grouping to QuerySet.values()

Reported by: justinf Owned by: justinf
Component: Database layer (models, ORM) Version: queryset-refactor
Severity: Keywords: aggregation grouping
Cc: Triage Stage: Unreviewed
Has patch: yes Needs documentation: no
Needs tests: yes Patch needs improvement: no
Easy pickings: UI/UX:

Description

This proposal is for adding aggregation support to QuerySet.values().

An E object is introduced that represents a query expression, along with several subclasses:

  • F: a field/column
  • Function: an SQL function
  • Aggregate: an SQL aggregation function

The basic syntax for values remains unchanged, field names or E objects are passed as position arguments. Grouping is implicit based on the types of arguments to values(). Non-aggregate fields and expressions are used in the GROUP BY clause. If no non-aggregate fields are passed, values() returns a single result. Functions objects have default names in the output based on the function name and field.

Examples:

Sum field1:

>>> Model.objects.values(Sum('field1'))
[ {'field1_sum': 1000} ]

Group by field1, sum field2:

>>> Model.objects.values('field1', Sum('field2'))
[ {'field1':10, 'field2_sum':1000 },
  {'field1':20, 'field2_sum':1234 } ]

Complex expression with a specified output alias:

>>> Model.objects.values('field1', E(Sum('field2') + Sum('field3'), alias='my_sum'))
[ {'field1':10, 'my_sum':1000 },
  {'field1':20, 'my_sum':1234 } ]

Use of F objects:

>>> Invoice.objects.values('id', E(F('subtotal') + F('tax'), alias='total'))
[ {'id':1, 'total':1000.00 },
  {'id':2, 'total':1234.00 } ]

Note that no grouping happens here since there are no aggregate functions.

Output aliases could possibly be specified via kwargs:

>>> Model.objects.values('field1', my_sum=Sum('field2') + Sum('field3'))
[ {'field1':10, 'my_sum':1000 },
  {'field1':20, 'my_sum':1234 } ]

It would be nice to support grouping by functions as well:

>>> Sale.objects.values(Month('date'), Sum('payment'))
[ {'date_month':1, 'payment_sum':1000.00 },
  {'date_month':2, 'payment_sum':1234.00 } ]

In this case Month is non-aggregating, so it's included in GROUP BY. Also, backends will have to be extended so that non-standard functions work.

User defined function using built-in functions:

>>> def my_func(value_col, quantity_col):
>>>    return Sum(value_col) / Sum(quantity_col)
>>> Product.objects.values('type', avg_price=my_func('price','quantity'))
[ {'type':'widget', 'avg_price':1.00 },
  {'type':'gadget', 'avg_price':2.00 } ]

Aggregation should be supported across relationships:

>>> Customer.objects.values('name', Sum('purchases__payment'))
[ {'name':'John', 'purchases__payment_sum':1000.00 },
  {'name':'Jane', 'purchases__payment_sum':1234.00 } ]

I think this proposal adds minimal new syntax and fits nicely with existing features like Q objects. It's possible that changing the behavior of value() to trigger grouping could be confusing.

This proposal so far only covers adding aggregation to QuerySet.values(). Another function (such as annotate() suggested by Russell) could be added to support aggregation with Models. It might be good to eventually support E objects in QuerySet.filter(). It might also be possible to use the E objects in Models via some sort of ExpressionField (but that's getting far ahead of myself).

Some of this already works in a haskish, fragile way (auto-grouping and simple aggregation functions, but not expressions), but I'm re-working my changes as per Malcolm's suggestions. I'll post a very preliminary patch for review when I get some of his suggestions finished. I'm not married to this proposal, and enjoy the work for itself, so if people think we should a different direction, that's fine.

So, thoughts, suggestions, test cases?

Most recent discussion:
http://groups.google.com/group/django-developers/browse_thread/thread/3d4039161f6d2f5e

Past discussions:
http://groups.google.com/group/django-developers/browse_thread/thread/f433edf7b0ebdbcb/9758e11744381e67[#5420]
http://groups.google.com/group/django-developers/browse_thread/thread/e53a62e1c73871b8/2f91fa217bc465bb?#2f91fa217bc465bb

Another proposal:
http://code.djangoproject.com/ticket/3566

Attachments (4)

qs_value_expressions.diff (4.3 KB) - added by justinf 6 years ago.
Preliminary patch. Not well tested.
functions.py (6.8 KB) - added by justinf 6 years ago.
Expression/Function classes
qs_value_expressions.2.diff (4.8 KB) - added by justinf 6 years ago.
fixed a auto-grouping bug in first patch. added keyword alias support
functions.2.py (6.8 KB) - added by justinf 6 years ago.
fixed auto-grouping bug

Download all attachments as: .zip

Change History (7)

comment:1 Changed 6 years ago by justinf

  • Needs documentation unset
  • Needs tests unset
  • Owner changed from nobody to justinf
  • Patch needs improvement unset
  • Status changed from new to assigned

Changed 6 years ago by justinf

Preliminary patch. Not well tested.

Changed 6 years ago by justinf

Expression/Function classes

comment:2 Changed 6 years ago by justinf

  • Has patch set
  • Needs tests set

I added an initial patch and expression classes.

Table and column names are remapped properly, so you can do aggregation across relations as in the last example. Specifying output aliases as keyword arguments is not yet supported, and general expressions and operators don't have a default output alias, so expressions must be wrapped in an E object with an alias.

Changed 6 years ago by justinf

fixed a auto-grouping bug in first patch. added keyword alias support

Changed 6 years ago by justinf

fixed auto-grouping bug

comment:3 Changed 6 years ago by mtredinnick

  • Resolution set to duplicate
  • Status changed from assigned to closed

As with #6821, this is a duplicate of #3566, which is the original ticket for aggregate work.

Patches should go there after discussion on the mailing list (it's not at all clear that consensus on the approach or API has been reached yet).

Add Comment

Modify Ticket

Change Properties
<Author field>
Action
as closed
as The resolution will be set. Next status will be 'closed'
The resolution will be deleted. Next status will be 'new'
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.