﻿id	summary	reporter	owner	description	type	status	component	version	severity	resolution	keywords	cc	stage	has_patch	needs_docs	needs_tests	needs_better_patch	easy	ui_ux
6811	Add functions, expressions and grouping to QuerySet.values()	Justin	Justin	"This proposal is for adding aggregation support to {{{QuerySet.values()}}}.

An E object is introduced that represents a query expression, along with several subclasses:
 * F: a field/column
 * Function: an SQL function
 * Aggregate: an SQL aggregation function

The basic syntax for values remains unchanged, field names or E objects are passed as position arguments. Grouping is implicit based on the types of arguments to {{{values()}}}. Non-aggregate fields and expressions are used in the GROUP BY clause. If no non-aggregate fields are passed, {{{values()}}} returns a single result. Functions objects have default names in the output based on the function name and field.

Examples:

Sum field1:
{{{
>>> Model.objects.values(Sum('field1'))
[ {'field1_sum': 1000} ]
}}}


Group by field1, sum field2:
{{{
>>> Model.objects.values('field1', Sum('field2'))
[ {'field1':10, 'field2_sum':1000 },
  {'field1':20, 'field2_sum':1234 } ]
}}}


Complex expression with a specified output alias:
{{{
>>> Model.objects.values('field1', E(Sum('field2') + Sum('field3'), alias='my_sum'))
[ {'field1':10, 'my_sum':1000 },
  {'field1':20, 'my_sum':1234 } ]
}}}


Use of F objects:
{{{
>>> Invoice.objects.values('id', E(F('subtotal') + F('tax'), alias='total'))
[ {'id':1, 'total':1000.00 },
  {'id':2, 'total':1234.00 } ]
}}}
Note that no grouping happens here since there are no aggregate functions.


Output aliases could possibly be specified via kwargs:
{{{
>>> Model.objects.values('field1', my_sum=Sum('field2') + Sum('field3'))
[ {'field1':10, 'my_sum':1000 },
  {'field1':20, 'my_sum':1234 } ]
}}}


It would be nice to support grouping by functions as well:
{{{
>>> Sale.objects.values(Month('date'), Sum('payment'))
[ {'date_month':1, 'payment_sum':1000.00 },
  {'date_month':2, 'payment_sum':1234.00 } ]
}}}
In this case Month is non-aggregating, so it's included in GROUP BY. Also, backends will have to be extended so that non-standard functions work.


User defined function using built-in functions:
{{{
>>> def my_func(value_col, quantity_col):
>>>    return Sum(value_col) / Sum(quantity_col)
>>> Product.objects.values('type', avg_price=my_func('price','quantity'))
[ {'type':'widget', 'avg_price':1.00 },
  {'type':'gadget', 'avg_price':2.00 } ]
}}}


Aggregation should be supported across relationships:
{{{
>>> Customer.objects.values('name', Sum('purchases__payment'))
[ {'name':'John', 'purchases__payment_sum':1000.00 },
  {'name':'Jane', 'purchases__payment_sum':1234.00 } ]
}}}


I think this proposal adds minimal new syntax and fits nicely with existing features like Q objects. It's possible that changing the behavior of {{{value()}}} to trigger grouping could be confusing.

This proposal so far only covers adding aggregation to {{{QuerySet.values()}}}. Another function (such as {{{annotate()}}} suggested by Russell) could be added to support aggregation with Models. It might be good to eventually support E objects in {{{QuerySet.filter()}}}. It might also be possible to use the E objects in Models via some sort of ExpressionField (but that's getting far ahead of myself).

Some of this already works in a haskish, fragile way (auto-grouping and simple aggregation functions, but not expressions), but I'm re-working my changes as per Malcolm's suggestions. I'll post a very preliminary patch for review when I get some of his suggestions finished. I'm not married to this proposal, and enjoy the work for itself, so if people think we should a different direction, that's fine.

So, thoughts, suggestions, test cases?



Most recent discussion:
http://groups.google.com/group/django-developers/browse_thread/thread/3d4039161f6d2f5e

Past discussions:
http://groups.google.com/group/django-developers/browse_thread/thread/f433edf7b0ebdbcb/9758e11744381e67[#5420]
http://groups.google.com/group/django-developers/browse_thread/thread/e53a62e1c73871b8/2f91fa217bc465bb?#2f91fa217bc465bb

Another proposal:
http://code.djangoproject.com/ticket/3566

"		closed	Database layer (models, ORM)	queryset-refactor		duplicate	aggregation grouping		Unreviewed	1	0	1	0	0	0
