Support for "Conditional Aggregates"
|Reported by:||Ben Davis||Owned by:|
|Component:||Database layer (models, ORM)||Version:||master|
|Cc:||cal@…, Anssi Kääriäinen, unknownlighter@…, andreterra@…, Ilja Maas, mmitar@…, nkryptic@…, flo@…, Garry Polley, trbs@…||Triage Stage:||Someday/Maybe|
|Has patch:||yes||Needs documentation:||yes|
|Needs tests:||yes||Patch needs improvement:||yes|
This is really a feature request, but I'm posting it here in the hopes that it gets a little backing from the community.
I had a situation where I needed to write a query that would sum up demographics information for items ordered. I needed to count the total number of orders for each type of ethnicity, grouped by the item ordered. So the results should look something like this:
item | caucasian_count | african_american_count | ... ------------+-----------------+------------------------+ Test item | 55 | 72 | Test item 2 | 23 | 48 | ...
The models look roughly like this:
class Item(models.Model): title = CharField(max_length=255) #... (snip) ... class Ethnicity(models.Model): name = CharField(max_length=20) class User(models.Model): #... (snip) ... ethnicity = ForeignKey(Ethnicity) class Order(models.Model): #... (snip) ... user = ForeignKey(User) item = ForeignKey(Item)
The best way for me to accomplish this in a single query was to use conditional aggregates. In SQL, this would be done using
SELECT COUNT(IF(ethnicity='caucasian',TRUE,NULL)) AS caucasian_count, COUNT(IF(ethnicity='african_american',TRUE,NULL)) AS african_american_count ...
Now, I could have just as well written a raw SQL query, but I like django's ORM so much it pains me to not take advantage of it as much as I can.
My idea was for a new type of aggregate class called "CountIf()", which would act just like count, except that you could also pass filter conditions. The resulting django query might look something like this:
query = Order.objects.values('item') #Orders grouped by item for ethnicity in Ethnicity.objects.all(): query = query.annotate(CountIf(user__ethnicity=ethnicity))
... which would then generate the SQL clause similar to above. Unfortunately, the task of converting the condition itself into SQL was a bit too daunting, so I took a small workaround that only required minimal SQL:
from django.db.models.sql import aggregates class CountIf(aggregates.Count): """ Hack Count() to get a conditional count working. """ is_ordinal = True sql_function = 'COUNT' sql_template = '%(function)s(IF(%(condition)s,TRUE,NULL))'
which allowed me to do something like this:
query = Order.objects.values('item') #Orders grouped by item for ethnicity in Ethnicity.objects.all(): condition = 'myap_user.ethnicity_id=%s' % ethnicity.id query = query.annotate(CountIf(user__ethnicity, condition=condition))
If we could figure out a way to convert django-esque filter conditions into the SQL necessary for the condition in the "sql_template" of the aggregate, that would be epic! It seems like someone with a good knowlege of the inner-workings of django.db.models.sql.query would be able to figure it out pretty quickly, but hopefully someone will let me know if this is just a pipe dream :)
Change History (47)
comment:1 Changed 8 years ago by
|Component:||Uncategorized → ORM aggregation|
|Triage Stage:||Unreviewed → Someday/Maybe|
comment:7 Changed 6 years ago by
|Cc:||Anssi Kääriäinen added|
|Patch needs improvement:||set|
comment:30 Changed 4 years ago by
|Keywords:||aggregate annotate added|
|Version:||1.5-beta-1 → master|