Opened 8 years ago

Closed 11 months ago

#10302 closed New feature (fixed)

Add some date features to aggregation

Reported by: Adrian Ribao Owned by:
Component: Database layer (models, ORM) Version: master
Severity: Normal Keywords: aggregation dates
Cc: miracle2k, ben.welsh@…, gmayer, kmpm, Mikhail Korobov, Robin, John Paulett, sebastian.goll@… Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

It would be nice to use datetime fields as a group in aggregation.

class Log(models.Model):
    datetime = models.DateTimeField(auto_now_add=True)
    subject = models.CharField(max_length=255)

Log.objects.values('datetime').annotate(num_logs=Count('id'))

Grouping by date(not datetime) is not possible.

Attachments (1)

ticket_10302.diff (2.1 KB) - added by Raphael Michel 11 months ago.
Regression test

Download all attachments as: .zip

Change History (18)

comment:1 Changed 8 years ago by Ian Kelly

Needs documentation: unset
Needs tests: unset
Patch needs improvement: unset

Using the sqlite3 backend:

Log.objects.extra(select={'date': 'django_date_trunc("day", "testapp_log"."datetime")'}).values('date').annotate(num_logs=Count('id'))

Other backends will require replacing django_date_trunc() with the appropriate call. But I agree that there should be a single API for this.

comment:2 Changed 8 years ago by (none)

milestone: post-1.0

Milestone post-1.0 deleted

comment:3 Changed 8 years ago by Jacob

milestone: 1.1 beta
Triage Stage: UnreviewedAccepted

comment:4 Changed 8 years ago by Russell Keith-Magee

milestone: 1.1 beta

Bumping this back to accepted, but not on v1.1 path, because this isn't something that is strictly related to aggregates. You can already do a GROUP BY on any date/datetime field that exists on the model. This ticket is asking for the ability to do a GROUP BY on a value computed from another field on the model. This will first require the ability to compute fields and annotate them.

While this is definitely a good idea, it isn't a simple extension or omission from the v1.1 aggregates implementation. In the interim, it can be done using the approach Ian mentioned in the first comment.

comment:5 Changed 7 years ago by palewire

Cc: ben.welsh@… added

I tried to replicate this on a model with a DateTimeField and ran into a problem I can't explain.
I'm working in Postgres and on the 1.1 official release.

The code was:

>>> from django.db.models import Count
>>> from app.models import Model
>>> Model.objects.extra(select={'year': "DATE_TRUNC('year', \"app_model\".\"datetime\")"}).values('year').annotate(total=Count('id'))

And the SQL I got back seemed to erroneously have the datefield included into the GROUP BY statement, like:

>>> from django.db import connection
>>> print connection.queries[-1]
{'time': '0.207', 'sql': 'SELECT (DATE_TRUNC(\'year\', "app_model"."datetime")) AS "year", COUNT("app_model"."id") AS total 
FROM "app_model" GROUP BY DATE_TRUNC(\'year\', "app_model"."datetime"), "app_model"."datetime" ORDER BY "app_model"."datetime" DESC LIMIT 21'}

comment:6 Changed 7 years ago by Russell Keith-Magee

Component: Database layer (models, ORM)ORM aggregation
Owner: nobody deleted

comment:7 Changed 7 years ago by miracle2k

Cc: miracle2k added

comment:8 Changed 7 years ago by gmayer

Cc: gmayer added

I ran into the same problem as palewire above but after some digging realised that my ordering in the model's Meta class was causing django to insert extra GROUP BY fields for all those listed in ordering. After commenting it out the extra().values().annotate() did the right thing, alternatively appending an order_by('year') to his command above would have worked too.

I'm not sure if this is a bug or a feature (why are ordering fields forced to be included in the GROUP BY SQL???) but if it's a feature then this gotcha is poorly documented. The queryset docs only mention something order_by() possibly conflicting with a values() under the distinct() heading, it should be a prominent warning under the values() and/or extra() description.

comment:9 Changed 7 years ago by kmpm

Cc: kmpm added

comment:10 Changed 7 years ago by kmpm

I made a method something like this in a manager.
When I didn't have that last .order_by things went mad as for @gmayer but as soon as I added my own order I could put back the ordering in the model's Meta class.
Don't know if will help but it's a workaround if you need ordering in the Meta class.

def period_avrage(self, start_at=None, end_at=None, period_size='day'):
    extra_psql={'period': "DATE_TRUNC('%s', \"collector_taglog\".\"created_at\")" % period_size}
    qs = self.filter(data_good=True) #only good values
    if start_at: qs = qs.filter(created_at__gte=start_at)
    if end_at: qs = qs.filter(created_at__lte=end_at)    
    #do not remove the order by. Otherwise the ordering on the model will screw things up
    return qs.extra(select=extra_psql).values('period').annotate(avg=Avg('v_num')).order_by('period')
Last edited 14 months ago by Tim Graham (previous) (diff)

comment:11 Changed 6 years ago by anonymous

Cc: Mikhail Korobov added

comment:12 Changed 6 years ago by Robin

Cc: Robin added

comment:13 Changed 6 years ago by John Paulett

Cc: John Paulett added

comment:14 Changed 6 years ago by Chris Beaven

Severity: Normal
Type: New feature

comment:15 Changed 5 years ago by Sebastian Goll

Cc: sebastian.goll@… added
Easy pickings: unset
UI/UX: unset

comment:16 Changed 4 years ago by Anssi Kääriäinen

Component: ORM aggregationDatabase layer (models, ORM)

Changed 11 months ago by Raphael Michel

Attachment: ticket_10302.diff added

Regression test

comment:17 Changed 11 months ago by Raphael Michel

Resolution: fixed
Status: newclosed

This has been fixed by the Expressions API and is therefore obsolete.

Note: See TracTickets for help on using tickets.
Back to Top