Code

Opened 5 years ago

Last modified 17 months ago

#10302 new New feature

Add some date features to aggregation

Reported by: tolano Owned by:
Component: Database layer (models, ORM) Version: master
Severity: Normal Keywords: aggregation dates
Cc: miracle2k, ben.welsh@…, gmayer, kmpm, kmike, robin, jpaulett, sebastian.goll@… Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

It would be nice to use datetime fields as a group in aggregation.

class Log(models.Model):
    datetime = models.DateTimeField(auto_now_add=True)
    subject = models.CharField(max_length=255)

Log.objects.values('datetime').annotate(num_logs=Count('id'))

Grouping by date(not datetime) is not possible.

Attachments (0)

Change History (16)

comment:1 Changed 5 years ago by ikelly

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset

Using the sqlite3 backend:

Log.objects.extra(select={'date': 'django_date_trunc("day", "testapp_log"."datetime")'}).values('date').annotate(num_logs=Count('id'))

Other backends will require replacing django_date_trunc() with the appropriate call. But I agree that there should be a single API for this.

comment:2 Changed 5 years ago by anonymous

  • milestone post-1.0 deleted

Milestone post-1.0 deleted

comment:3 Changed 5 years ago by jacob

  • milestone set to 1.1 beta
  • Triage Stage changed from Unreviewed to Accepted

comment:4 Changed 5 years ago by russellm

  • milestone 1.1 beta deleted

Bumping this back to accepted, but not on v1.1 path, because this isn't something that is strictly related to aggregates. You can already do a GROUP BY on any date/datetime field that exists on the model. This ticket is asking for the ability to do a GROUP BY on a value computed from another field on the model. This will first require the ability to compute fields and annotate them.

While this is definitely a good idea, it isn't a simple extension or omission from the v1.1 aggregates implementation. In the interim, it can be done using the approach Ian mentioned in the first comment.

comment:5 Changed 5 years ago by palewire

  • Cc ben.welsh@… added

I tried to replicate this on a model with a DateTimeField and ran into a problem I can't explain.
I'm working in Postgres and on the 1.1 official release.

The code was:

>>> from django.db.models import Count
>>> from app.models import Model
>>> Model.objects.extra(select={'year': "DATE_TRUNC('year', \"app_model\".\"datetime\")"}).values('year').annotate(total=Count('id'))

And the SQL I got back seemed to erroneously have the datefield included into the GROUP BY statement, like:

>>> from django.db import connection
>>> print connection.queries[-1]
{'time': '0.207', 'sql': 'SELECT (DATE_TRUNC(\'year\', "app_model"."datetime")) AS "year", COUNT("app_model"."id") AS total 
FROM "app_model" GROUP BY DATE_TRUNC(\'year\', "app_model"."datetime"), "app_model"."datetime" ORDER BY "app_model"."datetime" DESC LIMIT 21'}

comment:6 Changed 5 years ago by russellm

  • Component changed from Database layer (models, ORM) to ORM aggregation
  • Owner nobody deleted

comment:7 Changed 5 years ago by miracle2k

  • Cc miracle2k added

comment:8 Changed 4 years ago by gmayer

  • Cc gmayer added

I ran into the same problem as palewire above but after some digging realised that my ordering in the model's Meta class was causing django to insert extra GROUP BY fields for all those listed in ordering. After commenting it out the extra().values().annotate() did the right thing, alternatively appending an order_by('year') to his command above would have worked too.

I'm not sure if this is a bug or a feature (why are ordering fields forced to be included in the GROUP BY SQL???) but if it's a feature then this gotcha is poorly documented. The queryset docs only mention something order_by() possibly conflicting with a values() under the distinct() heading, it should be a prominent warning under the values() and/or extra() description.

comment:9 Changed 4 years ago by kmpm

  • Cc kmpm added

comment:10 Changed 4 years ago by kmpm

I made a method something like this in a manager.
When I didn't have that last .order_by things went mad as for @gmayer but as soon as I added my own order I could put back the ordering in the model's Meta class.
Don't know if will help but it's a workaround if you need ordering in the Meta class.

def period_avrage(self, start_at=None, end_at=None, period_size='day'):

extra_psql={'period': "DATE_TRUNC('%s', \"collector_taglog\".\"created_at\")" % period_size}
qs = self.filter(data_good=True) #only good values
if start_at: qs = qs.filter(created_atgte=start_at)
if end_at: qs = qs.filter(created_at
lte=end_at)
#do not remove the order by. Otherwise the ordering on the model will screw things up
return qs.extra(select=extra_psql).values('period').annotate(avg=Avg('v_num')).order_by('period')

comment:11 Changed 4 years ago by anonymous

  • Cc kmike added

comment:12 Changed 4 years ago by robin

  • Cc robin added

comment:13 Changed 3 years ago by jpaulett

  • Cc jpaulett added

comment:14 Changed 3 years ago by SmileyChris

  • Severity set to Normal
  • Type set to New feature

comment:15 Changed 3 years ago by sebastian

  • Cc sebastian.goll@… added
  • Easy pickings unset
  • UI/UX unset

comment:16 Changed 17 months ago by akaariai

  • Component changed from ORM aggregation to Database layer (models, ORM)

Add Comment

Modify Ticket

Change Properties
<Author field>
Action
as new
The owner will be changed from (none) to anonymous. Next status will be 'assigned'
as The resolution will be set. Next status will be 'closed'
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.