#25339 closed New feature (worksforme)
Aggregation and annotation by time period and intervals (by month, week, day, hour, etc)
Reported by: | Austin Pua | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 1.8 |
Severity: | Normal | Keywords: | QuerySet.extra |
Cc: | Triage Stage: | Unreviewed | |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description (last modified by )
I often found that aggregation of data by dynamically generating time periods given a desired time interval is a very sought after feature. I personally created the snippet below to accommodate my needs, but as per the note here, the extra() method will be deprecated soon. I think it will be a waste to deprecate this API, since there is simply a lot of DBMS-specific SQL functions available, but if there is any way we can do this via Django ORM, then I am still up for it. The snippet below shows a custom QuerySet method I often use.
def get_metrics(self, frequency): select_fields = OrderedDict() if frequency == 'week': select_fields['time_period'] = "date_trunc(%s, initial_timestamp::TIMESTAMP WITH TIME ZONE AT TIME ZONE %s + '1 day'::interval) - '1 day'::interval" else: select_fields['time_period'] = "date_trunc(%s, initial_timestamp::TIMESTAMP WITH TIME ZONE AT TIME ZONE %s)" select_params = (frequency, settings.TIME_ZONE,) queryset = self.extra( select=select_fields, select_params=select_params ).values('time_period', ....) queryset = queryset.annotate( # Add annotations ) return queryset
Change History (5)
comment:1 by , 9 years ago
Description: | modified (diff) |
---|
follow-up: 3 comment:2 by , 9 years ago
Resolution: | → worksforme |
---|---|
Status: | new → closed |
comment:3 by , 9 years ago
I do not think that excerpt does the same functionality mine does. It seems that it only filters for the year 2015, extracts the month number, and then performs a 'GROUP BY' based on the month number. It does not generate an actual timeseries that can easily be parsed by javascript and charting libraries.
My excerpt outputs actual time periods based on a range of dates with specific intervals/frequency. If for example, the start_date
is 2015-01-01, end_date
is 2015-07-07, and frequency
is month, it will aggregate the results that match the date range and group them by month dates like so: 2015-01-01, 2015-02-01, 2015-03-01, ... 2015-07-01.
If the frequency
is hour, then it will output each and every hour like so: 2015-01-01 00:00:00, 2015-01-01 01:00:00, 2015-01-01 02:00:00, ... 2015-05-01 00:00:00, 2015-05-01 01:00:00 , ... , 2015-07-07 22:00:00, and 2015-07-07 23:00:00. You will have a column containing the actual timeseries.
One can only do that efficiently in PostgreSQL by using the DATE_TRUNC
function, and I looked in the source code, but it does not seem to be used in anywhere that might resemble something like my excerpt. I think it is possible to write a custom Transform for that, but it would be nice to have it ready out of the box. At least in my case, projects always involve aggregating and presenting data with a timeseries.
comment:4 by , 9 years ago
Malefice, if you feel this would be something useful for Django, then please feel welcome to open up a PR with an implementation that uses Expressions, and we can consider the inclusion at that point. It will have to work with the other supported backends too though in some way.
comment:5 by , 9 years ago
I think DateTrunk expression makes perfect sense for Django. It is often very useful for aggregation for example. Similarly date formatting and date parsing functions would be useful.
Oracle and PostgreSQL have date truncation methods out of the box, for sqlite and mysql it seems we have to use some tricks (format first the date as string '%y-%m-01', and then read that back in as date for month truncate for example).
We began talking about deprecating
.extra()
because now we have user defined expressions that can replace (the majority of).extra
. See Func Expressions specifically.Django 1.9 will also include Day/Month/Year/Week transformations, so you'll be able to do something similar to:
If you find there's something you cannot do with expressions that you can currently do with extra (with regards to SELECT), please come talk to us in #django or post to django-users. We are already aware that extra is still the only way to introduce custom joins.