#10929 closed New feature (fixed)
Support a default value for Sum (and possibly other aggregation functions)
| Reported by: | nolan | Owned by: | Nick Pope |
|---|---|---|---|
| Component: | Database layer (models, ORM) | Version: | dev |
| Severity: | Normal | Keywords: | aggregate annotate default |
| Cc: | real.human@…, net147, cvrebert, jpk, Simon Charette, Hans Aarne Liblik | Triage Stage: | Ready for checkin |
| Has patch: | yes | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description
By default, annotate(sum_field = Sum(...)) results in sum_field being NULL if there were no values to sum. In most cases, 0 would be a better option here, for proper sorting in a later order_by, and for comparisons using lt/gt/gte/lte.
A monkeypatch to implement default values for NULL using COALESCE is available here:
http://stackoverflow.com/questions/553038/treat-null-as-0-in-django-model
Attachments (1)
Change History (39)
comment:1 by , 17 years ago
| Component: | Database layer (models, ORM) → ORM aggregation |
|---|---|
| Owner: | removed |
| Triage Stage: | Unreviewed → Accepted |
comment:2 by , 15 years ago
| Severity: | → Normal |
|---|---|
| Type: | → New feature |
comment:3 by , 14 years ago
| Easy pickings: | unset |
|---|
#14548 was a duplicate, had some useful discussion in its comments.
by , 14 years ago
| Attachment: | 10929-aggregates-default-r17029.diff added |
|---|
comment:4 by , 14 years ago
| Cc: | added |
|---|---|
| Has patch: | set |
| Keywords: | aggregate annotate default added |
| Needs documentation: | set |
| UI/UX: | unset |
| Version: | 1.1-beta-1 → SVN |
Just added a patch with an initial implementation for this feature. It should work for all aggregates by wrapping the default sql_template in COALESCE(%s, %%(default)s), if params has a default item. There's a basic test in there that checks the behaviour for Avg with and without a default. If there's positive feedback about the implementation, I'll have a go at adding documentation and any improvements to the implementation and tests that are suggested.
comment:5 by , 14 years ago
| Cc: | added |
|---|
comment:6 by , 14 years ago
There is a good case for giving Sum a default of zero, while leaving Max and Min with default of None. Essentially, the reason is that zero is the fixed point of the + operator, while max and min have no fixed points (excluding negative infinity and positive infinity respectively, which are problematic). Also, it makes the behaviour analogous to the Python builtins sum, min and max - sum returns 0 for an empty list, whereas min and max throw exceptions. (We don't want to throw exceptions here, for various obvious reasons, but I think we should be indicating 'undefined' in some way under the same circumstances).
If we do this, we need a note in the release notes about the backwards incompatibility - if people were relying on Sum returning None for no data instead of zero. (We can certainly classify the previous behaviour as a bug, since it was out of line with our docs, and neither expected or useful behaviour, but should still mention this). We should also have some tests that ensure that it returns a zero of the right type for different underlying field types.
This also brings up the possibility of whether the default for Sum should act like the start parameter of the sum Python builtin. I think it should. That would make implementation harder, though, I guess. Alternatively we could have a separate 'start' parameter for Sum, which might be clearer, and would make that a separate feature.
comment:7 by , 13 years ago
| Cc: | added |
|---|
comment:8 by , 13 years ago
| Cc: | added |
|---|
comment:9 by , 13 years ago
| Component: | ORM aggregation → Database layer (models, ORM) |
|---|
comment:10 by , 11 years ago
I would support closing this as a duplicate of #23753. Once the linked ticket is implemented, it'll be possible for users to construct their own coalesce value like:
annotate(sum_field = Coalesce(Sum(...), 0))
This leaves the default behaviour alone, but provides a higher level of customisation. It relies on the #14030 patch which will also allow adding a "start" value, as mentioned by @lukeplant above:
annotate(sum_field = Coalesce(Sum(...), 0) + Value(5))
Would everyone be happy enough with the above use?
comment:11 by , 11 years ago
I'm not totally convinced (yet) that we should give up on this when Coalesce() becomes available. If we just instruct people to use Coalesce(Sum(...), default) all the time, we're basically just recreating the literal semantics of SQL itself rather than providing an easy to use abstraction of it. Sure, it would be useful and powerful to have Coalesce() as well, but we might not need to force people to use it all the time for the common case. At the very least, a decision on closing this ticket should probably wait until Coalesce() is actually implemented.
follow-ups: 13 14 15 17 comment:12 by , 11 years ago
@mrmachine that's not a bad argument to make. What we could support is:
Sum('field', default=0)
Which would then be returned from convert_value in the case where the value is None. The argument against that is we'd be special-casing the Sum aggregate, while leaving other aggregates to use the Coalesce method. That's not such a big downside though.
Does anyone else want to weigh in on this? Supporting a default=0 kwarg is a 4 line + docs change.
comment:13 by , 11 years ago
Replying to jarshwah:
@mrmachine that's not a bad argument to make. What we could support is:
Sum('field', default=0)Which would then be returned from
convert_valuein the case where the value is None. The argument against that is we'd be special-casing the Sum aggregate, while leaving other aggregates to use the Coalesce method. That's not such a big downside though.
Does anyone else want to weigh in on this? Supporting a default=0 kwarg is a 4 line + docs change.
+1 for this.
comment:14 by , 11 years ago
Replying to jarshwah:
@mrmachine that's not a bad argument to make. What we could support is:
Sum('field', default=0)Which would then be returned from
convert_valuein the case where the value is None. The argument against that is we'd be special-casing the Sum aggregate, while leaving other aggregates to use the Coalesce method. That's not such a big downside though.
Does anyone else want to weigh in on this? Supporting a default=0 kwarg is a 4 line + docs change.
+1 from me too.
comment:15 by , 11 years ago
Replying to jarshwah:
@mrmachine that's not a bad argument to make. What we could support is:
Sum('field', default=0)Which would then be returned from
convert_valuein the case where the value is None. The argument against that is we'd be special-casing the Sum aggregate, while leaving other aggregates to use the Coalesce method. That's not such a big downside though.
Does anyone else want to weigh in on this? Supporting a default=0 kwarg is a 4 line + docs change.
I ran into a situation today where this feature would have been nice. +1 from me.
comment:16 by , 11 years ago
If you are running into this use case, you can use Coalesce(Sum('field'), 0) in Django 1.8. At least until a decision is made on this ticket. If you want to bring this up on the ML for the other ORM people to have input, that'd be a good next step.
comment:17 by , 9 years ago
Replying to Josh Smeaton:
@mrmachine that's not a bad argument to make. What we could support is:
Sum('field', default=0)Which would then be returned from
convert_valuein the case where the value is None. The argument against that is we'd be special-casing the Sum aggregate, while leaving other aggregates to use the Coalesce method. That's not such a big downside though.
Does anyone else want to weigh in on this? Supporting a default=0 kwarg is a 4 line + docs change.
+1 for me .
comment:18 by , 9 years ago
| Cc: | added |
|---|
May I suggest the argument be named coalesce and be added to all aggregates (through the django.db.models.aggregate.Aggregate class)?
comment:19 by , 9 years ago
I'm also facing this issue. The workaround with Coalesce works, but...
from django.db.models import Sum from django.db.models.functions import Coalesce
really?
comment:20 by , 9 years ago
Note that the new aggregate functions in django.contrib.postgres silently handle this.
I still believe adding a coalesce sugar argument to aggregate functions would be more appropriate than a default one as it would allow passing expressions instead of simple Python value.
Sum('amount', coalesce=F('minimum_amount'))
Which is more readable than
Coalesce(Sum('amount'), F('minimum_amount'))
comment:23 by , 6 years ago
Aaargh, cmon, 11 years? To add a perfectly sane default of 0 for Sum? I'm amazed this hasn't been fixed for so long :) What was wrong with supporting default=0 at least? (yes I had the same perfectly normal reaction of getting annoyed by seeing NULL in my API output instead of 0 for a Sum of Payments for a User).
comment:25 by , 5 years ago
| Owner: | set to |
|---|---|
| Status: | new → assigned |
comment:26 by , 5 years ago
| Patch needs improvement: | set |
|---|
comment:27 by , 5 years ago
| Needs documentation: | unset |
|---|---|
| Owner: | changed from to |
| Patch needs improvement: | unset |
I have produced an implementation using Coalesce for all aggregates other than Count.
See the new PR.
comment:28 by , 4 years ago
| Patch needs improvement: | set |
|---|
comment:29 by , 4 years ago
| Patch needs improvement: | unset |
|---|
follow-up: 31 comment:30 by , 4 years ago
| Patch needs improvement: | set |
|---|
Some tests don't work on MySQL and Oracle.
comment:31 by , 4 years ago
| Patch needs improvement: | unset |
|---|
Replying to Mariusz Felisiak:
Some tests don't work on MySQL and Oracle.
The tests are now all passing. \o/
comment:32 by , 4 years ago
| Triage Stage: | Accepted → Ready for checkin |
|---|
This is a reasonable suggestion; variants in SQL syntax notwithstanding, it shouldn't be too hard to implement.
For those following the advice of Stack Overflow: There is no need to monkeypatch the sql_aggregates module - if you override the add_to_query function(), you can load the sql-specific aggregate from wherever you want.
In fact, you don't even need to have the split between generic and SQL specific aggregate. If all you want is a custom SQL aggregate that you can use in your own code, the following definition will help:
from django.db.models.sql.aggregates import Aggregate class MyAggregate(Aggregate): sql_function = 'SOME_SQL_FUNC' sql_template = 'SQL_TEMPLATE_BITS(%(function)s(%(field)s), %(default)s)' def __init__(self, lookup, **extra): self.lookup = lookup self.extra = extra def _default_alias(self): return '%s__%s' % (self.lookup, self.__class__.__name__.lower()) default_alias = property(_default_alias) def add_to_query(self, query, alias, col, source, is_summary): super(MyAggregate, self).__init__(col, source, is_summary, **self.extra) query.aggregate_select[alias] = self