Opened 15 years ago

Closed 3 years ago

Last modified 14 months ago

#10929 closed New feature (fixed)

Support a default value for Sum (and possibly other aggregation functions)

Reported by: nolan Owned by: Nick Pope
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords: aggregate annotate default
Cc: real.human@…, net147, cvrebert, jpk, Simon Charette, Hans Aarne Liblik Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

By default, annotate(sum_field = Sum(...)) results in sum_field being NULL if there were no values to sum. In most cases, 0 would be a better option here, for proper sorting in a later order_by, and for comparisons using lt/gt/gte/lte.

A monkeypatch to implement default values for NULL using COALESCE is available here:
http://stackoverflow.com/questions/553038/treat-null-as-0-in-django-model

Attachments (1)

10929-aggregates-default-r17029.diff (2.2 KB ) - added by Tai Lee 13 years ago.

Download all attachments as: .zip

Change History (39)

comment:1 by Russell Keith-Magee, 15 years ago

Component: Database layer (models, ORM)ORM aggregation
Owner: nobody removed
Triage Stage: UnreviewedAccepted

This is a reasonable suggestion; variants in SQL syntax notwithstanding, it shouldn't be too hard to implement.

For those following the advice of Stack Overflow: There is no need to monkeypatch the sql_aggregates module - if you override the add_to_query function(), you can load the sql-specific aggregate from wherever you want.

In fact, you don't even need to have the split between generic and SQL specific aggregate. If all you want is a custom SQL aggregate that you can use in your own code, the following definition will help:

from django.db.models.sql.aggregates import Aggregate 
class MyAggregate(Aggregate): 
    sql_function = 'SOME_SQL_FUNC' 
    sql_template = 'SQL_TEMPLATE_BITS(%(function)s(%(field)s), %(default)s)'

    def __init__(self, lookup, **extra): 
        self.lookup = lookup 
        self.extra = extra 
    def _default_alias(self): 
        return '%s__%s' % (self.lookup, self.__class__.__name__.lower()) 
    default_alias = property(_default_alias) 
    def add_to_query(self, query, alias, col, source, is_summary): 
        super(MyAggregate, self).__init__(col, source, is_summary, **self.extra) 
        query.aggregate_select[alias] = self 

comment:2 by Luke Plant, 13 years ago

Severity: Normal
Type: New feature

comment:3 by Ramiro Morales, 13 years ago

Easy pickings: unset

#14548 was a duplicate, had some useful discussion in its comments.

comment:4 by Tai Lee, 13 years ago

Cc: real.human@… added
Has patch: set
Keywords: aggregate annotate default added
Needs documentation: set
UI/UX: unset
Version: 1.1-beta-1SVN

Just added a patch with an initial implementation for this feature. It should work for all aggregates by wrapping the default sql_template in COALESCE(%s, %%(default)s), if params has a default item. There's a basic test in there that checks the behaviour for Avg with and without a default. If there's positive feedback about the implementation, I'll have a go at adding documentation and any improvements to the implementation and tests that are suggested.

comment:5 by net147, 12 years ago

Cc: net147 added

comment:6 by Luke Plant, 12 years ago

There is a good case for giving Sum a default of zero, while leaving Max and Min with default of None. Essentially, the reason is that zero is the fixed point of the + operator, while max and min have no fixed points (excluding negative infinity and positive infinity respectively, which are problematic). Also, it makes the behaviour analogous to the Python builtins sum, min and max - sum returns 0 for an empty list, whereas min and max throw exceptions. (We don't want to throw exceptions here, for various obvious reasons, but I think we should be indicating 'undefined' in some way under the same circumstances).

If we do this, we need a note in the release notes about the backwards incompatibility - if people were relying on Sum returning None for no data instead of zero. (We can certainly classify the previous behaviour as a bug, since it was out of line with our docs, and neither expected or useful behaviour, but should still mention this). We should also have some tests that ensure that it returns a zero of the right type for different underlying field types.

This also brings up the possibility of whether the default for Sum should act like the start parameter of the sum Python builtin. I think it should. That would make implementation harder, though, I guess. Alternatively we could have a separate 'start' parameter for Sum, which might be clearer, and would make that a separate feature.

comment:7 by cvrebert, 12 years ago

Cc: cvrebert added

comment:8 by jpk, 11 years ago

Cc: jpk added

comment:9 by Anssi Kääriäinen, 11 years ago

Component: ORM aggregationDatabase layer (models, ORM)

comment:10 by Josh Smeaton, 9 years ago

I would support closing this as a duplicate of #23753. Once the linked ticket is implemented, it'll be possible for users to construct their own coalesce value like:

annotate(sum_field = Coalesce(Sum(...), 0))

This leaves the default behaviour alone, but provides a higher level of customisation. It relies on the #14030 patch which will also allow adding a "start" value, as mentioned by @lukeplant above:

annotate(sum_field = Coalesce(Sum(...), 0) + Value(5))

Would everyone be happy enough with the above use?

comment:11 by Tai Lee, 9 years ago

I'm not totally convinced (yet) that we should give up on this when Coalesce() becomes available. If we just instruct people to use Coalesce(Sum(...), default) all the time, we're basically just recreating the literal semantics of SQL itself rather than providing an easy to use abstraction of it. Sure, it would be useful and powerful to have Coalesce() as well, but we might not need to force people to use it all the time for the common case. At the very least, a decision on closing this ticket should probably wait until Coalesce() is actually implemented.

comment:12 by Josh Smeaton, 9 years ago

@mrmachine that's not a bad argument to make. What we could support is:

Sum('field', default=0)

Which would then be returned from convert_value in the case where the value is None. The argument against that is we'd be special-casing the Sum aggregate, while leaving other aggregates to use the Coalesce method. That's not such a big downside though.

Does anyone else want to weigh in on this? Supporting a default=0 kwarg is a 4 line + docs change.

in reply to:  12 comment:13 by JuneHyeon Bae, 9 years ago

Replying to jarshwah:

@mrmachine that's not a bad argument to make. What we could support is:

Sum('field', default=0)

Which would then be returned from convert_value in the case where the value is None. The argument against that is we'd be special-casing the Sum aggregate, while leaving other aggregates to use the Coalesce method. That's not such a big downside though.

Does anyone else want to weigh in on this? Supporting a default=0 kwarg is a 4 line + docs change.

+1 for this.

in reply to:  12 comment:14 by Mathieu, 9 years ago

Replying to jarshwah:

@mrmachine that's not a bad argument to make. What we could support is:

Sum('field', default=0)

Which would then be returned from convert_value in the case where the value is None. The argument against that is we'd be special-casing the Sum aggregate, while leaving other aggregates to use the Coalesce method. That's not such a big downside though.

Does anyone else want to weigh in on this? Supporting a default=0 kwarg is a 4 line + docs change.

+1 from me too.

in reply to:  12 comment:15 by Tyler Morgan, 9 years ago

Replying to jarshwah:

@mrmachine that's not a bad argument to make. What we could support is:

Sum('field', default=0)

Which would then be returned from convert_value in the case where the value is None. The argument against that is we'd be special-casing the Sum aggregate, while leaving other aggregates to use the Coalesce method. That's not such a big downside though.

Does anyone else want to weigh in on this? Supporting a default=0 kwarg is a 4 line + docs change.

I ran into a situation today where this feature would have been nice. +1 from me.

comment:16 by Josh Smeaton, 9 years ago

If you are running into this use case, you can use Coalesce(Sum('field'), 0) in Django 1.8. At least until a decision is made on this ticket. If you want to bring this up on the ML for the other ORM people to have input, that'd be a good next step.

in reply to:  12 comment:17 by Jiyda Mint Moussa, 8 years ago

Replying to Josh Smeaton:

@mrmachine that's not a bad argument to make. What we could support is:

Sum('field', default=0)

Which would then be returned from convert_value in the case where the value is None. The argument against that is we'd be special-casing the Sum aggregate, while leaving other aggregates to use the Coalesce method. That's not such a big downside though.

Does anyone else want to weigh in on this? Supporting a default=0 kwarg is a 4 line + docs change.

+1 for me .

comment:18 by Simon Charette, 8 years ago

Cc: Simon Charette added

May I suggest the argument be named coalesce and be added to all aggregates (through the django.db.models.aggregate.Aggregate class)?

comment:19 by Aymeric Augustin, 7 years ago

I'm also facing this issue. The workaround with Coalesce works, but...

from django.db.models import Sum
from django.db.models.functions import Coalesce

really?

Last edited 7 years ago by Aymeric Augustin (previous) (diff)

comment:20 by Simon Charette, 7 years ago

Note that the new aggregate functions in django.contrib.postgres silently handle this.

I still believe adding a coalesce sugar argument to aggregate functions would be more appropriate than a default one as it would allow passing expressions instead of simple Python value.

Sum('amount', coalesce=F('minimum_amount'))

Which is more readable than

Coalesce(Sum('amount), F('minimum_amount'))
Version 2, edited 7 years ago by Simon Charette (previous) (next) (diff)

comment:21 by Hans Aarne Liblik, 4 years ago

Cc: Hans Aarne Liblik added

+1 for

default

kwarg

comment:22 by Michael Yartsev, 4 years ago

Another +1

comment:23 by Björn W, 4 years ago

Aaargh, cmon, 11 years? To add a perfectly sane default of 0 for Sum? I'm amazed this hasn't been fixed for so long :) What was wrong with supporting default=0 at least? (yes I had the same perfectly normal reaction of getting annoyed by seeing NULL in my API output instead of 0 for a Sum of Payments for a User).

comment:24 by Mariusz Felisiak, 4 years ago

Björn, such comments doesn't help. Patch is welcome.

comment:25 by Barton Ip, 3 years ago

Owner: set to Barton Ip
Status: newassigned
Last edited 3 years ago by Mariusz Felisiak (previous) (diff)

comment:26 by Mariusz Felisiak, 3 years ago

Patch needs improvement: set

comment:27 by Nick Pope, 3 years ago

Needs documentation: unset
Owner: changed from Barton Ip to Nick Pope
Patch needs improvement: unset

I have produced an implementation using Coalesce for all aggregates other than Count.

See the new PR.

comment:28 by Mariusz Felisiak, 3 years ago

Patch needs improvement: set

comment:29 by Nick Pope, 3 years ago

Patch needs improvement: unset

comment:30 by Mariusz Felisiak, 3 years ago

Patch needs improvement: set

Some tests don't work on MySQL and Oracle.

in reply to:  30 comment:31 by Nick Pope, 3 years ago

Patch needs improvement: unset

Replying to Mariusz Felisiak:

Some tests don't work on MySQL and Oracle.

The tests are now all passing. \o/

comment:32 by Mariusz Felisiak, 3 years ago

Triage Stage: AcceptedReady for checkin

comment:33 by Mariusz Felisiak <felisiak.mariusz@…>, 3 years ago

Resolution: fixed
Status: assignedclosed

In 501a8db4:

Fixed #10929 -- Added default argument to aggregates.

Thanks to Simon Charette and Adam Johnson for the reviews.

comment:34 by Mariusz Felisiak <felisiak.mariusz@…>, 3 years ago

In fee87345:

Refs #10929 -- Deprecated forced empty result value for PostgreSQL aggregates.

This deprecates forcing a return value for ArrayAgg, JSONBAgg, and
StringAgg when there are no rows in the query. Now that we have a
default argument for aggregates, we want to revert to returning the
default of None which most aggregate functions return and leave it
up to the user to decide what they want to be returned by default.

comment:35 by GitHub <noreply@…>, 3 years ago

In 022d29c9:

Refs #10929 -- Allowed NowUTC SQL customization for third-party backends.

comment:36 by Mariusz Felisiak <felisiak.mariusz@…>, 17 months ago

In 0db8bf3d:

Refs #10929 -- Fixed aggregates crash when passing strings as defaults.

Previously strings were interpreted as F() expressions and default
crashed with AttributeError:

'F' object has no attribute 'empty_result_set_value'

comment:37 by Mariusz Felisiak <felisiak.mariusz@…>, 16 months ago

In 0be8095:

Refs #10929 -- Stopped forcing empty result value by PostgreSQL aggregates.

Per deprecation timeline.

comment:38 by GitHub <noreply@…>, 14 months ago

In b05dfc28:

Refs #34381, Refs #10929 -- Fixed postgres_tests.test_aggregates.TestGeneralAggretate.test_empty_result_set() on PostgreSQL 14+.

Follow up to 0be8095b254fad65b2480d677ebe6098c41bbad6.

Note: See TracTickets for help on using tickets.
Back to Top