Opened 11 years ago

Closed 11 years ago

Last modified 10 years ago

#2479 closed enhancement (duplicate)

[patch] add sum() method to query set ( SELECT SUM( fieldname .... )

Reported by: Simon Greenhill, dev@… Owned by: Adrian Holovaty
Component: Database layer (models, ORM) Version: master
Severity: minor Keywords:
Cc: dev@… Triage Stage: Unreviewed
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: UI/UX:


Added a sum() method to enable SELECT SUM( ) queries e.g.:

a = Table.objects.all().sum( 'fieldname' )

implemented in the same way as the count() method.

Note: tested on MySQL and SQLite backends.

Attachments (4)

query_sum.diff (1017 bytes) - added by dev@… 11 years ago. (1.1 KB) - added by dev@… 11 years ago.
unit tests (1.1 KB) - added by dev@… 11 years ago.
fixed modeltests
aggregate.diff (984 bytes) - added by dev@… 11 years ago.
quick'n'dirty aggregate function

Download all attachments as: .zip

Change History (11)

Changed 11 years ago by dev@…

Attachment: query_sum.diff added

comment:1 Changed 11 years ago by dev@…


1) I left the output as-is. MySQLdb appears to return a float and SQLite an int. Not sure whether it would be better to type cast to int.

2) This makes no attempt to ensure that SUM is run on a sensible fieldtype - both MySQL/SQLite will allow a SUM on pretty much anything, but just return 0 if it doesn't make sense ( char/varchar/blob ).

3) first time writing unittests - please review brutally :)

Changed 11 years ago by dev@…

Attachment: added

unit tests

comment:2 Changed 11 years ago by dev@…

Just thinking about this - rather than proliferate methods for each database function ( .count, .sum, .avg, .abs, .log, .sqrt etc etc ), it'd be better to have a method that can call each function, e.g.:

	f = Fudge.objects.all().function( function_name, fieldname )

	# e.g.:
	f = Fudge.objects.all().function( 'sqrt', 'some_field' )

Might have to store a map of database types to function names to hide any (can't think of any offhand - but I'm sure there are some) naming inconsistencies between the databases and make sure that the correct function gets called e.g.

map = {
	'sqlite3' = {
		'sum':'SUM(%s)', 'average':'AVG(%s)', 
	'mysql' = {
		'sum':'SUM(%s)', 'average':'AVG(%s)', 
	'postgres' = {
		'sum':'SUM(%s)', 'average':'AVG(%s)', 

Changed 11 years ago by dev@…

Attachment: added

fixed modeltests

comment:3 Changed 11 years ago by dev@…

Bit more playing - the original modeltests will fail if the database backend returns floats not ints. The second modeltest file typecasts to int, but this is probably bad practice, and depends on whether sum() should to the type managing or not.

comment:4 Changed 11 years ago by Malcolm Tredinnick

Before you go too far here: firstly, having various aggregate functions is indeed something we want to incorporate. The API may need a bit of thought, though (for example, should it only work like count(), or should some kind of sub-filtering be possible?).

Secondly, is on the operating table at the moment being refactored, so this will need to wait a few more days at least and then be rebased against the current code (although, admittedly, the change isn't too intrusive, so that shouldn't be too hard). Thinking about what it will look like if we also add avg(), max(), ..., etc, though, it may be that one method of this length per aggregate starts to get crazy. We need to think about that.

Thirdly, stop with the casting to int! If a field contains floats, your version of sum() is going to get the answer wrong.

Finally, I would prefer we do some kind of sanity checking on the target field. The users of models are not required to have deep understanding of SQL and databases (in theory), so we should provide sensible error messages when we can. And sum(name) is rarely going to be the right thing.

comment:5 Changed 11 years ago by dev@…

Ok - I'll let you guys give a thorough workover, and have a think about the best way to implement more of these functions.

Re: filtering - .filter() still works, so you can still get a SELECT SUM( something ) FROM table WHERE x=a, y=b etc. If you mean something like SUM( DISTINCT fieldname ), that doesn't work - but that'll only work on MySQL >5 anyway.

Certainly things like COUNT() are so common that they should have their own method, but I agree that having one method per database function is bad DRY. I've quickly had a go at something like this (diff attached) which adds .aggregate to the methods e.g. Something.objects.all().aggregate( 'function_name', 'field_name' ). You could also match the filter syntax, e.g. fieldname_ _function -> function(fieldname) if preferred.

As for sanity checks - you could definitely check for int/double ( & their Django Model equivalents ), and raise something like a FieldIsWrongType exception if it's one of the char/varchar or datetime fields. However, do you want to stop these being executed on non-numeric fields right across the board? or is this going to be something to specify like SUM works on Integer/Float/Decimal/etc Fields, whilst OTHERFUNCTION only works on, say, IntegerFields.

( Aside - are there any database aggregate functions that are useful on non-numeric fields? )

re: type casting - point taken! :)

Changed 11 years ago by dev@…

Attachment: aggregate.diff added

quick'n'dirty aggregate function

comment:6 Changed 11 years ago by dev@…

Resolution: duplicate
Status: newclosed

Bah, duplicate of #1435 which has a better patch.

comment:7 Changed 10 years ago by Henrik Vendelbo <info@…>

Someone suggested in the doc comments calling the method aggregate rather than function.

Also you might consider having a parameter specifying what to return if the queryset is empty. I believe SQL Server returns NULL rather than 0. And I suppose it would be nice to control that case anyway.

Note: See TracTickets for help on using tickets.
Back to Top