Opened 14 years ago

Closed 11 years ago

Last modified 9 years ago

#12886 closed Bug (fixed)

.aggregate() does not honour sliced queryset

Reported by: til@… Owned by: Anssi Kääriäinen <akaariai@…>
Component: Database layer (models, ORM) Version: 1.4
Severity: Normal Keywords: aggregate
Cc: 3point2, Filip Gruszczyński Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

If I aggregate a sliced queryset, the result is an aggregate across the whole unsliced queryset:

In [97]: qs = Encounter.objects.all()

In [98]: qs.aggregate(**aggregations)
Out[98]: {'total_no_of_adults': 19, 'total_no_of_children': 8, 'total_no_of_dogs': 2}

In [99]: qs[0:5].aggregate(**aggregations)
Out[99]: {'total_no_of_adults': 19, 'total_no_of_children': 8, 'total_no_of_dogs': 2}

It works though if .annotate() is called first, even if no annotations are given:

In [100]: qs[0:5].annotate().aggregate(**aggregations)
Out[100]: {'total_no_of_adults': 8, 'total_no_of_children': 8, 'total_no_of_dogs': 2}

The latter are the correct figures for my example model.

Attachments (1)

12886.patch (3.2 KB ) - added by Filip Gruszczyński 13 years ago.
partial patch with tests showing the problem

Download all attachments as: .zip

Change History (18)

comment:1 by Tilman Koschnick, 14 years ago

Component: Database layer (models, ORM)ORM aggregation
Owner: nobody removed

comment:2 by Russell Keith-Magee, 14 years ago

Triage Stage: UnreviewedAccepted

comment:3 by 3point2, 14 years ago

Cc: 3point2 added

I've also just discovered this issue (using the Sum aggregate) and did some poking around: get_aggregate in source:django/trunk/django/db/models/sql/query.py calls query.clear_limits(). With this line commented out, a query like this is issued:

SELECT SUM("accounts_invoice"."grand_total") AS "grand_total__sum" FROM "accounts_invoice" LIMIT 5

which seems correct. unfortunately the DB (in my case sqlite) ignores the limit for the sum aggregate anyway.

For this to work right, the sql issued will have to look something like

SELECT SUM("grand_total") AS "grand_total__sum" FROM (SELECT * FROM "accounts_invoice" LIMIT 5)

in other words, grab the SQL from the query as if it wasn't an aggregate, and then nest it inside the aggregation function.

comment:4 by Filip Gruszczyński, 13 years ago

I have managed to patch this, but encountered larger problem, that I have problems with. I have added following test to modeltests/aggregation/tests.py:

    def test_join_annotate_aggregate(self):
        vals = Book.objects.all().annotate().aggregate(Max("authors__age"))
        self.assertEqual(len(vals), 1)
        self.assertEqual(vals["authors__age__max"], 57)

and it causes following exception:

======================================================================
ERROR: test_join_annotate_aggregate (modeltests.aggregation.tests.BaseAggregateTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/gruszczy/Programy/django/tests/modeltests/aggregation/tests.py", line 21, in test_join_annotate_aggregate
    vals = Book.objects.all().annotate().aggregate(Max("authors__age"))
  File "/home/gruszczy/Programy/django/django/db/models/query.py", line 312, in aggregate
    return query.get_aggregation(using=self.db)
  File "/home/gruszczy/Programy/django/django/db/models/sql/query.py", line 366, in get_aggregation
    result = query.get_compiler(using).execute_sql(SINGLE)
  File "/home/gruszczy/Programy/django/django/db/models/sql/compiler.py", line 729, in execute_sql
    cursor.execute(sql, params)
  File "/home/gruszczy/Programy/django/django/db/backends/sqlite3/base.py", line 200, in execute
    return Database.Cursor.execute(self, query, params)
DatabaseError: no such column: T5.age

Generated SQL looks like this:

SELECT "aggregation_book"."id" AS "id", "aggregation_book"."isbn" AS "isbn", "aggregation_book"."name" AS "name", "aggregation_book"."pages" AS
"pages", "aggregation_book"."rating" AS "rating", "aggregation_book"."price" AS "price", "aggregation_book"."contact_id" AS "contact_id",
"aggregation_book"."publisher_id" AS "publisher_id", "aggregation_book"."pubdate" AS "pubdate" FROM "aggregation_book" LEFT OUTER JOIN
"aggregation_book_authors" ON ("aggregation_book"."id" = "aggregation_book_authors"."book_id") LEFT OUTER JOIN "aggregation_author" T5 ON
("aggregation_book_authors"."author_id" = T5."id") GROUP BY "aggregation_book"."id", "aggregation_book"."isbn", "aggregation_book"."name",
"aggregation_book"."pages", "aggregation_book"."rating", "aggregation_book"."price", "aggregation_book"."contact_id",
"aggregation_book"."publisher_id", "aggregation_book"."pubdate"

I am posting my partial patch with tests, so maybe someone could direct me, where I should go from here and how can I solve the encountered bug.

by Filip Gruszczyński, 13 years ago

Attachment: 12886.patch added

partial patch with tests showing the problem

comment:5 by Filip Gruszczyński, 13 years ago

Cc: Filip Gruszczyński added

comment:6 by Luke Plant, 13 years ago

Type: Bug

comment:7 by Luke Plant, 13 years ago

Severity: Normal

comment:8 by Aymeric Augustin, 12 years ago

UI/UX: unset

Change UI/UX from NULL to False.

comment:9 by Aymeric Augustin, 12 years ago

Easy pickings: unset

Change Easy pickings from NULL to False.

comment:10 by davidpope, 11 years ago

Version: 1.2-beta1.4

This seems like a fairly significant limitation. As far as I can see, I cannot have the database perform aggregate functions on queries with LIMIT clauses, e.g. "show me the max date for the top 100 results matching this WHERE clause". This caveat should at least be called out in the documentation, since it otherwise implies that slices and aggregates should work without issue.

comment:11 by Anssi Kääriäinen, 11 years ago

Component: ORM aggregationDatabase layer (models, ORM)

comment:12 by fhahn, 11 years ago

Keywords: aggregate added

comment:13 by Anssi Kääriäinen <akaariai@…>, 11 years ago

Owner: set to Anssi Kääriäinen <akaariai@…>
Resolution: fixed
Status: newclosed

In 7737305a4f5dc5006d92dac3a61523ad6c2a523a:

Fixed #12886 -- aggregation over sliced queryset

comment:14 by devin.barry@…, 10 years ago

This bug still exists. I am using Django 1.6 on Python 2.6 and both Sum and Avg use the entire query set instead of the sliced query set. Threw in an annotate() as mentioned and this fixed it all up. Someone should open this up again, it has NOT been fixed!

comment:15 by Anssi Kääriäinen, 10 years ago

The fix is only in Django 1.7+.

comment:16 by Reinout van Rees, 9 years ago

See #23836: the fix doesn't work when aggregating on a related object.

comment:17 by Reinout van Rees, 9 years ago

Note: the fix here *does* fix #15101 (django.contrib.gis' "Extent" aggregate had the same issue).

Note: See TracTickets for help on using tickets.
Back to Top