Opened 13 years ago

Closed 10 years ago

Last modified 8 years ago

#12886 closed Bug (fixed)

.aggregate() does not honour sliced queryset

Reported by: til@… Owned by: Anssi Kääriäinen <akaariai@…>
Component: Database layer (models, ORM) Version: 1.4
Severity: Normal Keywords: aggregate
Cc: 3point2, Filip Gruszczyński Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

If I aggregate a sliced queryset, the result is an aggregate across the whole unsliced queryset:

In [97]: qs = Encounter.objects.all()

In [98]: qs.aggregate(**aggregations)
Out[98]: {'total_no_of_adults': 19, 'total_no_of_children': 8, 'total_no_of_dogs': 2}

In [99]: qs[0:5].aggregate(**aggregations)
Out[99]: {'total_no_of_adults': 19, 'total_no_of_children': 8, 'total_no_of_dogs': 2}

It works though if .annotate() is called first, even if no annotations are given:

In [100]: qs[0:5].annotate().aggregate(**aggregations)
Out[100]: {'total_no_of_adults': 8, 'total_no_of_children': 8, 'total_no_of_dogs': 2}

The latter are the correct figures for my example model.

Attachments (1)

12886.patch (3.2 KB) - added by Filip Gruszczyński 12 years ago.
partial patch with tests showing the problem

Download all attachments as: .zip

Change History (18)

comment:1 Changed 13 years ago by Tilman Koschnick

Component: Database layer (models, ORM)ORM aggregation
Owner: nobody deleted

comment:2 Changed 13 years ago by Russell Keith-Magee

Triage Stage: UnreviewedAccepted

comment:3 Changed 13 years ago by 3point2

Cc: 3point2 added

I've also just discovered this issue (using the Sum aggregate) and did some poking around: get_aggregate in source:django/trunk/django/db/models/sql/query.py calls query.clear_limits(). With this line commented out, a query like this is issued:

SELECT SUM("accounts_invoice"."grand_total") AS "grand_total__sum" FROM "accounts_invoice" LIMIT 5

which seems correct. unfortunately the DB (in my case sqlite) ignores the limit for the sum aggregate anyway.

For this to work right, the sql issued will have to look something like

SELECT SUM("grand_total") AS "grand_total__sum" FROM (SELECT * FROM "accounts_invoice" LIMIT 5)

in other words, grab the SQL from the query as if it wasn't an aggregate, and then nest it inside the aggregation function.

comment:4 Changed 12 years ago by Filip Gruszczyński

I have managed to patch this, but encountered larger problem, that I have problems with. I have added following test to modeltests/aggregation/tests.py:

    def test_join_annotate_aggregate(self):
        vals = Book.objects.all().annotate().aggregate(Max("authors__age"))
        self.assertEqual(len(vals), 1)
        self.assertEqual(vals["authors__age__max"], 57)

and it causes following exception:

======================================================================
ERROR: test_join_annotate_aggregate (modeltests.aggregation.tests.BaseAggregateTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/gruszczy/Programy/django/tests/modeltests/aggregation/tests.py", line 21, in test_join_annotate_aggregate
    vals = Book.objects.all().annotate().aggregate(Max("authors__age"))
  File "/home/gruszczy/Programy/django/django/db/models/query.py", line 312, in aggregate
    return query.get_aggregation(using=self.db)
  File "/home/gruszczy/Programy/django/django/db/models/sql/query.py", line 366, in get_aggregation
    result = query.get_compiler(using).execute_sql(SINGLE)
  File "/home/gruszczy/Programy/django/django/db/models/sql/compiler.py", line 729, in execute_sql
    cursor.execute(sql, params)
  File "/home/gruszczy/Programy/django/django/db/backends/sqlite3/base.py", line 200, in execute
    return Database.Cursor.execute(self, query, params)
DatabaseError: no such column: T5.age

Generated SQL looks like this:

SELECT "aggregation_book"."id" AS "id", "aggregation_book"."isbn" AS "isbn", "aggregation_book"."name" AS "name", "aggregation_book"."pages" AS
"pages", "aggregation_book"."rating" AS "rating", "aggregation_book"."price" AS "price", "aggregation_book"."contact_id" AS "contact_id",
"aggregation_book"."publisher_id" AS "publisher_id", "aggregation_book"."pubdate" AS "pubdate" FROM "aggregation_book" LEFT OUTER JOIN
"aggregation_book_authors" ON ("aggregation_book"."id" = "aggregation_book_authors"."book_id") LEFT OUTER JOIN "aggregation_author" T5 ON
("aggregation_book_authors"."author_id" = T5."id") GROUP BY "aggregation_book"."id", "aggregation_book"."isbn", "aggregation_book"."name",
"aggregation_book"."pages", "aggregation_book"."rating", "aggregation_book"."price", "aggregation_book"."contact_id",
"aggregation_book"."publisher_id", "aggregation_book"."pubdate"

I am posting my partial patch with tests, so maybe someone could direct me, where I should go from here and how can I solve the encountered bug.

Changed 12 years ago by Filip Gruszczyński

Attachment: 12886.patch added

partial patch with tests showing the problem

comment:5 Changed 12 years ago by Filip Gruszczyński

Cc: Filip Gruszczyński added

comment:6 Changed 12 years ago by Luke Plant

Type: Bug

comment:7 Changed 12 years ago by Luke Plant

Severity: Normal

comment:8 Changed 11 years ago by Aymeric Augustin

UI/UX: unset

Change UI/UX from NULL to False.

comment:9 Changed 11 years ago by Aymeric Augustin

Easy pickings: unset

Change Easy pickings from NULL to False.

comment:10 Changed 10 years ago by davidpope

Version: 1.2-beta1.4

This seems like a fairly significant limitation. As far as I can see, I cannot have the database perform aggregate functions on queries with LIMIT clauses, e.g. "show me the max date for the top 100 results matching this WHERE clause". This caveat should at least be called out in the documentation, since it otherwise implies that slices and aggregates should work without issue.

comment:11 Changed 10 years ago by Anssi Kääriäinen

Component: ORM aggregationDatabase layer (models, ORM)

comment:12 Changed 10 years ago by fhahn

Keywords: aggregate added

comment:13 Changed 10 years ago by Anssi Kääriäinen <akaariai@…>

Owner: set to Anssi Kääriäinen <akaariai@…>
Resolution: fixed
Status: newclosed

In 7737305a4f5dc5006d92dac3a61523ad6c2a523a:

Fixed #12886 -- aggregation over sliced queryset

comment:14 Changed 9 years ago by devin.barry@…

This bug still exists. I am using Django 1.6 on Python 2.6 and both Sum and Avg use the entire query set instead of the sliced query set. Threw in an annotate() as mentioned and this fixed it all up. Someone should open this up again, it has NOT been fixed!

comment:15 Changed 9 years ago by Anssi Kääriäinen

The fix is only in Django 1.7+.

comment:16 Changed 8 years ago by Reinout van Rees

See #23836: the fix doesn't work when aggregating on a related object.

comment:17 Changed 8 years ago by Reinout van Rees

Note: the fix here *does* fix #15101 (django.contrib.gis' "Extent" aggregate had the same issue).

Note: See TracTickets for help on using tickets.
Back to Top