Opened 3 years ago

Last modified 23 months ago

#19415 new Cleanup/optimization

Clarify how aggregates work with multi-valued relationships and multiple filter() calls

Reported by: svniemeijer@… Owned by:
Component: Documentation Version: master
Severity: Normal Keywords:
Cc: Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

This issue seems related to #16603, but has a slightly different scope, so I am creating a new ticket for it.

Consider the following model:

class Author(models.Model):
    name = models.CharField(max_length=100)

class Book(models.Model):
    author = models.ForeignKey(Author, related_name='books')
    title = models.CharField(max_length=100)
    published = models.DateField()
    pages = models.IntegerField()

then the query:

    qs = Author.objects.filter(books__published__gte=date(2012, 11, 1))
    qs = qs.filter(books__published__lte=date(2012, 11, 30))
    qs = qs.values('name').annotate(num_pages=Sum('books__pages'))
    print qs.query

will result in the following SQL:

SELECT "app_author"."name", SUM("app_book"."pages") AS "num_pages" FROM "app_author" LEFT OUTER JOIN "app_book" ON ("app_author"."id" = "app_book"."author_id") INNER JOIN "app_book" T3 ON ("app_author"."id" = T3."author_id") WHERE ("app_book"."published" >= 2012-11-01  AND T3."published" <= 2012-11-30 ) GROUP BY "app_author"."name"

The problem is that in this case the upper date filter is not taken into account when calculating the 'num_pages' aggregate.

I would have expected to see the following SQL:

SELECT "app_author"."name", SUM("app_book"."pages") AS "num_pages" FROM "app_author" LEFT OUTER JOIN "app_book" ON ("app_author"."id" = "app_book"."author_id") WHERE ("app_book"."published" >= 2012-11-01  AND "app_book"."published" <= 2012-11-30 ) GROUP BY "app_author"."name"

Note that a query with just a single filter seems to work correctly:

    qs = Author.objects.filter(books__published__gte=date(2012, 11, 1))
    qs = qs.values('name').annotate(num_pages=Sum('books__pages'))
    print qs.query

which gives:

SELECT "app_author"."name", SUM("app_book"."pages") AS "num_pages" FROM "app_author" LEFT OUTER JOIN "app_book" ON ("app_author"."id" = "app_book"."author_id") WHERE "app_book"."published" >= 2012-11-01  GROUP BY "app_author"."name"

Change History (7)

comment:1 Changed 3 years ago by russellm

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Resolution set to invalid
  • Status changed from new to closed

As far as I can make out, this is working as expected (for suitably confusing values of expected :-).
The catch is that:

Author.objects.filter(books__published__gte=date(2012, 11, 1)).filter(books__published__lte=date(2012, 11, 30))

and

Author.objects.filter(books__published__gte=date(2012, 11, 1), books__published__lte=date(2012, 11, 30))

are not interchangeable. By using a single filter clause, you're telling Django to use a single join table. Splitting the filter means 2 joins are used (hence the T3 table in the "wrong" query). The involvement of aggregates here is a red herring; the underlying issue is the filters.

comment:2 Changed 3 years ago by svniemeijer@…

Thanks! Combining everything into a single .filter() does indeed solve the problem.

However, the aggregate part is not entirely a red herring and some clarification in the documentation on aggregation could be useful to prevent others making this mistake.

In the documentation topics/db/queries/#spanning-multi-valued-relationships it is already clarified how subsequent filter() calls are treated. However, it may be good to have a similar section in topics/db/aggregation/#filtering-on-annotations that explains how aggregates on multi-valued relationships behave in case multiple filter() statements are used.

There is also no example that shows how to filter using multiple conditions in topics/db/aggregation/#filtering-on-annotations . Adding such an example (using a single .filter() call) would have probably already guided me in the right direction.

comment:3 Changed 3 years ago by russellm

  • Component changed from ORM aggregation to Documentation
  • Easy pickings set
  • Resolution invalid deleted
  • Status changed from closed to reopened
  • Triage Stage changed from Unreviewed to Accepted
  • Type changed from Bug to Cleanup/optimization

That's a fair point. Reopening and accepting as a documentation fix.

comment:4 Changed 2 years ago by aaugustin

  • Status changed from reopened to new

comment:5 Changed 2 years ago by susan

  • Owner changed from nobody to susan
  • Status changed from new to assigned

comment:6 Changed 2 years ago by anonymous

  • Owner susan deleted
  • Status changed from assigned to new

comment:7 Changed 23 months ago by timo

  • Easy pickings unset
  • Summary changed from Invalid result when using multiple reverse foreign-key filters together with reverse foreign-key aggregate to Clarify how aggregates work with multi-valued relationships and multiple filter() calls
  • Version changed from 1.5-beta-1 to master
Note: See TracTickets for help on using tickets.
Back to Top