Opened 14 years ago
Last modified 10 years ago
#16603 new Bug
Unnecessary join when using a reverse foreign-key filter and reverse foreign-key aggregate call
Reported by: | Ben Davis | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | dev |
Severity: | Normal | Keywords: | |
Cc: | anssi.kaariainen@… | Triage Stage: | Accepted |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Pull Requests: | How to create a pull request | ||
Description ¶
(This started as #16554, but it was only half-valid, so I'm re-submitting the valid part here)
Django allows you to perform queries across reverse foreign key relationships. If, however, you need to access that same relationship in a filter and an aggregate call, the ORM creates unnecessary joins.
Example: ¶
app/models.py
class Author(models.Model): first_name = models.CharField(max_length=30) last_name = models.CharField(max_length=30) email = models.EmailField() class Book(models.Model): author = models.ForeignKey(Author, related_name='books') title = models.CharField(max_length=100) genre = models.CharField(max_length=20, choices=( ('SCIFI', 'SciFi'), ('FANTASY', 'Fantasy'), ('NONFICTION', 'NonFiction') )) published = models.DateField() pages = models.IntegerField()
The django query:
qs = Author.objects.annotate(pages_written=Sum('books__pages')) qs = qs.filter(books__genre='SCIFI') print qs.query
Generated SQL:
SELECT "app_author"."id", "app_author"."first_name", "app_author"."last_name", "app_author"."email", SUM("app_book"."pages") AS "pages_written" FROM "app_author" LEFT OUTER JOIN "app_book" ON ("app_author"."id" = "app_book"."author_id") INNER JOIN "app_book" T3 ON ("app_author"."id" = T3."author_id") WHERE T3."genre" = 'SCIFI' GROUP BY "app_author"."id", "app_author"."first_name", "app_author"."last_name", "app_author"."email"
According to the ticket's flags, the next step(s) to move this issue forward are:
- To provide a patch by sending a pull request. Claim the ticket when you start working so that someone else doesn't duplicate effort. Before sending a pull request, review your work against the patch review checklist. Check the "Has patch" flag on the ticket after sending a pull request and include a link to the pull request in the ticket comment when making that update. The usual format is:
[https://github.com/django/django/pull/#### PR]
.
Change History (6)
by , 14 years ago
Attachment: | 16603_tests.patch added |
---|
comment:1 by , 14 years ago
Component: | Database layer (models, ORM) → ORM aggregation |
---|---|
Triage Stage: | Unreviewed → Accepted |
comment:2 by , 14 years ago
Cc: | added |
---|
comment:3 by , 12 years ago
Component: | ORM aggregation → Database layer (models, ORM) |
---|
comment:4 by , 10 years ago
Updated test to apply cleanly and verified it's still an issue as of 98bcdfa8bd902addd4b8cf37d039b3597d58a45c.
by , 10 years ago
Attachment: | 16603-test.diff added |
---|
Confirmed. This is somewhat related to #10060, but not a duplicate of that ticket. I am a bit surprised if there is not a duplicate of this ticket, but I could not find any. Maybe I am just too tired...
The problem reported is not easy to fix. The main issue is that reverse (and m2m) filtering should be done using subqueries and WHERE EXISTS, WHERE pk IN or joining the subquery. As implemented now the joins generated will return duplicate rows, and that causes the problem. That is, the above query should be written as
or
or
I haven't tested the queries above, and as said I am a little tired... But the idea should at least be correct :)