Opened 4 years ago
Closed 4 years ago
#31666 closed Cleanup/optimization (wontfix)
Combine prefetch_related queries for FK's to the same target model
Reported by: | Adam Johnson | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | dev |
Severity: | Normal | Keywords: | |
Cc: | Triage Stage: | Unreviewed | |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
If you have a source model with two or more FKs to the same target model, prefetch_related() for those FKs could do a single query rather than several. For rows where different FKs point at the same target, this would also reduce transferred data and avoid duplicate objects in memory.
For example:
```python
from django.db import models

class Author(models.Model):
    pass

class Book(models.Model):
    author1 = models.ForeignKey(Author, on_delete=models.DO_NOTHING, related_name='+')
    author2 = models.ForeignKey(Author, on_delete=models.DO_NOTHING, related_name='+')
```
If I create some authors and fetch everything, the same author is represented by two objects because they were fetched in different queries:
```
In [1]: from example.core.models import Author, Book
In [2]: a1 = Author.objects.create()
In [3]: a2 = Author.objects.create()
In [4]: a3 = Author.objects.create()
In [5]: b1 = Book.objects.create(author1=a1, author2=a2)
In [6]: b2 = Book.objects.create(author1=a2, author2=a3)
In [7]: bs = Book.objects.prefetch_related('author1', 'author2')
In [8]: bs[0].author1
Out[8]: <Author: Author object (1)>
In [9]: bs[0].author2
Out[9]: <Author: Author object (2)>
In [10]: bs[1].author1
Out[10]: <Author: Author object (2)>
In [11]: bs[0].author2 is bs[1].author1
Out[11]: False
```
Change History (3)
comment:2 by , 4 years ago
To my mind the duplicate objects in memory is kinda a feature. Specifically it seems desirable to me that a prefetch related clause only changes performance not behaviour...
Agreed, distinct in-memory objects should still be created, as merging them would also be slightly backward incompatible. The ORM doesn't maintain an identity map of objects (unlike, e.g., SQLAlchemy's session), so changing that solely for this feature would be confusing.
Performing a single query could be doable using a form of

    Author.objects.filter(Q(books__author1__in=author1_set) | Q(books__author2__in=author2_set))

which should naturally result in duplicated Author rows being returned because of the product caused by the two LEFT JOINs to book.
comment:3 by , 4 years ago
Resolution: | → wontfix |
---|---|
Status: | new → closed |
I agree with previous comments. This optimization is backward incompatible, quite niche, and can cause a lot of headaches.
To my mind the duplicate objects in memory is kinda a feature.
Specifically it seems desirable to me that a prefetch related clause only changes performance not behaviour. Particularly it seems concerning that adding a prefetch_related to this (very contrived) example would change its behaviour in the case where author1 and author2 happen to be the same author.