#36764 closed Bug (duplicate)
QuerySet.only() causes n+1 queries with reverse foreign key relation
| Reported by: | Bernhard Vallant | Owned by: | |
|---|---|---|---|
| Component: | Database layer (models, ORM) | Version: | dev |
| Severity: | Normal | Keywords: | queryset, only, defered |
| Cc: | Triage Stage: | Unreviewed | |
| Has patch: | no | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description (last modified by )
To reproduce I tried adding the following test to tests.defer.tests.DeferTests:
def test_only_reverse_fk(self): qs = self.s1.primary_set.only("name") with self.assertNumQueries(1): for primary in qs: primary.name
This fails because it produces an additional query for every Primary instance to just fetch related_id:
test_only_reverse_fk (defer.tests.DeferTests.test_only_reverse_fk) failed:
AssertionError('3 != 1 : 3 queries executed, 1 expected
Captured queries were:
1. SELECT "defer_primary"."id", "defer_primary"."name" FROM "defer_primary" WHERE "defer_primary"."related_id" = 1
2. SELECT "defer_primary"."id", "defer_primary"."related_id" FROM "defer_primary" WHERE "defer_primary"."id" = 1 LIMIT 21
3. SELECT "defer_primary"."id", "defer_primary"."related_id" FROM "defer_primary" WHERE "defer_primary"."id" = 2 LIMIT 21
')
When replacing the queryset with qs = Primary.objects.filter(related_id=self.s1.pk).only("name") basically the same SQL is produced, but no additional queries.
I think the additional queries are quite an unexpected behaviour and it is unnecessary to retrieve related_id for every row. Somehow this is also a dangerous behaviour as it can cause n + 1 queries when actually trying to optimize your query.
If this behaviour should be the expected one it should at least be documented and tested somewhere (I don't think there are tests for only() when using relations like this somewhere).
Change History (7)
comment:2 by , 3 weeks ago
| Description: | modified (diff) |
|---|
comment:3 by , 3 weeks ago
Product.objects.all().delete() Category.objects.all().delete() category = Category.objects.create(name="Test") Product.objects.create(name="P1", price=10, category=category) Product.objects.create(name="P2", price=20, category=category) # Test qs = category.product_set.only("name") print(f"_known_related_objects: {qs._known_related_objects}") reset_queries() for p in qs: p.name # only accessing 'name', but triggers extra queries print(f"\nQueries executed: {len(connection.queries)}") for q in connection.queries: print(f" {q['sql']}")
output:
_known_related_objects: {<ForeignKey: category>: {10011: <Category: Test>}} Queries executed: 3 SELECT "product"."id", "product"."name" FROM "product" WHERE "product"."related_id" = 10011 SELECT "product"."id", "product"."related_id" FROM "product" WHERE "product"."id" = 100020 LIMIT 21 SELECT "product"."id", "product"."related_id" FROM "product" WHERE "product"."id" = 100021 LIMIT 21
this problem seems related to _known_related_objects in RelatedManager._apply_rel_filters(), which causes Django to access the deferred related_id on each instance.
comment:4 by , 3 weeks ago
I could also tracked it down to the id field being in _known_related_objects. For me the question is if the proper solution in this case would be to automatically append the field to the "only" fields (if it is necessary for the relations to properly work) or if it can be omitted in a way from _known_related_objects.
comment:5 by , 3 weeks ago
For me the question is if the proper solution in this case would be to automatically append the field to the "only" fields (if it is necessary for the relations to properly work) or if it can be omitted in a way from _known_related_objects.
I can't find it right now but there's another ticket that disuss doing and why that might be problematic. I'll keep searching.
comment:6 by , 3 weeks ago
comment:7 by , 3 weeks ago
| Resolution: | → duplicate |
|---|---|
| Status: | new → closed |
Thanks for the report! FWIW, this works:
and so does
.values("name").