Reuse QuerySet._result_cache during IN clause
|Reported by:||mattrobenolt||Owned by:||mattrobenolt|
|Component:||Database layer (models, ORM)||Version:||1.4|
|Severity:||Normal||Keywords:||orm, queryset, result_cache, optimization, subquery|
|Cc:||apollo13, charette.s@…||Triage Stage:||Design decision needed|
|Has patch:||yes||Needs documentation:||no|
|Needs tests:||no||Patch needs improvement:||no|
If I have a QuerySet that has already been evaluated and shove it into an __in clause, a subquery is still generated.
- Reuse a list of primary keys instead, to prevent the database from performing the subquery.
    cities = City.objects.all()
    list(cities)  # Force evaluation of cities to populate cache
    print cities._result_cache is not None  # Verify that the cache is filled
    Event.objects.filter(venue__city__in=cities)  # This should use just the pks instead of a subquery
I began digging through the ORM and found the source of the problem: the original QuerySet object is clone()'d, which causes its _result_cache to be None on the copy.
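The effect can be reproduced with a toy model that does not depend on Django. ToyQuerySet below is a hypothetical stand-in, not Django's actual class; it only assumes that cloning creates a fresh object whose _result_cache starts out as None, as described above.

```python
class ToyQuerySet(object):
    """Hypothetical stand-in for a lazily evaluated queryset (not Django code)."""

    def __init__(self, rows):
        self._rows = rows            # pretend this is the database table
        self._result_cache = None    # filled on first evaluation
        self.queries_run = 0         # counts simulated database hits

    def __iter__(self):
        # Evaluate once, then serve every later iteration from the cache.
        if self._result_cache is None:
            self.queries_run += 1
            self._result_cache = list(self._rows)
        return iter(self._result_cache)

    def clone(self):
        # The clone keeps the query definition but starts with an empty cache.
        return ToyQuerySet(self._rows)


cities = ToyQuerySet([1, 2, 3])
list(cities)             # force evaluation, which populates the cache
cloned = cities.clone()  # assumed to be what happens inside the __in lookup

print(cities._result_cache is not None)  # the original keeps its cache
print(cloned._result_cache is None)      # the clone has lost it
```

Because the lookup only ever sees the clone in this model, it has no way of knowing that the results were already fetched.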
The simple solution for now is to not pass a QuerySet into an IN clause if we already know the set of ids. This can be done with something like [m.pk for m in qs], or just list(qs).
For the latter solution, I think the documentation should be updated to reflect this distinction to avoid unexpected results and performance degradation.
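The workaround above could be wrapped in a small helper. This is only a sketch: in_clause_value is a hypothetical name, it relies on the private _result_cache attribute mentioned in this ticket, and _Row and _FakeQS are stand-ins so the example runs without a database.

```python
def in_clause_value(qs):
    """Return a list of pks if qs was already evaluated, else qs unchanged.

    Hypothetical helper; reads the private _result_cache attribute.
    """
    cache = getattr(qs, "_result_cache", None)
    if cache is None:
        # Not evaluated (or not a queryset at all): leave it alone.
        return qs
    return [obj.pk for obj in cache]


class _Row(object):
    """Stand-in for a model instance; only the pk attribute matters here."""
    def __init__(self, pk):
        self.pk = pk


class _FakeQS(object):
    """Stand-in for a QuerySet; only carries a _result_cache attribute."""
    def __init__(self, cache):
        self._result_cache = cache


evaluated = _FakeQS([_Row(1), _Row(2)])
unevaluated = _FakeQS(None)

print(in_clause_value(evaluated))                   # list of primary keys
print(in_clause_value(unevaluated) is unevaluated)  # passed through untouched
```

With real models the call would look like Event.objects.filter(venue__city__in=in_clause_value(cities)), so an unevaluated QuerySet still produces the subquery while an evaluated one is sent as a plain list of ids.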
Change History (7)
comment:1 Changed 3 years ago by mattrobenolt
- Has patch set
- Needs documentation unset
- Needs tests unset
- Owner changed from nobody to mattrobenolt
- Patch needs improvement unset
- Triage Stage changed from Unreviewed to Fixed on a branch
comment:2 Changed 3 years ago by apollo13
- Triage Stage changed from Fixed on a branch to Design decision needed
comment:4 Changed 3 years ago by akaariai
- Resolution set to wontfix
- Status changed from new to closed
comment:6 Changed 3 years ago by mattrobenolt
- Resolution wontfix deleted
- Status changed from closed to reopened