Opened 10 years ago
Closed 9 years ago
#19264 closed Cleanup/optimization (duplicate)
prefetch_one_level is slow
Reported by: | Simon Percivall | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 1.4 |
Severity: | Normal | Keywords: | |
Cc: | percivall@… | Triage Stage: | Accepted |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
prefetch_one_level
in django.db.models.query.py
does a lot of I think unnecessary extra work.
For every obj in instances a QuerySet
is created for the related many to many by doing getattr(obj, attname).all()
. This QuerySet
however is never really used, instead its _prefetched_objects_cache
is filled with the prefetched items.
qs = getattr(obj, attname).all() qs._result_cache = vals obj._prefetched_objects_cache[cache_name] = qs
If instead you'd simply create a QuerySet
you save having to call clone
for every obj, which is very expensive.
qs = QuerySet(getattr(obj, attname).model) qs._result_cache = vals obj._prefetched_objects_cache[cache_name] = qs
For a function like prefetch_related
whose main purpose is to speed up a certain case of use, speed seems like a good quality. Maybe I've misunderstood an edge case here, but if not, this is an easy optimization that gives quite the speedup.
All modeltests on 1.4 pass with this change, by the way.
Change History (7)
comment:1 Changed 10 years ago by
Cc: | percivall@… added |
---|
comment:2 Changed 10 years ago by
comment:3 Changed 10 years ago by
Very true. Too bad. However, I still think it'd give a speedup in the very common case that all
and get_query_set
aren't overridden, won't it? Though I guess this change is less attractive once it gets more tangled with the implementation of Managers.
comment:5 Changed 10 years ago by
Triage Stage: | Unreviewed → Design decision needed |
---|
comment:6 Changed 9 years ago by
Triage Stage: | Design decision needed → Accepted |
---|
Accepting based on Luke's comment.
comment:7 Changed 9 years ago by
Resolution: | → duplicate |
---|---|
Status: | new → closed |
It seems #20577 is a duplicate. I think that ticket's patch has better approach.
The problem is that the
QuerySet
must still behave the same way as the originalQuerySet
, apart from having its_result_cache
already filled. For example, if the user doessome_obj.related_objects.all().filter(bar=foo)
or evensome_obj.related_objects.filter(bar=foo)
etc, then I think your code above will fail - it won't have the right base filters applied, and it might have a differentorder_by
etc.A test for those cases would be a good addition in any case. There is certainly the possibility for some speedups for the common case.