select_related() results in poor performance
|Reported by:||Ivan Virabyan||Owned by:||nobody|
|Component:||Database layer (models, ORM)||Version:||master|
|Severity:||Normal||Keywords:||select_related, get_cached_row, performance|
|Cc:||Triage Stage:||Ready for checkin|
|Has patch:||yes||Needs documentation:||no|
|Needs tests:||no||Patch needs improvement:||no|
Consider this code:
    from time import time

    # Two separate queries: comments, then users
    s = time()
    list(Comment.objects.all()[:500])
    list(User.objects.all()[:500])
    print 'separate queries', time() - s

    # One joined query via select_related
    s = time()
    list(Comment.objects.select_related('author')[:500])
    print 'select_related', time() - s
The result is surprising:
    separate queries 0.126932859421
    select_related 0.276528120041
As you can see, using select_related makes this more than twice as slow. The difference is not query time: the query itself takes only a few milliseconds. So I dug into the implementation of get_cached_row and found that everything is recalculated there for each row, even though most of that information could be calculated only once, outside the loop.
So I've written a patch. With it applied, the select_related version of the query performs nearly the same as the separate queries.
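To illustrate the general idea of moving row-independent work out of the per-row loop, here is a minimal sketch. It is not the actual patch and does not use Django internals: `build_plan`, `rows_to_dicts_slow`, `rows_to_dicts_fast`, and the `FIELDS` tuple are all hypothetical names invented for this example.

```python
# Hypothetical illustration only; none of these names are Django internals.
FIELDS = ("id", "text", "author_id")


def build_plan(fields):
    """Row-independent setup: map each field name to its column index."""
    return {name: index for index, name in enumerate(fields)}


def rows_to_dicts_slow(rows, fields):
    """Rebuilds the plan for every row, like the per-row recalculation
    described in this ticket."""
    result = []
    for row in rows:
        plan = build_plan(fields)  # recomputed on each iteration
        result.append({name: row[i] for name, i in plan.items()})
    return result


def rows_to_dicts_fast(rows, fields):
    """Builds the plan once, outside the loop, and reuses it for each row."""
    plan = build_plan(fields)
    return [{name: row[i] for name, i in plan.items()} for row in rows]


# Fake result rows standing in for what a database cursor would return.
rows = [(n, "comment %d" % n, n % 7) for n in range(500)]

# Both versions produce the same output; only the amount of work differs.
assert rows_to_dicts_slow(rows, FIELDS) == rows_to_dicts_fast(rows, FIELDS)
```

Both functions return identical results; the second simply avoids repeating setup that does not depend on the row, which is the kind of saving the patch aims for.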