id summary reporter owner description type status component version severity resolution keywords cc stage has_patch needs_docs needs_tests needs_better_patch easy ui_ux 20577 Make prefetch_related faster by lazily creating related querysets Anssi Kääriäinen Alex Aktsipetrov "In one project of mine I will need to prefetch the following things for each ""machine"": `computerdata__operatingsystem, identifiers`. computerdata is one-to-one to machine, operatingsystem is manytomany, and identifiers are many-to-one. The data is distributed in a way that any one machine have on average one operating system, and a couple of identifiers. Fetching data results in this profile: {{{ 1 0.000 0.000 6.835 6.835 manage.py:2() 1 0.000 0.000 6.795 6.795 __init__.py:394(execute_from_command_line) 1 0.000 0.000 6.795 6.795 __init__.py:350(execute) 1 0.000 0.000 6.207 6.207 base.py:228(run_from_argv) 1 0.000 0.000 6.199 6.199 base.py:250(execute) 1 0.000 0.000 6.072 6.072 ad_guess.py:9(handle) 10/2 0.016 0.002 6.069 3.034 query.py:853(_fetch_all) 6/1 0.000 0.000 6.043 6.043 query.py:80(__iter__) 1 0.000 0.000 5.837 5.837 query.py:517(_prefetch_related_objects) 1 0.009 0.009 5.837 5.837 query.py:1512(prefetch_related_objects) 3 0.080 0.027 5.819 1.940 query.py:1671(prefetch_one_level) 4640 0.018 0.000 3.917 0.001 manager.py:132(all) 4646 0.014 0.000 3.206 0.001 query.py:587(filter) 4648 0.037 0.000 3.193 0.001 query.py:601(_filter_or_exclude) 4648 0.031 0.000 2.661 0.001 query.py:1188(add_q) 4648 0.053 0.000 2.401 0.001 query.py:1208(_add_q) 4648 0.144 0.000 2.284 0.000 query.py:1010(build_filter) 2320 0.040 0.000 2.076 0.001 related.py:529(get_queryset) 2320 0.063 0.000 1.823 0.001 related.py:404(get_queryset) 14380 0.068 0.000 1.052 0.000 query.py:160(iterator) 1 0.023 0.023 0.993 0.993 related.py:418(get_prefetch_queryset) 9299 0.067 0.000 0.841 0.000 query.py:838(_clone) 4649 0.086 0.000 0.752 0.000 query.py:1323(setup_joins) 9299 0.226 0.000 0.738 0.000 query.py:214(clone) 4644 0.177 0.000 0.668 0.000 related.py:1041(get_lookup_constraint) 1 0.000 0.000 0.577 0.577 __init__.py:256(fetch_command) 14375 0.330 0.000 0.548 0.000 base.py:325(__init__) 127/79 0.007 0.000 0.447 0.006 {__import__} 4645 0.012 0.000 0.443 0.000 query.py:788(using) 14380 0.017 0.000 0.433 0.000 compiler.py:694(results_iter) 5 0.197 0.039 0.197 0.039 {method 'execute' of 'psycopg2._psycopg.cursor' objects} }}} If I am reading this correctly, the actual data fetching costs 0.2 seconds of the total runtime of 6.8 seconds. (In reality the ratio is 0.2 vs 2 seconds due to overhead of profiling not affecting the execute time but having a big effect on other parts). The big ""failure"" in above profile is the creation of related querysets: 4640 0.018 0.000 3.917 0.001 manager.py:132(all) this takes more than half (approx 57%) of the total runtime. Every cycle here is wasted - we don't ever need the related queryset when using the prefetched results. I see two options here: - Allow assigning the results to somewhere else than manager.all() (that is, a plain list you can name). This would naturally get rid of the overhead, but then you will need to alter code to explicitly use the named list when that is available. - Somehow lazily instantiate the .all() queryset. If prefetch is in effect calling relmanager.all() will not create a queryset, it just creates a proxy which when iterated gives the related objects, but works otherwise like the related queryset (*not* like manager). I prefer option 2 as this doesn't require any new APIs or changes to user code to take advantage of this feature. However creating a proxy object that works like the related queryset except for iteration, and which doesn't involve actually creating that queryset will likely be an interesting technical challenge (for example it would be nice to ensure isinstance(obj.operating_systems.all(), QuerySet) == True. Solving this challenge will likely speed up some prefetch_related queries by 50% or so." Cleanup/optimization closed Database layer (models, ORM) dev Normal fixed karyon Ready for checkin 1 0 0 0 0 0