﻿id	summary	reporter	owner	description	type	status	component	version	severity	resolution	keywords	cc	stage	has_patch	needs_docs	needs_tests	needs_better_patch	easy	ui_ux
32996	Cache PathInfos on ForeignObject/ForeignObjectRel (et al)	Keryn Knight	Keryn Knight	"In most usages of `get_path_info` (across it's various implementations) the `filtered_relation` parameter isn't used. The places it does potentially get given a filtered relation look to be (cursory sweep):
- `django.db.models.sql.query.Query.names_to_path`
- `django.db.models.fields.related.ManyToManyField._get_path_info`
Whenever `get_path_info()` is used (that is, without giving any arguments) it's essentially a fixed result based on a few attributes of the parent object (the `ForeignObject` or whatever), which are permanently held across the lifetime of the process AFAIK (that is, eg: `id(User.groups.rel)` is stable)

So calling `get_path_info` everywhere when there's no filtered relation being used is expensive:
{{{
In [2]: %timeit User.groups.rel.get_path_info()
2.27 µs ± 37.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each
}}}

By defining additionally (and the opposite for places where `get_reverse_path_info` is defined):
{{{
@cached_property
    def path_infos(self):
        return self.get_path_info()
}}}
we can amortize that cost over the life of the process such that:
{{{
In [3]: %timeit User.groups.rel.path_infos
164 ns ± 2.16 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each
}}}

Thus, given 101 Users, each with 1 user permission, and 1 group (also with single permission) as outlined here:
{{{
perms = tuple(Permission.objects.all())
for x in range(n):
    y = get_random_string(10)
    user = User.objects.create(username=y, email=f'{y}@{y}', password=y)
    random_group_perm = random.choice(perms)
    random_user_perm = random.choice(perms)
    user.user_permissions.add(random_user_perm)
    group = Group.objects.create(name=y)
    group.permissions.add(random_group_perm)
    user.groups.add(group)
}}}
we can execute the following before changing over to using a cached property:
{{{
In [2]: %prun -stime -lpath_info for _ in range(100): tuple(User.objects.prefetch_related('groups', 'user_permissions', 'groups__permissions'))
ncalls  tottime  percall  cumtime  percall filename:lineno(function)
 31500    0.033    0.000    0.048    0.000 related.py:728(get_path_info)
   600    0.002    0.000    0.009    0.000 related.py:1545(_get_path_info)
   600    0.002    0.000    0.003    0.000 related.py:920(get_reverse_path_info)
   600    0.001    0.000    0.009    0.000 related.py:1574(get_reverse_path_info)
   600    0.000    0.000    0.010    0.000 reverse_related.py:197(get_path_info)
}}}
and then after:
{{{
Ordered by: internal time
ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     3    0.000    0.000    0.000    0.000 related.py:1559(_get_path_info)
     3    0.000    0.000    0.000    0.000 related.py:934(get_reverse_path_info)
     4    0.000    0.000    0.000    0.000 related.py:734(get_path_info)
     3    0.000    0.000    0.000    0.000 related.py:1592(get_reverse_path_info)
     3    0.000    0.000    0.000    0.000 reverse_related.py:207(get_path_info)
     4    0.000    0.000    0.000    0.000 related.py:748(path_infos)
     3    0.000    0.000    0.000    0.000 related.py:766(reverse_path_infos)
     3    0.000    0.000    0.000    0.000 reverse_related.py:210(path_infos)
}}}
Note that those `ncalls` aren't a ''true'' reflection, because after the cached property replaces itself subsequent attribute accesses won't be tracked, hence the `timeit` information above.

Across the test suite, beforehand:
{{{
> python -m cProfile runtests.py -v0 | grep ""path_info""
...
   211111    0.227    0.000    0.335    0.000 related.py:728(get_path_info)
     5349    0.021    0.000    0.073    0.000 related.py:1545(_get_path_info)
     6804    0.019    0.000    0.027    0.000 related.py:920(get_reverse_path_info)
     3469    0.006    0.000    0.013    0.000 wsgi.py:152(get_path_info)                Unrelated
     5568    0.005    0.000    0.068    0.000 reverse_related.py:197(get_path_info)
     3969    0.004    0.000    0.058    0.000 related.py:1574(get_reverse_path_info)
     1380    0.001    0.000    0.021    0.000 related.py:1571(get_path_info)
      124    0.000    0.000    0.001    0.000 fields.py:399(get_path_info)
      113    0.000    0.000    0.000    0.000 related.py:742(get_reverse_path_info)
       27    0.000    0.000    0.000    0.000 fields.py:416(get_reverse_path_info)
        6    0.000    0.000    0.000    0.000 fields.py:359(_get_path_info_with_parent)
       15    0.000    0.000    0.000    0.000 empty_join.py:54(get_path_info)
       10    0.000    0.000    0.000    0.000 empty_join.py:67(get_reverse_path_info)
        1    0.000    0.000    0.002    0.002 tests.py:23(test_bad_path_info            Unrelated
        1    0.000    0.000    0.000    0.000 tests.py:196(test_environ_path_info_type) Unrelated
        1    0.000    0.000    0.000    0.000 tests.py:164(test_wsgirequest_path_info)  Unrelated
        2    0.000    0.000    0.000    0.000 request.py:148(get_full_path_info)        Unrelated
}}}
and after switching to a cached property:
{{{
> python -m cProfile runtests.py -v0 | grep ""path_info""
...
     3469    0.006    0.000    0.013    0.000 wsgi.py:152(get_path_info)                Unrelated
      873    0.003    0.000    0.004    0.000 related.py:734(get_path_info)
      396    0.002    0.000    0.003    0.000 related.py:934(get_reverse_path_info)
      203    0.001    0.000    0.006    0.000 related.py:1559(_get_path_info)
      843    0.001    0.000    0.004    0.000 related.py:748(path_infos)
      342    0.001    0.000    0.006    0.000 reverse_related.py:207(get_path_info)
      256    0.000    0.000    0.006    0.000 reverse_related.py:210(path_infos)
      196    0.000    0.000    0.002    0.000 related.py:766(reverse_path_infos)
      130    0.000    0.000    0.004    0.000 related.py:1592(get_reverse_path_info)
       73    0.000    0.000    0.002    0.000 related.py:1585(get_path_info)
       17    0.000    0.000    0.000    0.000 fields.py:399(get_path_info)
       67    0.000    0.000    0.002    0.000 related.py:1588(path_infos)
        7    0.000    0.000    0.000    0.000 related.py:752(get_reverse_path_info)
       15    0.000    0.000    0.000    0.000 fields.py:416(path_infos)
        4    0.000    0.000    0.000    0.000 fields.py:420(get_reverse_path_info)
        1    0.000    0.000    0.001    0.001 tests.py:23(test_bad_path_info)           Unrelated
        2    0.000    0.000    0.000    0.000 fields.py:359(_get_path_info_with_parent)
        1    0.000    0.000    0.000    0.000 tests.py:196(test_environ_path_info_type) Unrelated
        2    0.000    0.000    0.000    0.000 empty_join.py:54(get_path_info)
        1    0.000    0.000    0.000    0.000 tests.py:164(test_wsgirequest_path_info)  Unrelated
        1    0.000    0.000    0.000    0.000 empty_join.py:67(get_reverse_path_info)
        2    0.000    0.000    0.000    0.000 request.py:148(get_full_path_info)        Unrelated
}}}

As you can see, even within the test suite there are a substantial number of calls to effectively get the same information, across the many and various test models. Obviously if you look at the timing numbers, it's all kind of insubstantial. It's really just work that doesn't need repeatedly doing, I don't think.

I have a branch which I'll tidy up and push to check CI and allow for discussion, should it be accepted.
"	Cleanup/optimization	closed	Database layer (models, ORM)	dev	Normal	fixed			Ready for checkin	1	0	0	0	0	0
