Opened 3 days ago
Last modified 16 hours ago
#36282 assigned Bug
Prefetched relations are not used from parent classes in all cases
Description ¶
When using prefetch_related
to prefetch relations, Django makes some attempt to use prefetched data present in parent models, but it is very limited. It is only implemented for ForeignKey
and only works for the immediate parent.
Assuming this model hierarchy:
class Related(models.Model): pass class GrandParent(models.Model): name = models.CharField(max_length=50) gp_fk = models.ForeignKey(Related, null=True, on_delete=models.CASCADE, related_name='gp_fk_rel') gp_m2m = models.ManyToManyField(Related, related_name='gp_m2m_rel') class Parent(GrandParent): pass class Child(Parent): pass
I have written these test cases:
class PrefetchRelatedWorksWithInheritance(TestCase): @classmethod def setUpTestData(cls): cls.related1 = Related.objects.create() cls.related2 = Related.objects.create() cls.related3 = Related.objects.create() cls.child = Child.objects.create( gp_fk=cls.related1, ) cls.m2m_child = Child.objects.create() cls.m2m_child.gp_m2m.set([cls.related1, cls.related2, cls.related3]) def test_parent_fk_available_in_child(self): qs = GrandParent.objects.select_related('parent').prefetch_related( 'gp_fk' ).filter(pk=self.child.pk) with self.assertNumQueries(2): results = list(qs) self.assertEqual(len(results), 1) # Works, Parent can look into its GrandParent and find the prefetched data self.assertEqual(results[0].parent.gp_fk, self.related1) def test_grandparent_fk_available_in_child(self): qs = GrandParent.objects.select_related('parent', 'parent__child').prefetch_related( 'gp_fk' ).filter(pk=self.child.pk) with self.assertNumQueries(2): results = list(qs) self.assertEqual(len(results), 1) # Causes extra query, Child only looks in Parent, not in GrandParent self.assertEqual(results[0].parent.child.gp_fk, self.related1) def test_parent_m2m_available_in_child(self): qs = GrandParent.objects.select_related('parent').prefetch_related( 'gp_m2m' ).filter(pk=self.m2m_child.pk) with self.assertNumQueries(2): results = list(qs) self.assertEqual(len(results), 1) # Causes extra query, M2M never looks in its parents self.assertEqual(set(results[0].parent.gp_m2m.all()), {self.related1, self.related2, self.related3}) def test_grandparent_m2m_available_in_child(self): qs = GrandParent.objects.select_related('parent', 'parent__child').prefetch_related( 'gp_m2m' ).filter(pk=self.m2m_child.pk) with self.assertNumQueries(2): results = list(qs) self.assertEqual(len(results), 1) # Causes extra query, M2M never looks in its parents self.assertEqual(set(results[0].parent.child.gp_m2m.all()), {self.related1, self.related2, self.related3})
Only the first of the tests passes.
For ForeignKeys the reason is here: https://github.com/django/django/blob/c3a23aa02faa1cf1d32e43d66858e793cd9ecac4/django/db/models/fields/related_descriptors.py#L229-L240
This only looks into the immediate parent, not any grandparents.
For ManyToManyField the culprit is here: https://github.com/django/django/blob/c3a23aa02faa1cf1d32e43d66858e793cd9ecac4/django/db/models/fields/related_descriptors.py#L1094-L1098
It only looks in the instance's _prefetched_objects_cache, when it could walk up to its parents, just like ForwardManyToOneDescriptor does.
I stumbled upon this when implementing support for prefetch_related in Django-Model-Utils' InheritanceManager. It automatically returns subclasses form a custom QuerySet iterable. But those subclass-instances then do not see the prefetched data of their parent, so in my fix I had to manually copy the caches, so that Django would find them.
I would be willing to work on a patch to make this work out of the box in Django.
According to the ticket's flags, the next step(s) to move this issue forward are:
- To provide a patch by sending a pull request. Claim the ticket when you start working so that someone else doesn't duplicate effort. Before sending a pull request, review your work against the patch review checklist. Check the "Has patch" flag on the ticket after sending a pull request and include a link to the pull request in the ticket comment when making that update. The usual format is:
[https://github.com/django/django/pull/#### PR]
.
Change History (7)
comment:1 by , 3 days ago
Summary: | Prefetched relations are not available used from parent classes in all cases → Prefetched relations are not used from parent classes in all cases |
---|
follow-ups: 3 6 comment:2 by , 2 days ago
Component: | Uncategorized → Database layer (models, ORM) |
---|---|
Keywords: | mti prefetch_related added |
Triage Stage: | Unreviewed → Accepted |
Type: | Uncategorized → Bug |
comment:3 by , 23 hours ago
Replying to Simon Charette:
Hi, can I work on it?
Kindly assign it to me!
follow-up: 5 comment:4 by , 23 hours ago
The reporter explicitly stated
I would be willing to work on a patch to make this work out of the box in Django.
So I would give it a few more days before assigning the ticket to you Adya.
comment:5 by , 22 hours ago
Replying to Simon Charette:
Ok, it makes sense.
Could you suggest me any other available ticket, and I can work on it? It will be my pleasure
comment:6 by , 18 hours ago
Replying to Simon Charette:
Thank you for your pointers. I have already started working on this here: https://github.com/django/django/compare/main...diesieben07:django:prefetch_related_inheritance
I will work on polishing it to get it ready for a contribution. One thing to think about is indeed cache invalidation. If you do parent.child.m2m_relation.set(...)
should that invalidate parent.m2m_relation
? Currently it doesn't, but if parent.child.m2m_relation
and parent.m2m_relation
now point to the same cache, because the child one just looks in its parent...?
Probably the clearing also needs to happen recursively and walk up to the parents, but that is a potentially more invasive change.
comment:7 by , 16 hours ago
Owner: | set to |
---|---|
Status: | new → assigned |
Good point about cache invalidation.
All layers of inherited caching should effectively be busted when parent.child.rel
is altered.
Interestingly I ran into the same issue implementing similar logic 5 years ago.
If you're interested in solving this issue I'd have a few pointers
_prefetched_objects_cache
related cache between model instances as it will make cache invalidation a nightmare (the workaround I linked above is flawed in this regard). The logic should be adapted to fully walk the ancestor chain. In other words, the cached value should remain solely on the model hosting the field and subclasses accessors should retrieve them from there.tests/prefetch_related
reuse the existing models. It seems thatAuthorWithAge
is a good candidate for subclassing.Here's a little patch I used to validate your report if it can be handy
TabularUnified django/db/models/fields/related_descriptors.py