Changes between Initial Version and Version 1 of Ticket #22757, comment 7


Timestamp: Nov 3, 2023, 8:52:54 PM
Author: Simon Charette

Thanks for having a shot at this, Nolan!

I think that a lot of the complexity here stems from the fact that the ORM doesn't support polymorphic querysets, that is, querysets that can return multiple models.

This poses a challenge from an implementation perspective for this issue because the (simplified here) return signature of `get_prefetch_querysets` must contain three members (roughly sketched below):

1. A collection of prefetch querysets
2. A callable that takes a model instance returned from one of the prefetch querysets and returns a tuple uniquely identifying the item (that's `lambda obj: (obj.pk, obj.__class__)`)
3. A callable that takes the model instances that objects are prefetched for and returns a tuple uniquely identifying the related items (that's `gfk_key`)

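Purely as an illustration (this is not the actual implementation; names like `fk_dict`, `content_type_id`, and `object_id` are placeholders), the three members for a `GenericForeignKey` could be sketched as:

{{{#!python
from django.contrib.contenttypes.models import ContentType

# Hypothetical, simplified sketch of the three members for a GenericForeignKey;
# fk_dict maps each content type to the set of pks that must be fetched for it.
def build_prefetch_members(fk_dict):
    querysets = [
        ct.model_class()._base_manager.filter(pk__in=pks)
        for ct, pks in fk_dict.items()
    ]

    def rel_obj_attr(obj):
        # Uniquely identifies a fetched related object.
        return (obj.pk, obj.__class__)

    def gfk_key(instance):
        # Uniquely identifies the related object this instance points at.
        ct = ContentType.objects.db_manager(instance._state.db).get_for_id(
            instance.content_type_id
        )
        return (instance.object_id, ct.model_class())

    return querysets, rel_obj_attr, gfk_key
}}}
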
The `prefetch_related` logic then executes all the querysets from 1., builds a map keyed by 2., and intersects it with 3. to do the proper `_prefetch_related_cache` assignments.

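A minimal sketch of that joining logic, reusing the made-up names from the sketch above (the cache assignment is a simplified stand-in for the real `_prefetch_related_cache` handling):

{{{#!python
# Hypothetical sketch of how prefetch_related combines the three members.
def attach_prefetched(instances, querysets, rel_obj_attr, gfk_key, cache_name):
    rel_obj_cache = {}
    for queryset in querysets:
        for rel_obj in queryset:
            rel_obj_cache.setdefault(rel_obj_attr(rel_obj), []).append(rel_obj)
    for instance in instances:
        vals = rel_obj_cache.get(gfk_key(instance), [])
        # Simplified single-object cache assignment for a GenericForeignKey.
        instance._state.fields_cache[cache_name] = vals[0] if vals else None
}}}
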
Since Django only allows for a single model class to be used to construct objects returned from a queryset (remember the lack of polymorphism support), one might be tempted to annotate objects with a `_content_type_id` attribute (e.g. using `Case(When)` like we do in `bulk_update` based on the id) to have the logic in 2. changed to `lambda obj: (obj.pk, self.get_content_type(id=obj._content_type_id, using=obj._state.db).model_class())` so the proper mapping occurs. Doing so doesn't solve the problem of proper `__class__` assignment, as you've brought up Nolan, nor does it address another edge case that would likely have surfaced during review.

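A sketch of that annotation idea, with made-up helper and parameter names (only meant to show the shape of the `Case(When)` expression):

{{{#!python
from django.db.models import Case, IntegerField, Value, When

# Hypothetical: tag each fetched row with the content type id it was requested
# under, in the spirit of the per-pk Case(When(...)) built by bulk_update().
def annotate_content_type_id(queryset, pks_by_content_type_id):
    cases = [
        When(pk__in=pks, then=Value(ct_id))
        for ct_id, pks in pks_by_content_type_id.items()
    ]
    return queryset.annotate(
        _content_type_id=Case(*cases, output_field=IntegerField())
    )
}}}
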
Even if we're able to return a specialized `QuerySet` that overrides `_iterable_class` to return mixed models ([https://github.com/django-polymorphic/django-polymorphic/blob/5111fb070e4e0b0153e816d163871f2039ca2ae2/polymorphic/query.py#L25 it's the tactic used by packages like django-polymorphic], for example), we'd have to deal with the case where the same id needs to be prefetched both concretely and for its proxy.

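As a rough sketch of that tactic (not django-polymorphic's actual code, and relying on the private `_iterable_class` hook):

{{{#!python
from django.contrib.contenttypes.models import ContentType
from django.db.models.query import ModelIterable

class ContentTypeModelIterable(ModelIterable):
    """Hypothetical iterable that re-classes each row using the annotated
    _content_type_id so a single queryset can yield mixed (proxy) models."""

    def __iter__(self):
        for obj in super().__iter__():
            obj.__class__ = (
                ContentType.objects.db_manager(obj._state.db)
                .get_for_id(obj._content_type_id)
                .model_class()
            )
            yield obj

# queryset._iterable_class = ContentTypeModelIterable  # sketch only
}}}
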
In other words, given the reported scenario, what kind of SQL query would we generate for

{{{
...
}}}

In this case the same `foo_child` row needs to be returned twice, once as a `Child` and once as a `ProxiedChild`, which cannot be achieved with a simple `id IN (1, 1)` as that won't ''span'' two rows.

The only way I can think of dealing with this edge case is by using a `UNION ALL` with a combined annotation so the following SQL would be generated

{{{
...
}}}

And that must then be combined with the `_iterable_class` logic mentioned above to create model instances with the proper model class based on the `_content_type_id` annotation.

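At the queryset level a sketch of that combination could look like the following, using the `Child`/`ProxiedChild` names from the scenario above (hypothetical models and pk lists, not a working patch):

{{{#!python
from django.contrib.contenttypes.models import ContentType
from django.db.models import IntegerField, Value

# Hypothetical: one branch per content type, each tagged with the content type
# it was requested under, combined with UNION ALL so the same row can come
# back twice (once per content type).
def child_prefetch_queryset(child_pks, proxied_pks):
    child_ct = ContentType.objects.get_for_model(Child)
    proxied_ct = ContentType.objects.get_for_model(
        ProxiedChild, for_concrete_model=False
    )
    return Child.objects.filter(pk__in=child_pks).annotate(
        _content_type_id=Value(child_ct.pk, output_field=IntegerField())
    ).union(
        Child.objects.filter(pk__in=proxied_pks).annotate(
            _content_type_id=Value(proxied_ct.pk, output_field=IntegerField())
        ),
        all=True,
    )
}}}
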
There's even more fun now that we've added `GenericPrefetch` in 5.0, as it allows you to specify querysets for each of the models involved. That means it allows passing querysets with `select_related` and annotations that result in `SELECT` statements with differing columns, which cannot use the `UNION ALL` approach described above, so the ''optimization'' would have to be disabled.

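For example (hypothetical `TaggedItem`, `Child`, and `ProxiedChild` models, with the generic relation named `content_object`):

{{{#!python
from django.contrib.contenttypes.prefetch import GenericPrefetch
from django.db.models.functions import Upper

# Each per-model queryset may select different columns (select_related joins,
# annotations, ...), which is what rules out folding them into one UNION ALL.
items = TaggedItem.objects.prefetch_related(
    GenericPrefetch(
        "content_object",
        [
            Child.objects.select_related("parent"),
            ProxiedChild.objects.annotate(upper_name=Upper("name")),
        ],
    )
)
}}}
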
After looking at this in more detail I also agree that the added complexity to avoid this query is likely not worth it, as it would either require some potentially expensive Python workaround or significant boilerplate to get working properly. I suspect we'll want to ''wontfix'' this one unfortunately, at least until someone can come up with a clever way of achieving the above efficiently or the ORM gains polymorphic capabilities, which is a long shot.