Opened 20 months ago

Closed 20 months ago

Last modified 16 months ago

#34413 closed New feature (duplicate)

Variant of Prefetch but for the earliest/latest related object — at Version 6

Reported by: Willem Van Onsem Owned by: nobody
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords:
Cc: Triage Stage: Unreviewed
Has patch: yes Needs documentation: no
Needs tests: yes Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by Willem Van Onsem)

A frequently asked feature that seems to be missing is fetch the earlest/latest related object for each item. Indeed, we can for example work with a subquery to fetch *the primary key* of the earliest/latest related object, but not that object itself.

It turns out however that automating this is not that complicated. What we need to do is (automatically) construct a reverse filter, so if we want the latest Comment of each Post, we make a queryset that makes a Comment.objects.filter(post_id=OuterRef('pk')), then we convert that into a subquery that will, for each Post fetch the primary key of the latest comment, and slightly alter the logic that is already used to for prefetching to then prefetch all the Comments for these primary keys, and add attributes to the corresponding Posts.

I made a small proof-of-concept that should normally work for (most) cases: it allows one to use an arbitrary queryset and specify an ordering (or use the ordering already in the queryset or by the model). It can also span over multiple layers. It requires to *always* specify a to_attr, since, especially when spanning over multiple relations, the default to_attr would require double underscores.

The most ugly part is that we work with an annotation that is then accessible by the user, so it is not somehow hidden. Technically we could remove the attribute, or do something extra in the ORM to prevent exposing this attribute.

I did not yet check what querysets will be problematic. For example a sliced queryset would (very likely) not work, and likely most/all restrictions already in place for a the queryset of a Prefetch object are still applicable.

A simple demonstration of the PrefetchEarliest could be:

User.objects.prefetch_related(PrefetchEarliest('groups', to_attr='first_group', Group.objects.order_by('name'))  		

Change History (7)

comment:1 by Willem Van Onsem, 20 months ago

Needs tests: set

comment:2 by Willem Van Onsem, 20 months ago

Description: modified (diff)

comment:3 by Simon Charette, 20 months ago

Resolution: duplicate
Status: assignedclosed

Pretty sure this is a duplicate of #26780 fixed in the soon to be released 4.2.

Since 242499f2dc2bf24a9a5c855690a2e13d3303581a Prefetch(queryset) supports slicing through the use of filtering over partitioned rank (RANK() OVER (PARTITION BY <join_field> ORDER BY <order_by>) so earliest and latest related objects (or any slice of related objects for that matter) can be prefetched.

User.objects.prefetch_related(
    Prefetch("groups", queryset=Group.objects.order_by('name')[:1], to_attr="first_group")
)

comment:4 by Willem Van Onsem, 20 months ago

Description: modified (diff)

in reply to:  3 comment:5 by Willem Van Onsem, 20 months ago

Replying to Simon Charette

What would happen if we span over multiple relations, like:

User.objects.prefetch_related(
    Prefetch("groups__permissions", queryset=Permission.objects.order_by('codename')[:1], to_attr="first_permission")
)

comment:6 by Willem Van Onsem, 20 months ago

Description: modified (diff)
Note: See TracTickets for help on using tickets.
Back to Top