#26780 closed New feature (fixed)
Prefetch objects don't work with slices
Reported by: | Ludwik Trammer | Owned by: | Simon Charette |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | dev |
Severity: | Normal | Keywords: | prefetch, slicing |
Cc: | TZanke, Dan LaManna, chex, Pavel Lysak, Arthur, şuayip üzülmez, Simon Charette | Triage Stage: | Ready for checkin |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
Prefetch() objects does not work with sliced querysets. For example the following code results in AssertionError: Cannot filter a query once a slice has been taken.
:
Category.objects.prefetch_related(Prefetch( 'post_set', queryset=Post.objects.all()[:3], to_attr='example_posts', ))
This behavior is also mentioned in this StackOverflow answer. On the other hand it does not seem to be documented in Django Docs.
Why is it needed?
My use case seems to be a common one: I want to display a list of categories while displaying couple of example objects from each category next to it. If I'm not mistaken there isn't currently an efficient way of doing this. Prefetching without slicing would prefetch all objects (and there may be thousands of them) instead of the three examples that are needed.
Change History (22)
comment:1 Changed 7 years ago by
Summary: | Prefetch objects doesn't work with slices → Prefetch objects don't work with slices |
---|
comment:2 Changed 7 years ago by
comment:3 Changed 7 years ago by
Hi ludwik,
#26565 is about a different issue.
The only way we could support slices would be to extract it from the provided queryset to make it filterable again and apply it when in-memory joining. The prefect queryset would still have to fetch all results so I'm not sure it's worth the trouble as accessing category.example_posts[:3]
would have the same effect.
comment:4 Changed 7 years ago by
Charettes,
Thank you for the reply.
accessing category.example_posts[:3] would have the same effect
I'm pretty sure that is not correct. The end effect would obviously be the same, but the way it is achieved behind the scenes and the performance characteristics that follows would be widely different. What you propose (prefetching without limit, adding slicing in a loop after the fact) would make Django perform a database query selecting all objects from the database table and loading them into memory. This would happen when the main queryset is evaluated (that's what prefetching is all about). Then the slicing would be performed by Python in memory, on a queryset that was already evaluated.
That's what I understood from the documentation and also how Django actually behaved in an experiment I performed couple of minutes ago. What I want to avoid is exactly this behavior - loading thousands of objects from the database to display first three of them.
I would be happy with a sane workaround, but to my knowledge there isn't any.
comment:5 Changed 7 years ago by
Component: | Uncategorized → Database layer (models, ORM) |
---|---|
Triage Stage: | Unreviewed → Accepted |
Type: | Bug → New feature |
Version: | 1.9 → master |
Tentatively accepting but as you've noticed it would require a large refactor of the way prefetching is actually done.
I could see Category.objects.prefetch_related(Prefetch('articles', Article.object.order_by('-published_data')[0:3], to_attr='latest_articles'))
being useful but the underlying prefetch query would need to rely on subqueries, a feature the ORM is not good at for now.
Maybe the addition of the Subquery expression could help here.
comment:6 Changed 7 years ago by
The only way to limit this Query (and to not load all the records into memory) would be to do a top-n-per-group
query which I don't think the ORM
is capable of.
What I can suggest is: Instead of limiting the QuerySet
, find a way to filter it.
For example, get only the Posts
in the last day, week or a month (based on the post-frequency).
Category.objects.prefetch_related(Prefetch( 'post_set', queryset=Post.objects.filter( date_published__gte=datetime.date.today()-timedelta(days=7)), to_attr='example_posts', ))
This way you won't load all the Posts into Memory.
comment:7 Changed 7 years ago by
If window queries are implemented (https://code.djangoproject.com/ticket/26608) then this use case could be possible in something similar to:
Prefetch( 'post_set', queryset= Post.objects .annotate(_rank=Window(Rank(), partition_by='cateogry') .filter(_rank__lte=3) )
comment:8 Changed 5 years ago by
I tried the filter by rank suggestion using Django 2.0 but I get this error:
django.db.utils.NotSupportedError: Window is disallowed in the filter clause.
comment:9 Changed 5 years ago by
I think there is a workaround now to in django new version as we have OuterRef and Subquery.
from django.db.models import OuterRef, Subquery
User.objects.all().prefetch_related('comments',queryset=Comment.objects.filter(id__in=Subquery(Comment.objects.filter(user_id=OuterRef('user_id')).values_list('id', flat=True)[:5])))
comment:10 Changed 4 years ago by
Cc: | TZanke added |
---|
comment:11 Changed 4 years ago by
Cc: | Dan LaManna added |
---|
comment:12 Changed 4 years ago by
Cc: | chex added |
---|
comment:13 Changed 4 years ago by
Cc: | Pavel Lysak added |
---|
comment:14 Changed 18 months ago by
Cc: | Arthur added |
---|
comment:15 Changed 18 months ago by
Note that the solution above might not be portable to some MySql databases, with error
1235, "This version of MySQL doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery'"
comment:16 Changed 11 months ago by
Cc: | şuayip üzülmez added |
---|
comment:17 Changed 10 months ago by
As pointed out by others support for filter against window functions would allow prefetch_related_objects
to use Rank(partition_by)
to support this feature.
If someone wants to give a shot at solving this particular issue before #28333 is resolved it should be doable by using a combination of Queryset.raw
and Query.compile
combinations.
comment:18 Changed 10 months ago by
Cc: | Simon Charette added |
---|
comment:19 Changed 10 months ago by
Has patch: | set |
---|---|
Owner: | changed from nobody to Simon Charette |
Status: | new → assigned |
comment:20 Changed 10 months ago by
Triage Stage: | Accepted → Ready for checkin |
---|
It seems to me that #26565 ("Allow Prefetch query to use .values()") is related.