#26780 closed New feature (fixed)
Prefetch objects don't work with slices
Reported by: | Ludwik Trammer | Owned by: | Simon Charette |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | dev |
Severity: | Normal | Keywords: | prefetch, slicing |
Cc: | TZanke, Dan LaManna, chex, Pavel Lysak, Arthur, şuayip üzülmez, Simon Charette | Triage Stage: | Ready for checkin |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
Prefetch() objects does not work with sliced querysets. For example the following code results in AssertionError: Cannot filter a query once a slice has been taken.
:
Category.objects.prefetch_related(Prefetch( 'post_set', queryset=Post.objects.all()[:3], to_attr='example_posts', ))
This behavior is also mentioned in this StackOverflow answer. On the other hand it does not seem to be documented in Django Docs.
Why is it needed?
My use case seems to be a common one: I want to display a list of categories while displaying couple of example objects from each category next to it. If I'm not mistaken there isn't currently an efficient way of doing this. Prefetching without slicing would prefetch all objects (and there may be thousands of them) instead of the three examples that are needed.
Change History (22)
comment:1 by , 8 years ago
Summary: | Prefetch objects doesn't work with slices → Prefetch objects don't work with slices |
---|
comment:2 by , 8 years ago
comment:3 by , 8 years ago
Hi ludwik,
#26565 is about a different issue.
The only way we could support slices would be to extract it from the provided it queryset to make it filterable again and apply it when in-memory joining. The prefect queryset would still have to fetch all results so I'm not sure it's worth the trouble as accessing category.example_posts[:3]
would have the same effect.
comment:4 by , 8 years ago
Charettes,
Thank you for the reply.
accessing category.example_posts[:3] would have the same effect
I'm pretty sure that is not correct. The end effect would obviously be the same, but the way it is achieved behind the scenes and the performance characteristics that follows would be widely different. What you propose (prefetching without limit, adding slicing in a loop after the fact) would make Django perform a database query selecting all objects from the database table and loading them into memory. This would happen when the main queryset is evaluated (that's what prefetching is all about). Then the slicing would be performed by Python in memory, on a queryset that was already evaluated.
That's what I understood from the documentation and also how Django actually behaved in an experiment I performed couple of minutes ago. What I want to avoid is exactly this behavior - loading thousands of objects from the database to display first three of them.
I would be happy with a sane workaround, but to my knowledge there isn't any.
comment:5 by , 8 years ago
Component: | Uncategorized → Database layer (models, ORM) |
---|---|
Triage Stage: | Unreviewed → Accepted |
Type: | Bug → New feature |
Version: | 1.9 → master |
Tentatively accepting but as you've noticed it would require a large refactor of the way prefetching is actually done.
I could see Category.objects.prefetch_related(Prefetch('articles', Article.object.order_by('-published_data')[0:3], to_attr='latest_articles'))
being useful but the underlying prefetch query would need to rely on subqueries, a feature the ORM is not good at for now.
Maybe the addition of the Subquery expression could help here.
comment:6 by , 8 years ago
The only way to limit this Query (and to not load all the records into memory) would be to do a top-n-per-group
query which I don't think the ORM
is capable of.
What I can suggest is: Instead of limiting the QuerySet
, find a way to filter it.
For example, get only the Posts
in the last day, week or a month (based on the post-frequency).
Category.objects.prefetch_related(Prefetch( 'post_set', queryset=Post.objects.filter( date_published__gte=datetime.date.today()-timedelta(days=7)), to_attr='example_posts', ))
This way you won't load all the Posts into Memory.
comment:7 by , 8 years ago
If window queries are implemented (https://code.djangoproject.com/ticket/26608) then this use case could be possible in something similar to:
Prefetch( 'post_set', queryset= Post.objects .annotate(_rank=Window(Rank(), partition_by='cateogry') .filter(_rank__lte=3) )
comment:8 by , 7 years ago
I tried the filter by rank suggestion using Django 2.0 but I get this error:
django.db.utils.NotSupportedError: Window is disallowed in the filter clause.
comment:9 by , 7 years ago
I think there is a workaround now to in django new version as we have OuterRef and Subquery.
from django.db.models import OuterRef, Subquery
User.objects.all().prefetch_related('comments',queryset=Comment.objects.filter(id__in=Subquery(Comment.objects.filter(user_id=OuterRef('user_id')).values_list('id', flat=True)[:5])))
comment:10 by , 5 years ago
Cc: | added |
---|
comment:11 by , 5 years ago
Cc: | added |
---|
comment:12 by , 5 years ago
Cc: | added |
---|
comment:13 by , 5 years ago
Cc: | added |
---|
comment:14 by , 3 years ago
Cc: | added |
---|
comment:15 by , 3 years ago
Note that the solution above might not be portable to some MySql databases, with error
1235, "This version of MySQL doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery'"
comment:16 by , 2 years ago
Cc: | added |
---|
comment:17 by , 2 years ago
As pointed out by others support for filter against window functions would allow prefetch_related_objects
to use Rank(partition_by)
to support this feature.
If someone wants to give a shot at solving this particular issue before #28333 is resolved it should be doable by using a combination of Queryset.raw
and Query.compile
combinations.
comment:18 by , 2 years ago
Cc: | added |
---|
comment:19 by , 2 years ago
Has patch: | set |
---|---|
Owner: | changed from | to
Status: | new → assigned |
comment:20 by , 2 years ago
Triage Stage: | Accepted → Ready for checkin |
---|
It seems to me that #26565 ("Allow Prefetch query to use .values()") is related.