Opened 8 years ago
Last modified 6 months ago
#26565 new New feature
Allow Prefetch query to use .values()
Reported by: | Maxime Lorant | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | dev |
Severity: | Normal | Keywords: | prefetch, values |
Cc: | Dan LaManna, Micah Lyle, Thomas Riccardi, Ülgen Sarıkavak | Triage Stage: | Accepted |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
I came across a use case that does not seems to be possible using the Django ORM on Django 1.9.
Let's says I want to get all objects from a Order
model and prefetch for each order all pay lines, grouped per code, to get a list similar to: [{'code': 'A', 'number': 3, 'amount': 1000}, {'code': 'B', 'number': 1, 'amount': 5000}]
if the order has four PayRollLine associated, with 3 where code=A and 1 where code=B.
# Query that will be used inside the `Prefetch` object later # Let's assume quantity and unit_amount can't be null in the query # to avoid extra Coalesce... prefetch_qs = ( PayrollLine.objects.values('code').annotate( number=ExpressionWrapper( Sum(F('quantity')), output_field=DecimalField() ), amount=ExpressionWrapper( Sum(F('unit_amount') * F('quantity')), output_field=DecimalField() ), ) ) print( Order.objects.all() .prefetch_related( Prefetch('pay_lines', queryset=prefetch_qs, to_attr='pay_lines_grouped') ) )
The first query works fine alone, outside the prefetch object. But, when executing the second query, Django raises an exception:
Traceback (most recent call last): [...] File "/data/.virtualenvs/ORFEO/local/lib/python2.7/site-packages/django/db/models/query.py", line 234, in __repr__ data = list(self[:REPR_OUTPUT_SIZE + 1]) File "/data/.virtualenvs/ORFEO/local/lib/python2.7/site-packages/django/db/models/query.py", line 258, in __iter__ self._fetch_all() File "/data/.virtualenvs/ORFEO/local/lib/python2.7/site-packages/django/db/models/query.py", line 1076, in _fetch_all self._prefetch_related_objects() File "/data/.virtualenvs/ORFEO/local/lib/python2.7/site-packages/django/db/models/query.py", line 656, in _prefetch_related_objects prefetch_related_objects(self._result_cache, self._prefetch_related_lookups) File "/data/.virtualenvs/ORFEO/local/lib/python2.7/site-packages/django/db/models/query.py", line 1457, in prefetch_related_objects obj_list, additional_lookups = prefetch_one_level(obj_list, prefetcher, lookup, level) File "/data/.virtualenvs/ORFEO/local/lib/python2.7/site-packages/django/db/models/query.py", line 1556, in prefetch_one_level prefetcher.get_prefetch_queryset(instances, lookup.get_current_queryset(level))) File "/data/.virtualenvs/ORFEO/local/lib/python2.7/site-packages/django/db/models/fields/related_descriptors.py", line 544, in get_prefetch_queryset instance = instances_dict[rel_obj_attr(rel_obj)] File "/data/.virtualenvs/ORFEO/local/lib/python2.7/site-packages/django/db/models/fields/related.py", line 597, in get_local_related_value return self.get_instance_value_for_fields(instance, self.local_related_fields) File "/data/.virtualenvs/ORFEO/local/lib/python2.7/site-packages/django/db/models/fields/related.py", line 605, in get_instance_value_for_fields opts = instance._meta AttributeError: 'dict' object has no attribute '_meta'
Looks like Django can't work with dict in prefetch_related... On IRC, someone point me to [this PR](https://github.com/django/django/pull/6478) that could resolve my problem but I think it should be verified with unit tests.
Change History (19)
comment:1 by , 8 years ago
Summary: | Allow Prefetch query to returns aggregations → Allow Prefetch query to use .values() |
---|---|
Version: | 1.9 → master |
comment:2 by , 8 years ago
Yes, same error & traceback using PayrollLine.objects.values('code')
only.
The error seems obvious: the prefetch mechanism tries to access to attributes for some reason, but it got a dict from the query, not model instances...
comment:3 by , 8 years ago
Triage Stage: | Unreviewed → Accepted |
---|
I'm not sure if this should be considered a New Feature or a Bug.
comment:4 by , 8 years ago
I'd say there is a bug and a new feature. We should error nicely on values queries, and of course, ability to prefetch values queries would be a nice feature for aggregations specifically.
comment:5 by , 8 years ago
Has patch: | set |
---|
Submitted a PR for the bug part. This should be converted to a feature request once it's merged.
comment:7 by , 8 years ago
Type: | Bug → New feature |
---|
comment:8 by , 8 years ago
Has patch: | unset |
---|
comment:9 by , 8 years ago
Just to add, it would be great if not only .values()
would be supported, but also .values_list(flat=True)
.
comment:12 by , 5 years ago
Cc: | added |
---|
comment:13 by , 5 years ago
Cc: | added |
---|
comment:14 by , 5 years ago
I took a stab at this during the DjangoCon 2019 sprints.
I had very limited experience with the prefetching/ORM internals before this, so there could be workarounds/solutions that I'm unaware of, but here's what I found so far.
One thing I tried to tackle is handling the many to many case with just values()
. My goal was to take the '_prefetch_related_val_%s'
in the queryset.extra(...)
call in the get_prefetch_queryset()
ManyRelatedManager
and add/apply it to whatever values()
call was made and then join that by doing pop()
from the dictionary instead of a getattr
call (which would effectively give the developer the same query that they originally made by modifying it under the hood). I ran into some issues doing this though:
"Any
extra()
call made after avalues()
call will have its extra selected fields ignored" (from the Django 2.2 docs)."
This means that the queryset.extra(...)
call in the ManyRelatedManager
(which from my quick search appears to be the only remaining usage of extra(...)
across the entire Django codebase (except the tests)) is effectively ignored, and even if I try to add a .values(*existing_values, '_prefetch_related_val_%s')
call, it returns this error (from my example test/use case using the models from the prefetch tests):
queryset.values('id', 'title', '_prefetch_related_val_author_id') django.core.exceptions.FieldError: Cannot resolve keyword '_prefetch_related_val_author_id' into field. Choices are: _prefetch_related_val_author_id, authors, bio, bookwithyear, first_time_authors, id, read_by, title
Is this a bug? It says it can't resolve a field and then says that field is a choice.
I thought about trying to workaround it by essentially "undoing" the values()
part of the queryset
and then reapplying it at the end with the value from the join table (which then gets pop
ed, but I'm not sure if that's feasible/doable and what that would actually look like. Besides, this entire solution so far feels somewhat hacky in the first place.
I think that whatever change makes this happen may require some sort of specification on the developer's end for how to apply the join from the prefetched dictionaries to the existing list of instances. Maybe even writing parts of a custom prefetcher and specifying that custom prefetcher as a fourth keyword argument (something I explored, but didn't get very far) to the Prefetch
object, and then tweaking the logic in prefetch_related_objects
and other functions to support using that fourth keyword argument.
I'm also wondering if there's a way with Django 3.1+ technology/API to rewrite the queryset.extra(...)
call in ManyRelatedManager
. An out there idea I had was replacing the queryset.extra(...)
call with a query on the through table and then doing a select related to the foreign key that is on the side we want to bring in to join to the existing instances (in the case above, we'd grab Book.authors.through.objects.filter_based_on_what_we_have_for_books_and_authors().select_related('book')
and then transform to a values result from there but that would (I think) prevent aggregations in the first place and run into a number of other problems.
follow-up: 16 comment:15 by , 4 years ago
Just came across the same need. That would be a nice feature.
comment:17 by , 4 years ago
Cc: | added |
---|
comment:18 by , 2 years ago
What's the workaround for this if I have a similar need?
Perform the prefetch queryset and attribute assignment yourself. In the end prefetch_related
only does the following
It transforms
Authors.objects.filter(popular=True).prefetch_related(Prefetch("books", to_attr="prefetched_books"))
Into
authors = Authors.objects.filter(popular=True) authors_books = defaultdict(list) for book in Book.objects.filter(author__in=authors): authors_books[book.author_id].append(book) for author in authors: setattr(author, "prefetched_books", authors_books[author.pk])
So as long as the back references is included in your values
(which is something the prefetch_related
code would need to do anyway and the part Micah ran into issues with) you should be able to achieve the same thing with two loops.
comment:19 by , 6 months ago
Cc: | added |
---|
I doubt this is related to the linked PR or aggregation. I believe
Prefetch.queryset
cannot be used withvalues()
queryset.Please confirm you get a similar crash with the following queryset: