Opened 8 years ago

Last modified 6 months ago

#26565 new New feature

Allow Prefetch query to use .values()

Reported by: Maxime Lorant Owned by: nobody
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords: prefetch, values
Cc: Dan LaManna, Micah Lyle, Thomas Riccardi, Ülgen Sarıkavak Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

I came across a use case that does not seems to be possible using the Django ORM on Django 1.9.

Let's says I want to get all objects from a Order model and prefetch for each order all pay lines, grouped per code, to get a list similar to: [{'code': 'A', 'number': 3, 'amount': 1000}, {'code': 'B', 'number': 1, 'amount': 5000}] if the order has four PayRollLine associated, with 3 where code=A and 1 where code=B.

# Query that will be used inside the `Prefetch` object later
# Let's assume quantity and unit_amount can't be null in the query 
# to avoid extra Coalesce...
prefetch_qs = (
    PayrollLine.objects.values('code').annotate(
        number=ExpressionWrapper(
            Sum(F('quantity')), output_field=DecimalField()
        ),
        amount=ExpressionWrapper(
            Sum(F('unit_amount') * F('quantity')), output_field=DecimalField()
        ),
    )
)
print(
    Order.objects.all()
    .prefetch_related(
        Prefetch('pay_lines', queryset=prefetch_qs, to_attr='pay_lines_grouped')
    )
)

The first query works fine alone, outside the prefetch object. But, when executing the second query, Django raises an exception:

    Traceback (most recent call last):
      [...]
      File "/data/.virtualenvs/ORFEO/local/lib/python2.7/site-packages/django/db/models/query.py", line 234, in __repr__
        data = list(self[:REPR_OUTPUT_SIZE + 1])
      File "/data/.virtualenvs/ORFEO/local/lib/python2.7/site-packages/django/db/models/query.py", line 258, in __iter__
        self._fetch_all()
      File "/data/.virtualenvs/ORFEO/local/lib/python2.7/site-packages/django/db/models/query.py", line 1076, in _fetch_all
        self._prefetch_related_objects()
      File "/data/.virtualenvs/ORFEO/local/lib/python2.7/site-packages/django/db/models/query.py", line 656, in _prefetch_related_objects
        prefetch_related_objects(self._result_cache, self._prefetch_related_lookups)
      File "/data/.virtualenvs/ORFEO/local/lib/python2.7/site-packages/django/db/models/query.py", line 1457, in prefetch_related_objects
        obj_list, additional_lookups = prefetch_one_level(obj_list, prefetcher, lookup, level)
      File "/data/.virtualenvs/ORFEO/local/lib/python2.7/site-packages/django/db/models/query.py", line 1556, in prefetch_one_level
        prefetcher.get_prefetch_queryset(instances, lookup.get_current_queryset(level)))
      File "/data/.virtualenvs/ORFEO/local/lib/python2.7/site-packages/django/db/models/fields/related_descriptors.py", line 544, in get_prefetch_queryset
        instance = instances_dict[rel_obj_attr(rel_obj)]
      File "/data/.virtualenvs/ORFEO/local/lib/python2.7/site-packages/django/db/models/fields/related.py", line 597, in get_local_related_value
        return self.get_instance_value_for_fields(instance, self.local_related_fields)
      File "/data/.virtualenvs/ORFEO/local/lib/python2.7/site-packages/django/db/models/fields/related.py", line 605, in get_instance_value_for_fields
        opts = instance._meta
    AttributeError: 'dict' object has no attribute '_meta'

Looks like Django can't work with dict in prefetch_related... On IRC, someone point me to [this PR](https://github.com/django/django/pull/6478) that could resolve my problem but I think it should be verified with unit tests.

Change History (19)

comment:1 by Simon Charette, 8 years ago

Summary: Allow Prefetch query to returns aggregationsAllow Prefetch query to use .values()
Version: 1.9master

I doubt this is related to the linked PR or aggregation. I believe Prefetch.queryset cannot be used with values() queryset.

Please confirm you get a similar crash with the following queryset:

queryset = PayrollLine.objects.values('code')
Order.objects.prefetch_related(
    Prefetch('pay_lines', queryset=queryset, to_attr='pay_lines_values')
)

comment:2 by Maxime Lorant, 8 years ago

Yes, same error & traceback using PayrollLine.objects.values('code') only.

The error seems obvious: the prefetch mechanism tries to access to attributes for some reason, but it got a dict from the query, not model instances...

comment:3 by Simon Charette, 8 years ago

Triage Stage: UnreviewedAccepted

I'm not sure if this should be considered a New Feature or a Bug.

comment:4 by Anssi Kääriäinen, 8 years ago

I'd say there is a bug and a new feature. We should error nicely on values queries, and of course, ability to prefetch values queries would be a nice feature for aggregations specifically.

comment:5 by Simon Charette, 8 years ago

Has patch: set

Submitted a PR for the bug part. This should be converted to a feature request once it's merged.

comment:6 by Simon Charette <charettes@…>, 8 years ago

In 7ec330e:

Refs #26565 -- Errored nicely when using Prefetch with a values() queryset.

Thanks Maxime Lorant for the report and Anssi for the suggestion.

comment:7 by Simon Charette, 8 years ago

Type: BugNew feature

comment:8 by Simon Charette, 8 years ago

Has patch: unset

comment:9 by Gwildor Sok, 8 years ago

Just to add, it would be great if not only .values() would be supported, but also .values_list(flat=True).

comment:10 by Alexander Schepanovski, 7 years ago

Anyone on this? Would be great to have.

comment:11 by Simon Charette, 7 years ago

No one has claimed it yet Alexander, feel free to propose a patch!

comment:12 by Dan LaManna, 5 years ago

Cc: Dan LaManna added

comment:13 by Micah Lyle, 5 years ago

Cc: Micah Lyle added

comment:14 by Micah Lyle, 5 years ago

I took a stab at this during the DjangoCon 2019 sprints.

I had very limited experience with the prefetching/ORM internals before this, so there could be workarounds/solutions that I'm unaware of, but here's what I found so far.

One thing I tried to tackle is handling the many to many case with just values(). My goal was to take the '_prefetch_related_val_%s' in the queryset.extra(...) call in the get_prefetch_queryset() ManyRelatedManager and add/apply it to whatever values() call was made and then join that by doing pop() from the dictionary instead of a getattr call (which would effectively give the developer the same query that they originally made by modifying it under the hood). I ran into some issues doing this though:

"Any extra() call made after a values() call will have its extra selected fields ignored" (from the Django 2.2 docs)."

This means that the queryset.extra(...) call in the ManyRelatedManager (which from my quick search appears to be the only remaining usage of extra(...) across the entire Django codebase (except the tests)) is effectively ignored, and even if I try to add a .values(*existing_values, '_prefetch_related_val_%s') call, it returns this error (from my example test/use case using the models from the prefetch tests):

queryset.values('id', 'title', '_prefetch_related_val_author_id')

django.core.exceptions.FieldError: Cannot resolve keyword '_prefetch_related_val_author_id' into field. Choices are: _prefetch_related_val_author_id, authors, bio, bookwithyear, first_time_authors, id, read_by, title

Is this a bug? It says it can't resolve a field and then says that field is a choice.

I thought about trying to workaround it by essentially "undoing" the values() part of the queryset and then reapplying it at the end with the value from the join table (which then gets poped, but I'm not sure if that's feasible/doable and what that would actually look like. Besides, this entire solution so far feels somewhat hacky in the first place.

I think that whatever change makes this happen may require some sort of specification on the developer's end for how to apply the join from the prefetched dictionaries to the existing list of instances. Maybe even writing parts of a custom prefetcher and specifying that custom prefetcher as a fourth keyword argument (something I explored, but didn't get very far) to the Prefetch object, and then tweaking the logic in prefetch_related_objects and other functions to support using that fourth keyword argument.

I'm also wondering if there's a way with Django 3.1+ technology/API to rewrite the queryset.extra(...) call in ManyRelatedManager. An out there idea I had was replacing the queryset.extra(...) call with a query on the through table and then doing a select related to the foreign key that is on the side we want to bring in to join to the existing instances (in the case above, we'd grab Book.authors.through.objects.filter_based_on_what_we_have_for_books_and_authors().select_related('book') and then transform to a values result from there but that would (I think) prevent aggregations in the first place and run into a number of other problems.

comment:15 by Aymeric Augustin, 4 years ago

Just came across the same need. That would be a nice feature.

in reply to:  15 comment:16 by kimsia, 4 years ago

What's the workaround for this if I have a similar need?

comment:17 by Thomas Riccardi, 4 years ago

Cc: Thomas Riccardi added

comment:18 by Simon Charette, 2 years ago

What's the workaround for this if I have a similar need?

Perform the prefetch queryset and attribute assignment yourself. In the end prefetch_related only does the following

It transforms

Authors.objects.filter(popular=True).prefetch_related(Prefetch("books", to_attr="prefetched_books"))

Into

authors = Authors.objects.filter(popular=True)
authors_books = defaultdict(list)
for book in Book.objects.filter(author__in=authors):
    authors_books[book.author_id].append(book)
for author in authors:
    setattr(author, "prefetched_books", authors_books[author.pk])

So as long as the back references is included in your values (which is something the prefetch_related code would need to do anyway and the part Micah ran into issues with) you should be able to achieve the same thing with two loops.

comment:19 by Ülgen Sarıkavak, 6 months ago

Cc: Ülgen Sarıkavak added
Note: See TracTickets for help on using tickets.
Back to Top