Opened 4 years ago

Last modified 4 weeks ago

#26565 new New feature

Allow Prefetch query to use .values()

Reported by: Maxime Lorant Owned by: nobody
Component: Database layer (models, ORM) Version: master
Severity: Normal Keywords: prefetch, values
Cc: Dan LaManna, Micah Lyle Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

I came across a use case that does not seems to be possible using the Django ORM on Django 1.9.

Let's says I want to get all objects from a Order model and prefetch for each order all pay lines, grouped per code, to get a list similar to: [{'code': 'A', 'number': 3, 'amount': 1000}, {'code': 'B', 'number': 1, 'amount': 5000}] if the order has four PayRollLine associated, with 3 where code=A and 1 where code=B.

# Query that will be used inside the `Prefetch` object later
# Let's assume quantity and unit_amount can't be null in the query 
# to avoid extra Coalesce...
prefetch_qs = (
    PayrollLine.objects.values('code').annotate(
        number=ExpressionWrapper(
            Sum(F('quantity')), output_field=DecimalField()
        ),
        amount=ExpressionWrapper(
            Sum(F('unit_amount') * F('quantity')), output_field=DecimalField()
        ),
    )
)
print(
    Order.objects.all()
    .prefetch_related(
        Prefetch('pay_lines', queryset=prefetch_qs, to_attr='pay_lines_grouped')
    )
)

The first query works fine alone, outside the prefetch object. But, when executing the second query, Django raises an exception:

    Traceback (most recent call last):
      [...]
      File "/data/.virtualenvs/ORFEO/local/lib/python2.7/site-packages/django/db/models/query.py", line 234, in __repr__
        data = list(self[:REPR_OUTPUT_SIZE + 1])
      File "/data/.virtualenvs/ORFEO/local/lib/python2.7/site-packages/django/db/models/query.py", line 258, in __iter__
        self._fetch_all()
      File "/data/.virtualenvs/ORFEO/local/lib/python2.7/site-packages/django/db/models/query.py", line 1076, in _fetch_all
        self._prefetch_related_objects()
      File "/data/.virtualenvs/ORFEO/local/lib/python2.7/site-packages/django/db/models/query.py", line 656, in _prefetch_related_objects
        prefetch_related_objects(self._result_cache, self._prefetch_related_lookups)
      File "/data/.virtualenvs/ORFEO/local/lib/python2.7/site-packages/django/db/models/query.py", line 1457, in prefetch_related_objects
        obj_list, additional_lookups = prefetch_one_level(obj_list, prefetcher, lookup, level)
      File "/data/.virtualenvs/ORFEO/local/lib/python2.7/site-packages/django/db/models/query.py", line 1556, in prefetch_one_level
        prefetcher.get_prefetch_queryset(instances, lookup.get_current_queryset(level)))
      File "/data/.virtualenvs/ORFEO/local/lib/python2.7/site-packages/django/db/models/fields/related_descriptors.py", line 544, in get_prefetch_queryset
        instance = instances_dict[rel_obj_attr(rel_obj)]
      File "/data/.virtualenvs/ORFEO/local/lib/python2.7/site-packages/django/db/models/fields/related.py", line 597, in get_local_related_value
        return self.get_instance_value_for_fields(instance, self.local_related_fields)
      File "/data/.virtualenvs/ORFEO/local/lib/python2.7/site-packages/django/db/models/fields/related.py", line 605, in get_instance_value_for_fields
        opts = instance._meta
    AttributeError: 'dict' object has no attribute '_meta'

Looks like Django can't work with dict in prefetch_related... On IRC, someone point me to [this PR](https://github.com/django/django/pull/6478) that could resolve my problem but I think it should be verified with unit tests.

Change History (15)

comment:1 Changed 4 years ago by Simon Charette

Summary: Allow Prefetch query to returns aggregationsAllow Prefetch query to use .values()
Version: 1.9master

I doubt this is related to the linked PR or aggregation. I believe Prefetch.queryset cannot be used with values() queryset.

Please confirm you get a similar crash with the following queryset:

queryset = PayrollLine.objects.values('code')
Order.objects.prefetch_related(
    Prefetch('pay_lines', queryset=queryset, to_attr='pay_lines_values')
)

comment:2 Changed 4 years ago by Maxime Lorant

Yes, same error & traceback using PayrollLine.objects.values('code') only.

The error seems obvious: the prefetch mechanism tries to access to attributes for some reason, but it got a dict from the query, not model instances...

comment:3 Changed 4 years ago by Simon Charette

Triage Stage: UnreviewedAccepted

I'm not sure if this should be considered a New Feature or a Bug.

comment:4 Changed 4 years ago by Anssi Kääriäinen

I'd say there is a bug and a new feature. We should error nicely on values queries, and of course, ability to prefetch values queries would be a nice feature for aggregations specifically.

comment:5 Changed 4 years ago by Simon Charette

Has patch: set

Submitted a PR for the bug part. This should be converted to a feature request once it's merged.

comment:6 Changed 4 years ago by Simon Charette <charettes@…>

In 7ec330e:

Refs #26565 -- Errored nicely when using Prefetch with a values() queryset.

Thanks Maxime Lorant for the report and Anssi for the suggestion.

comment:7 Changed 4 years ago by Simon Charette

Type: BugNew feature

comment:8 Changed 4 years ago by Simon Charette

Has patch: unset

comment:9 Changed 4 years ago by Gwildor Sok

Just to add, it would be great if not only .values() would be supported, but also .values_list(flat=True).

comment:10 Changed 3 years ago by Alexander Schepanovski

Anyone on this? Would be great to have.

comment:11 Changed 3 years ago by Simon Charette

No one has claimed it yet Alexander, feel free to propose a patch!

comment:12 Changed 11 months ago by Dan LaManna

Cc: Dan LaManna added

comment:13 Changed 8 months ago by Micah Lyle

Cc: Micah Lyle added

comment:14 Changed 8 months ago by Micah Lyle

I took a stab at this during the DjangoCon 2019 sprints.

I had very limited experience with the prefetching/ORM internals before this, so there could be workarounds/solutions that I'm unaware of, but here's what I found so far.

One thing I tried to tackle is handling the many to many case with just values(). My goal was to take the '_prefetch_related_val_%s' in the queryset.extra(...) call in the get_prefetch_queryset() ManyRelatedManager and add/apply it to whatever values() call was made and then join that by doing pop() from the dictionary instead of a getattr call (which would effectively give the developer the same query that they originally made by modifying it under the hood). I ran into some issues doing this though:

"Any extra() call made after a values() call will have its extra selected fields ignored" (from the Django 2.2 docs)."

This means that the queryset.extra(...) call in the ManyRelatedManager (which from my quick search appears to be the only remaining usage of extra(...) across the entire Django codebase (except the tests)) is effectively ignored, and even if I try to add a .values(*existing_values, '_prefetch_related_val_%s') call, it returns this error (from my example test/use case using the models from the prefetch tests):

queryset.values('id', 'title', '_prefetch_related_val_author_id')

django.core.exceptions.FieldError: Cannot resolve keyword '_prefetch_related_val_author_id' into field. Choices are: _prefetch_related_val_author_id, authors, bio, bookwithyear, first_time_authors, id, read_by, title

Is this a bug? It says it can't resolve a field and then says that field is a choice.

I thought about trying to workaround it by essentially "undoing" the values() part of the queryset and then reapplying it at the end with the value from the join table (which then gets poped, but I'm not sure if that's feasible/doable and what that would actually look like. Besides, this entire solution so far feels somewhat hacky in the first place.

I think that whatever change makes this happen may require some sort of specification on the developer's end for how to apply the join from the prefetched dictionaries to the existing list of instances. Maybe even writing parts of a custom prefetcher and specifying that custom prefetcher as a fourth keyword argument (something I explored, but didn't get very far) to the Prefetch object, and then tweaking the logic in prefetch_related_objects and other functions to support using that fourth keyword argument.

I'm also wondering if there's a way with Django 3.1+ technology/API to rewrite the queryset.extra(...) call in ManyRelatedManager. An out there idea I had was replacing the queryset.extra(...) call with a query on the through table and then doing a select related to the foreign key that is on the side we want to bring in to join to the existing instances (in the case above, we'd grab Book.authors.through.objects.filter_based_on_what_we_have_for_books_and_authors().select_related('book') and then transform to a values result from there but that would (I think) prevent aggregations in the first place and run into a number of other problems.

comment:15 Changed 4 weeks ago by Aymeric Augustin

Just came across the same need. That would be a nice feature.

Note: See TracTickets for help on using tickets.
Back to Top