Opened 4 months ago
Closed 4 months ago
#35779 closed Uncategorized (duplicate)
ORM to avoid deferring the reference_id of a prefetch
Reported by: | Thiago Bellini Ribeiro | Owned by: | GunSliger00007 |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 4.2 |
Severity: | Normal | Keywords: | |
Cc: | Thiago Bellini Ribeiro | Triage Stage: | Unreviewed |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
Suppose that I have a User
and an Email
models, and they are related through Email.user_id
If I query User
like this:
User.objects.prefetch_related("email_set", Email.objects.only("email"))
I'll see n+1 issues either way, as user_id
was deferred, and the ORM will need to refetch the object from the database.
Although the fix for this is easy, just changing this to:
User.objects.prefetch_related("email_set", Email.objects.only("user_id", "email"))
It can catch some users off-guard, as it is not so obvious that the ORM will not do "the correct thing".
Based on that, I was wondering if maybe the ORM could force the reference_id of a relation to not be deferred when it is used in a Prefetch
object, the same way it already does with pk
. It makes sense to me at least.
---
On a side note, this caused some unexpected behavior at work today, as we were fixing RemovedInDjango50Warnings
and the simple addition of chunk_size
to an .iterator()
with a Prefetch
caused some Model.DoesNotExist
issues.
After digging for a while I found out about this issue, which means our prefetch_related
was being thrown into the trash, but also because the iteration was taking some time to happen (very large table), some related objects got deleted in the meantime, and when the ORM tried to refresh_from_db
to get the related id, it failed with that error.
Change History (4)
comment:1 by , 4 months ago
Owner: | set to |
---|---|
Status: | new → assigned |
comment:3 by , 4 months ago
Replying to GunSliger00007:
Consider optimizing your queries by prefetching only the fields you need. In your case, you’re already using only() to limit the fields loaded by the query, which helps reduce the amount of data loaded into memory.If the related Email objects are large, consider using .values() or .values_list() to reduce the amount of data furthe
User.objects.prefetch_related( Prefetch("email_set", queryset=Email.objects.only("user_id", "email")) )
That User
/Email
was mostly to exemplify the issue :)
comment:4 by , 4 months ago
Resolution: | → duplicate |
---|---|
Status: | assigned → closed |
Duplicate of #33835 closed as wontfix. See explanation in the ticket.
Consider optimizing your queries by prefetching only the fields you need. In your case, you’re already using only() to limit the fields loaded by the query, which helps reduce the amount of data loaded into memory.If the related Email objects are large, consider using .values() or .values_list() to reduce the amount of data furthe