Opened 24 hours ago

Closed 14 hours ago

Last modified 3 hours ago

#36035 closed Bug (duplicate)

Prefetch while annotating the reference of a m2m relation generates duplicated joins

Reported by: Thiago Bellini Ribeiro Owned by:
Component: Database layer (models, ORM) Version: 5.1
Severity: Normal Keywords:
Cc: Thiago Bellini Ribeiro Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

I noticed this when I started investigating this issue from a library that I maintain: https://github.com/strawberry-graphql/strawberry-django/issues/650

Consider the following models:

class User(models.Model):
    groups = models.ManyToManyField("Group", related_name="users")

class Email(models.Model):
    user = models.ForeignKey(User, related_name="emails")

class Group(models.Model):
    ...

If I do:

User.objects.prefetch(Prefetch("emails", Email.objects.annotate(user_name=F("user__name"))))

This gets properly resolved, and the prefetch will generate a query like:

SELECT "email"."id",
       "email"."user_id",
       "user"."name" AS "foo"
  FROM "email"
 INNER JOIN "user"
    ON ("email"."product_id" = "user"."id")
 WHERE "email"."product_id" IN (<ids>)

Perfect!

Now, if I do the same for groups, which is a m2m relation, like this:

User.objects.prefetch(Prefetch("groups", Group.objects.annotate(user_name=F("user__name"))))

The prefetch will get resolved like this:

SELECT ("user_group"."user_id") AS "_prefetch_related_val_user_id",
       "group"."id",
       "group"."name",
       "user"."name" AS "foo"
  FROM "group"
  LEFT OUTER JOIN "user_group"
    ON ("group"."id" = "user_group"."group_id")
  LEFT OUTER JOIN "user"
    ON ("user_group"."user_id" = "user"."id")
 INNER JOIN "user_group" T4
    ON ("group"."id" = T4."group_id")
 WHERE T4."user_id" IN (<ids>)

This of course cause spurious results in the prefetched results. And it also happens even if I annotate the id, which doesn't force a join with the related table:

SELECT ("user_group"."user_id") AS "_prefetch_related_val_user_id",
       "group"."id",
       "group"."name",
       "user_group"."user_id" AS "annotated_user_id"
  FROM "group"
  LEFT OUTER JOIN "user_group"
    ON ("group"."id" = "user_group"."group_id")
 INNER JOIN "user_group" T4
    ON ("group"."id" = T4."group_id")
 WHERE T4."product_id" IN (<ids>)

I tried digging into the codebase and found out that it is related to the join in the prefetch queryset not being reused when the prefetch is actually executed and filtered by the instances. The related line is this one: https://github.com/django/django/blob/fcd9d08379a2aee3b2c49eab0d0b8db6fd66d091/django/db/models/sql/query.py#L1946

In that line, it only uses can_reuse for a m2m, otherwise, it passes None, which will allow reuse. But that can_reuse set is empty when adding the join for the instances filter, which forces it to create the T4 alias.

If I comment that line and force reuse = None the issue appears to be solved and I don't get the extra join anymore (but of course the fix is not only that, as that code is there for some reason)

Change History (2)

comment:1 by Simon Charette, 14 hours ago

Resolution: duplicate
Status: newclosed

Hey Thiago, I'm almost certain this is duplicate of #35677 which you could confirm by trying out the patch in ticket:35677#comment:9.

I'm going to close this one as a duplicate but please comment over there (the discussion might be of interest to you) if the patch does indeed resolve your issue. I might be able to submit a patch in the next few days if it's the case.

in reply to:  1 comment:2 by Thiago Bellini Ribeiro, 3 hours ago

Replying to Simon Charette:

Hey Thiago, I'm almost certain this is duplicate of #35677 which you could confirm by trying out the patch in ticket:35677#comment:9.

I'm going to close this one as a duplicate but please comment over there (the discussion might be of interest to you) if the patch does indeed resolve your issue. I might be able to submit a patch in the next few days if it's the case.

Thanks for the reply! :)

And yes, the patch in https://code.djangoproject.com/ticket/35677#comment:9 indeed fixes the issue for me! Looking forward to the patch

btw, will that fix be backported to older django versions (e.g. 4.2, 5.0, etc)

Note: See TracTickets for help on using tickets.
Back to Top