Opened 17 months ago

Closed 17 months ago

Last modified 17 months ago

#29271 closed Bug (invalid)

Chaining Filters on a Reverse Foreign Key Produces Multiple Joins

Reported by: Michael MacIntosh Owned by: nobody
Component: Database layer (models, ORM) Version: 2.0
Severity: Normal Keywords: filter chain reverse foreign key
Cc: Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

When you perform multiple filters on a queryset on a reverse foreign key, it produces duplicate joins, which causes the results to multiply. If instead of chaining the filters, you put both of the filters in one filter, you do not get duplicate results.

You also get this behavior if you repeat the same filter.

This also applies to Q objects that are used in chain filters that reference values across reverse foreign keys as well.

Example Models:

class Alpha(models.Model):
    field = models.CharField(max_length=100)

class Beta(models.Model):
    field = models.CharField(max_length=100)
    data = models.CharField(max_length=100)
    alpha = models.ForeignKey(Alpha, on_delete=models.CASCADE)

Example Data:

Alpha.objects.all().delete()
Beta.objects.all().delete()
alpha = Alpha(field="alpha_text")
alpha.save()
beta = Beta(field="beta_text", data="data1", alpha=alpha)
beta.save()
beta = Beta(field="beta_text", data="data2", alpha=alpha)
beta.save()

Example Output:

>>> Alpha.objects.filter(beta__field="beta_text", beta__data="data1")
<QuerySet [<Alpha: Alpha object (12)>]>
>>> Alpha.objects.filter(beta__field="beta_text").filter(beta__data="data1")
<QuerySet [<Alpha: Alpha object (12)>, <Alpha: Alpha object (12)>]>

Formatted SQL of the first query (expected):

SELECT "test_app_alpha"."id",
       "test_app_alpha"."field"
FROM   "test_app_alpha"
       INNER JOIN "test_app_beta"
               ON ( "test_app_alpha"."id" = "test_app_beta"."alpha_id" )
WHERE  ( "test_app_beta"."field" = beta_text
         AND "test_app_beta"."data" = data1 )  

Formatted SQL of the second query (bug)

SELECT "test_app_alpha"."id",
       "test_app_alpha"."field"
FROM   "test_app_alpha"
       INNER JOIN "test_app_beta"
               ON ( "test_app_alpha"."id" = "test_app_beta"."alpha_id" )
       INNER JOIN "test_app_beta" T3
               ON ( "test_app_alpha"."id" = T3."alpha_id" )
WHERE  ( "test_app_beta"."field" = beta_text
         AND T3."data" = data1 )  

Change History (6)

comment:1 Changed 17 months ago by Andrew Standley

Triage Stage: UnreviewedAccepted

Reproduced.

Last edited 17 months ago by Andrew Standley (previous) (diff)

comment:2 Changed 17 months ago by Tim Graham

Is this really unexpected behavior? Duplicate of #16554 (closed as invalid)?

comment:3 Changed 17 months ago by Simon Charette

Resolution: invalid
Status: newclosed

This is expected behavior as per spanning multi-valued relationships as beta is a reverse ForeignKey and thus multi-valued.

I'm not sure how we could give more visibility to this part of the documentation as this keeps being reported as a bug given how unintuitive it is. Maybe this could be mentioned in the filter() reference documentation?

comment:4 Changed 17 months ago by Andrew Standley

I'm pretty sure it's a good rule of thumb that when the behaviour of a feature is highly unintuitive, one should question whether it was a good design choice. I'll take this to the mailing list in the hope someone can explain the justification for this intended behaviour because ,to my naive self, it seems that Q objects would be a cleaner way to allow the type of multi-valued queries referenced in spanning multi-valued relationships.

I would argue that regardless, this is an issue with the documentation as Simon suggests.

The filter documentation simply states that filters are joined via AND. Meanwhile the chaining filters documentation explicitly states that successive filters act on the result of the last filter. Neither have any link or reference to the contradictory behaviour documented in spanning multi-valued relationships

I imagine this would require a new ticket specifically about the documentation?

Last edited 17 months ago by Andrew Standley (previous) (diff)

comment:5 Changed 17 months ago by Tim Graham

#27936 requests documentation improvements.

comment:6 Changed 17 months ago by Andrew Standley

Thanks for the link Tim, but isn't the scope of #27936 quite a bit different? That ticket is regarding possible changes to the spanning multi-value relationships documentation in order to clarify the 'intended' functionality.

The issue I see is that the chaining-filters documentation appears to contradict the spanning multi-value relationships documentation.

My suggested changes would be more along the lines of
1) Adding a link to retrieving specific objects with filters to the filter docs, so the details of filter are more visible.
2) Adding a note to chaining-filters that references the behaviour is handled differently for multi-value relationships and links to spanning multi-value relationships . (Edit: I realize these on the same page, but in almost two years of working with Django I don't think I've ever read that page top to bottom, and I'm usually pretty good about reading the docs)

Also a side note for documentation: this ticket appears to be a duplicate of #18437 as well.

Last edited 17 months ago by Andrew Standley (previous) (diff)
Note: See TracTickets for help on using tickets.
Back to Top