Opened 4 years ago

Closed 4 years ago

#19843 closed Bug (invalid)

ORM generate dupes when filtering twice on related objects

Reported by: tonnzor Owned by: nobody
Component: Database layer (models, ORM) Version: 1.4
Severity: Normal Keywords:
Cc: Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no


When I apply to a QuerySet a filter for related model twice -- I get dupes.

Let's say we have classes:

from django.db import models

class Film(models.Model):

class Person(models.Model):

class PersonInFilm(models.Model):
    film = models.ForeignKey(Film)
    person = models.ForeignKey(Person)
    title = models.CharField(max_length=50)

Then we try to find out needed PersonInFilm -- multiple filters (e.g. using Managers):

qs_wrong = Person.objects.filter(personinfilm__film=1).filter(personinfilm__role=1)
print "%s rows\n%s" % (qs_wrong.count(), qs_wrong.values('id').query)

This will result in:

15 rows
SELECT `persons_person`.`id` FROM `persons_person`
INNER JOIN `persons_personinfilm` ON (`persons_person`.`id` = `persons_personinfilm`.`person_id`)
INNER JOIN `persons_personinfilm` T4 ON (`persons_person`.`id` = T4.`person_id`)
WHERE (`persons_personinfilm`.`film_id` = 1  AND T4.`role_id` = 1 )

Then we try to find out needed PersonInFilm -- single filter:

qs_correct = Person.objects.filter(personinfilm__film=1, personinfilm__role=1)
print "%s rows\n%s" % (qs_correct.count(), qs_correct.values('id').query)

This will result in:

1 rows
SELECT `persons_person`.`id` FROM `persons_person`
INNER JOIN `persons_personinfilm` ON (`persons_person`.`id` = `persons_personinfilm`.`person_id`)
WHERE (`persons_personinfilm`.`role_id` = 1  AND `persons_personinfilm`.`film_id` = 1 )

Please keep in mind that often it is impossible to put filter in one command -- e.g. when you use custom Manager.

Change History (3)

comment:1 Changed 4 years ago by ioreku

This explains the results you are getting.

comment:2 Changed 4 years ago by Anssi Kääriäinen

The reason is that two filters generate two different joins. Use one filter() and you get one join only. This is somewhat confusing API, but it will allow one to choose if two or one join is needed for the query.

So, closing as invalid. Still, it would be nice to get a more explicit API for using multiple vs one join. One possibility is qs.ref(personinfilm_ref2='personinfilm'). Now qs.filter(personinfilm__film=1, personinfilm__role=2) will use one join, qs.filter(personinfilm__film=1, personinfilm_ref2__role=2) would generate two joins. This would also allow splitting the filter to two separate calls. But this is separate ticket's issue, and I am not sure the idea actually works that well in practice.

comment:3 Changed 4 years ago by Anssi Kääriäinen

Resolution: invalid
Status: newclosed
Note: See TracTickets for help on using tickets.
Back to Top