Opened 11 years ago

Closed 11 years ago

#19843 closed Bug (invalid)

ORM generate dupes when filtering twice on related objects

Reported by: Artem Skoretskiy Owned by: nobody
Component: Database layer (models, ORM) Version: 1.4
Severity: Normal Keywords:
Cc: Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

When I apply to a QuerySet a filter for related model twice -- I get dupes.

Let's say we have classes:

from django.db import models

class Film(models.Model):
    pass

class Person(models.Model):
    pass

class PersonInFilm(models.Model):
    film = models.ForeignKey(Film)
    person = models.ForeignKey(Person)
    title = models.CharField(max_length=50)

Then we try to find out needed PersonInFilm -- multiple filters (e.g. using Managers):

qs_wrong = Person.objects.filter(personinfilm__film=1).filter(personinfilm__role=1)
print "%s rows\n%s" % (qs_wrong.count(), qs_wrong.values('id').query)

This will result in:

15 rows
SELECT `persons_person`.`id` FROM `persons_person`
INNER JOIN `persons_personinfilm` ON (`persons_person`.`id` = `persons_personinfilm`.`person_id`)
INNER JOIN `persons_personinfilm` T4 ON (`persons_person`.`id` = T4.`person_id`)
WHERE (`persons_personinfilm`.`film_id` = 1  AND T4.`role_id` = 1 )

Then we try to find out needed PersonInFilm -- single filter:

qs_correct = Person.objects.filter(personinfilm__film=1, personinfilm__role=1)
print "%s rows\n%s" % (qs_correct.count(), qs_correct.values('id').query)

This will result in:

1 rows
SELECT `persons_person`.`id` FROM `persons_person`
INNER JOIN `persons_personinfilm` ON (`persons_person`.`id` = `persons_personinfilm`.`person_id`)
WHERE (`persons_personinfilm`.`role_id` = 1  AND `persons_personinfilm`.`film_id` = 1 )

Please keep in mind that often it is impossible to put filter in one command -- e.g. when you use custom Manager.

Change History (3)

comment:1 by ioreku, 11 years ago

This explains the results you are getting.

comment:2 by Anssi Kääriäinen, 11 years ago

The reason is that two filters generate two different joins. Use one filter() and you get one join only. This is somewhat confusing API, but it will allow one to choose if two or one join is needed for the query.

So, closing as invalid. Still, it would be nice to get a more explicit API for using multiple vs one join. One possibility is qs.ref(personinfilm_ref2='personinfilm'). Now qs.filter(personinfilm__film=1, personinfilm__role=2) will use one join, qs.filter(personinfilm__film=1, personinfilm_ref2__role=2) would generate two joins. This would also allow splitting the filter to two separate calls. But this is separate ticket's issue, and I am not sure the idea actually works that well in practice.

comment:3 by Anssi Kääriäinen, 11 years ago

Resolution: invalid
Status: newclosed
Note: See TracTickets for help on using tickets.
Back to Top