Opened 12 years ago

Closed 12 years ago

#15009 closed (invalid)

Generated sql for "or"-ing two filters over multiple joins is wrong

Reported by: Saverio Trioni <strioni@…> Owned by: nobody
Component: Database layer (models, ORM) Version: 1.2
Severity: Keywords:
Cc: Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no


In the following data model:

FirstRelated  <- m2m -> BaseModel <- m2m -> SecondRelated <- fk <- ThirdRelated

I am doing a filter on the BaseModel with an "or" between:

  • a condition on the ThirdRelated
  • a condition on the FirstRelated

as BaseModel.objects.filter(Q(seconds__thirds__condition=True) | Q(firsts__condition=True)).

Each of the two filters behaves correctly when executed, but when "or"-ed some records are returned more than once.

The problem is related to the generated SQL, because 'or'-ing two conditions make the joins optional, but then if a condition fails a record can be returned because of the other condition.

The solution probably would be to include a DISTINCT() clause when 'or'-ing the conditions.

Here (and attached, too) you have the models and the tests for a failing example. I leave out the wrong_or/ because they are all standard.

File wrong_or/wrong/

from django.db import models

class First(models.Model):
    condition = models.BooleanField()

class Second(models.Model):

class BaseModel(models.Model):
    firsts = models.ManyToManyField(First)  
    seconds = models.ManyToManyField(Second)

class Third(models.Model):
    second = models.ForeignKey(Second, related_name='thirds')
    condition = models.BooleanField()

File wrong_or/wrong/

from django.db import models
from django.test import TestCase
from wrong_or.wrong.models import BaseModel, First, Second, Third

class SimpleTest(TestCase):
    def test_wrong(self):
        bo = BaseModel.objects.create()
        f = First.objects.create(condition=True)
        s1 = Second.objects.create()
        t1 = Third.objects.create(second=s1, condition=False)
        s2 = Second.objects.create()
        t2 = Third.objects.create(second=s2, condition=False)
        bo.seconds.add(s1, s2)
        q1 = models.Q(firsts__condition=True) # match the only BaseModel
        q2 = models.Q(seconds__thirds__condition=True) # match nothing
        self.assertEqual(1, BaseModel.objects.filter(q1).count())
        # A filter should not duplicate records.
        # This assertion will fail, because there are two copies of 
        # the 'bo' record.
        self.assertEqual(1, BaseModel.objects.filter(q1|q2).count())

Attachments (2) (392 bytes) - added by Saverio Trioni <strioni@…> 12 years ago. (1.1 KB) - added by Saverio Trioni <strioni@…> 12 years ago.

Download all attachments as: .zip

Change History (5)

Changed 12 years ago by Saverio Trioni <strioni@…>

Attachment: added

Changed 12 years ago by Saverio Trioni <strioni@…>

Attachment: added

comment:1 Changed 12 years ago by Russell Keith-Magee

I'm going to mark this wontfix. You're correct that results will be returned more than once, but that's a leaky abstraction born of the relational nature of SQL. There are many places and queries where you can get duplicate results -- all you need to do is traverse more than one join. Although your dataset evidently doesn't reveal this, each of your standalone queries would be prone to duplicate results, under the right conditions.

comment:2 Changed 12 years ago by Saverio Trioni <strioni@…>

Ok, you are right. I always try to read fully the documentation, but a simple search today showed me this:

... (Could you mark this as RTFM???)

comment:3 in reply to:  2 Changed 12 years ago by Karen Tracey

Resolution: invalid
Status: newclosed

Replying to Saverio Trioni <>:

... (Could you mark this as RTFM???)

Invalid is as close as we have to that.

Note: See TracTickets for help on using tickets.
Back to Top