#18437 closed Bug (invalid)
Incorrect double JOIN when using multiple .filter() calls on the same table
| Reported by: | Diederik van der Boor | Owned by: | nobody | 
|---|---|---|---|
| Component: | Database layer (models, ORM) | Version: | 1.4 | 
| Severity: | Normal | Keywords: | |
| Cc: | Triage Stage: | Unreviewed | |
| Has patch: | no | Needs documentation: | no | 
| Needs tests: | no | Patch needs improvement: | no | 
| Easy pickings: | no | UI/UX: | no | 
Description
When using .filter() twice, Django generates a different SQL then using the filters in one call.
In the sample below, both statements filter at the same fields, but return a different query:
    Message.objects.filter(usermessage_set__user=user).filter(usermessage_set__is_read=False)
    Message.objects.filter(usermessage_set__user=user, usermessage_set__is_read=False)
The first results in a double join at the same table:
    SELECT "test_message"."id", "test_message"."from_user_id" FROM "test_message"
    INNER JOIN "test_messagereadstate" ON ("test_message"."id" = "test_messagereadstate"."message_id")
    INNER JOIN "test_messagereadstate" T4 ON ("test_message"."id" = T4."message_id")
    WHERE ("test_messagereadstate"."user_id" = 12345  AND T4."is_read" = False )
Using everything in one .filter() avoids the second join:
    SELECT "test_message"."id", "test_message"."from_user_id" FROM "test_message"
    INNER JOIN "test_messagereadstate" ON ("test_message"."id" = "test_messagereadstate"."message_id")
    WHERE ("test_messagereadstate"."is_read" = False  AND "test_messagereadstate"."user_id" = 12345 )
This is the whole example, that can be executed directly as a script:
#!/usr/bin/env python
# Inline settings file
from django.conf import settings
settings.configure(
    DEBUG = True,
    DATABASES = { 
        'default': {'ENGINE': 'django.db.backends.sqlite3', 'NAME': ':memory:'}   
    },  
)
from django.contrib.auth.models import User
from django.db import models
class Message(models.Model):
    from_user = models.ForeignKey(User, related_name='sent_messages')
    to_users = models.ManyToManyField(User, related_name='received_messages')
    class Meta:
        app_label = 'test'
class MessageReadState(models.Model):
    user = models.ForeignKey(User, related_name='usermessage_set')
    message = models.ForeignKey(Message, related_name='usermessage_set')
    is_read = models.BooleanField(default=False)
    class Meta:
        app_label = 'test'
if __name__ == '__main__':
    user = 12345
    print Message.objects.filter(usermessage_set__user=user).filter(usermessage_set__is_read=False).query
    print Message.objects.filter(usermessage_set__user=user, usermessage_set__is_read=False).query
This happens in both Django 1.3 and 1.4
Change History (6)
comment:1 by , 13 years ago
| Resolution: | → invalid | 
|---|---|
| Status: | new → closed | 
comment:2 by , 13 years ago
Ah, I understand, this has a use case too.
Perhaps a ..note:: would be good to add to the docs.
I didn't notice it before, only now that I know it exists I see which sentence makes a subtle reference to it.
comment:3 by , 13 years ago
There might be room for improvement here. If you can provide a patch that would be great (or, at least where in the docs you would like to see an addition, and what to add there).
comment:4 by , 13 years ago
I just ran into this issue. If you are doing filtering and later add some annotations like Count, Sum, etc. this is a unpleasant surprise and giving you wrong results.
comment:5 by , 6 years ago
We can use: FilteredRelation to specify 1 relation (and join 1 time) only:
https://docs.djangoproject.com/en/3.0/ref/models/querysets/#filteredrelation-objects
comment:6 by , 8 months ago
May be useful for optimize sql with filtered relation.
Here is examples:
qs = MyModel.objects.annotate(actual_other=FilteredRelation("othermodel", condition=Q("othermodel__delete_time__isnull=True"))
qs.filter(actual_other__field="foo")
In this case django produces sql with duplicated join for filtering and for FilteredRelation.
But if we pass FilteredRelation alias to select_related, problem solved:
qs = MyModel.objects.annotate(actual_other=FilteredRelation("othermodel", condition=Q("othermodel__delete_time__isnull=True")).select_related("actual_other")
qs.filter(actual_other__field="foo")
This is a feature. See: https://docs.djangoproject.com/en/1.4/topics/db/queries/#spanning-multi-valued-relationships.
Yes, the behavior can be confusing, but there is no better alternative available for the multi-valued relationships.