Opened 9 years ago

Last modified 9 years ago

#26379 new Bug

Inconsistent behaviour of filter() on related model (RelatedManager)

Reported by: Balázs Oroszi Owned by: nobody
Component: Database layer (models, ORM) Version: 1.9
Severity: Normal Keywords:
Cc: yaiche.amin@… Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Given the model in a "testapp" app:

############################
from django.db import models

class Blog(models.Model):
    title = models.CharField(max_length=100)
    # subscribers - related from Person
    # subscriptions - related from Subscription
    
    def __str__(self):
        return self.title

class Person(models.Model):
    name = models.CharField(max_length=100)
    subscribed_blogs = models.ManyToManyField(Blog, related_name="subscribers", through="Subscription")
    # subscriptions - related from Subscription
    
    def __str__(self):
        return self.name

class Subscription(models.Model):
    person = models.ForeignKey(Person, related_name="subscriptions")
    blog = models.ForeignKey(Blog, related_name="subscriptions")
    subscribed_date = models.DateField()
    
    def __str__(self):
        return ''.join([self.person.name, " - ", self.blog.title])
############################

When I filter "subscribers" of a Blog instance, the results are not consistent.
Here is the code demonstrating the effect:

############################
from testapp.models import *
from datetime import datetime

adam = Person.objects.create(name="Adam")
blog_1 = Blog.objects.create(title="Blog 1")
blog_2 = Blog.objects.create(title="Blog 2")
Subscription.objects.create(person=adam, blog=blog_1, subscribed_date=datetime(2016,1,10))
Subscription.objects.create(person=adam, blog=blog_2, subscribed_date=datetime(2016,1,20))

queryparams = {"subscriptions__subscribed_date__gt": datetime(2016,1,15)}

q1 = blog_1.subscribers.filter(**queryparams)
q2 = blog_1.subscribers.all().filter(**queryparams)
q3 = blog_1.subscribers.get_queryset().filter(**queryparams)
print(q1.query)
print(q1)
print(q2.query)
print(q2)
print(q3.query)
print(q3)

print("--------------------------")

q1 = blog_1.subscribers.filter().filter(**queryparams)
q2 = blog_1.subscribers.all().all().filter(**queryparams)
q3 = blog_1.subscribers.get_queryset().all().filter(**queryparams)
print(q1.query)
print(q1)
print(q2.query)
print(q2)
print(q3.query)
print(q3)
############################

The output is:

SELECT "testapp_person"."id", "testapp_person"."name" FROM "testapp_person" INNER JOIN "testapp_subscription" ON ("testapp_person"."id" = "testapp_subscription"."person_id") WHERE ("testapp_subscription"."blog_id" = 1 AND "testapp_subscription"."subscribed_date" > 2016-01-15)
[]
SELECT "testapp_person"."id", "testapp_person"."name" FROM "testapp_person" INNER JOIN "testapp_subscription" ON ("testapp_person"."id" = "testapp_subscription"."person_id") WHERE ("testapp_subscription"."blog_id" = 1 AND "testapp_subscription"."subscribed_date" > 2016-01-15)
[]
SELECT "testapp_person"."id", "testapp_person"."name" FROM "testapp_person" INNER JOIN "testapp_subscription" ON ("testapp_person"."id" = "testapp_subscription"."person_id") WHERE ("testapp_subscription"."blog_id" = 1 AND "testapp_subscription"."subscribed_date" > 2016-01-15)
[]
--------------------------
SELECT "testapp_person"."id", "testapp_person"."name" FROM "testapp_person" INNER JOIN "testapp_subscription" ON ("testapp_person"."id" = "testapp_subscription"."person_id") INNER JOIN "testapp_subscription" T4 ON ("testapp_person"."id" = T4."person_id") WHERE ("testapp_subscription"."blog_id" = 1 AND T4."subscribed_date" > 2016-01-15)
[<Person: Adam>]
SELECT "testapp_person"."id", "testapp_person"."name" FROM "testapp_person" INNER JOIN "testapp_subscription" ON ("testapp_person"."id" = "testapp_subscription"."person_id") INNER JOIN "testapp_subscription" T4 ON ("testapp_person"."id" = T4."person_id") WHERE ("testapp_subscription"."blog_id" = 1 AND T4."subscribed_date" > 2016-01-15)
[<Person: Adam>]
SELECT "testapp_person"."id", "testapp_person"."name" FROM "testapp_person" INNER JOIN "testapp_subscription" ON ("testapp_person"."id" = "testapp_subscription"."person_id") INNER JOIN "testapp_subscription" T4 ON ("testapp_person"."id" = T4."person_id") WHERE ("testapp_subscription"."blog_id" = 1 AND T4."subscribed_date" > 2016-01-15)
[<Person: Adam>]

The first set of queries simply "AND" the filter params with the "subscribers" RelatedManager's inherent related-filtering, while the second set of queries do a separate chain filtering.
This is exactly the kind of situation that is described in the django docs (with the blogs, "Lennon" and "2008"):
https://docs.djangoproject.com/en/dev/topics/db/queries/#spanning-multi-valued-relationships

I believe the second set of queries should be the correct one, and that should be happening also in the first set of queries, but that is not what is happening.

Change History (6)

comment:1 by Tim Graham, 9 years ago

Triage Stage: UnreviewedAccepted

I'm not sure how to explain the behavior. If it's not a bug, please reclassify to a documentation ticket with some explanation.

comment:2 by Balázs Oroszi, 9 years ago

This is most certainly a bug, as getting completely different results in the above case for example:

blog_1.subscribers.all().filter(**queryparams)

and

blog_1.subscribers.all().all().filter(**queryparams)

(and the rest) is buggy behaviour I believe (especially since the second one gives the correct results I believe).
Unfortunately I'm not well versed enough in the internals of Django's ORM to even guess at what could be wrong, so I couldn't come up with anything better than a self-contained sample to reproduce it.
It would be nice if someone else could verify and reproduce the behaviour and having better knowledge about the ORM look into the cause.
Note, that I wrote to both django-users and django-developers first a few months ago, but got no feedback.
https://groups.google.com/forum/#!msg/django-users/tmGdMhGkCGw/NfpEAb_HEAAJ
https://groups.google.com/forum/#!msg/django-developers/gQ_qbMGQsJs/xgYL7BlkFQAJ

Last edited 9 years ago by Balázs Oroszi (previous) (diff)

comment:3 by Amine Yaiche, 9 years ago

Certainly a bug.

What's weird about this is that the SQL query is correct. When i've executed the query in a DBMS it gave me "adam" in the result, while the filter() method gives an empty queryset.

in reply to:  3 comment:4 by Balázs Oroszi, 9 years ago

Which query? The first one? It shouldn't give you any results, as this clause in the first set of queries:

("testapp_subscription"."blog_id" = 1 AND "testapp_subscription"."subscribed_date" > 2016-01-15)

cannot be true (there is only one subscription for "Blog 1", and that is on 2016-01-10).

I also tested the query directly in SQLite, and it seems the SQLite command line program is buggy or something, as it also gave me a result, however when I put quotes around the date above : "2016-01-15", like this:

SELECT "testapp_person"."id", "testapp_person"."name" FROM "testapp_person" INNER JOIN "testapp_subscription" ON ("testapp_person"."id" = "testapp_subscription"."person_id") WHERE ("testapp_subscription"."blog_id" = 1 AND "testapp_subscription"."subscribed_date" > "2016-01-15")

then it correctly didn't give any results. What is incorrect here is the query string itself, the second set of queries in the original bug description are the correct ones!

About the date quotes:
We just might have found another unrelated bug here, namely the quotation around the date, I assume Django is using placeholders internally not the above string literally, but when it generates the string for us to read, it fails to put quotation marks around the date, though I don't know whether that would be correct (or it's simply an SQLite command line bug).

Replying to AmineYaiche:

Certainly a bug.

What's weird about this is that the SQL query is correct. When i've executed the query in a DBMS it gave me "adam" in the result, while the filter() method gives an empty queryset.

comment:5 by Anssi Kääriäinen, 9 years ago

Unfortunately this is by design. The first call to .filter() targets the join generated by the relation, after that new joins are generated. See the sticky filter references in the ORM.

If possible, I'd like to change this at some point. But currently we don't have any way to get the current behavior if sticky filter is removed.

comment:6 by Amine Yaiche, 9 years ago

Cc: yaiche.amin@… added
Note: See TracTickets for help on using tickets.
Back to Top