Opened 7 years ago
Last modified 6 years ago
#29291 new Bug
Negated Q expressions across a nullable relationship are not properly handled by When expressions.
Reported by: | Bo Marchman | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | dev |
Severity: | Normal | Keywords: | |
Cc: | Triage Stage: | Accepted | |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description (last modified by )
I've run into an issue with conditional expressions using negated Q objects.
I would expect qs1
and qs2
to be equivalent in the following code:
class Application(models.Model): pass class Score(models.Model): application = models.ForeignKey(Application, on_delete=models.PROTECT) reviewed = models.BooleanField() a1 = Application.objects.create() Score.objects.create(reviewed=False, application=a1) Score.objects.create(reviewed=True, application=a1) a2 = Application.objects.create() qs1 = Application.objects.annotate( needs_review=Case( When(~Q(score__reviewed=True), then=V(True)), default=V(False), output_field=BooleanField() ) ).filter(needs_review=True) qs2 = Application.objects.filter( ~Q(score__reviewed=True) ) print(qs1) # <QuerySet [<Application: Application object (1)>, <Application: Application object (2)>]> print(qs2) # <QuerySet [<Application: Application object (2)>]> assert set(qs1) == set(qs2)
The identical ~Q
expression is behaving differently in the context of Case/When
and filter
. Am I completely missing something and this is expected behavior?
This is using Django 2.0.4.
Change History (8)
comment:1 by , 7 years ago
Description: | modified (diff) |
---|
comment:2 by , 7 years ago
comment:3 by , 7 years ago
This is on Postgres 9.6.8.
For qs1
:
SELECT "test_app_application"."id", CASE WHEN NOT ("test_app_score"."reviewed" = TRUE AND "test_app_score"."reviewed" IS NOT NULL) THEN TRUE ELSE FALSE END AS "needs_review" FROM "test_app_application" LEFT OUTER JOIN "test_app_score" ON ("test_app_application"."id" = "test_app_score"."application_id") WHERE CASE WHEN NOT ("test_app_score"."reviewed" = TRUE AND "test_app_score"."reviewed" IS NOT NULL) THEN TRUE ELSE FALSE END = TRUE
And qs2
:
SELECT "test_app_application"."id" FROM "test_app_application" WHERE NOT ("test_app_application"."id" IN (SELECT U1."application_id" FROM "test_app_score" U1 WHERE U1."reviewed" = TRUE))
comment:4 by , 7 years ago
Triage Stage: | Unreviewed → Accepted |
---|---|
Version: | 2.0 → master |
I feel like qs2
is correct and that q1
should be using a subquery as well to correctly deal with SQL's boolean logic with regards to NULL
.
SELECT "test_app_application"."id", CASE WHEN NOT ("test_app_application"."id" IN (SELECT U1."application_id" FROM "test_app_score" U1 WHERE U1."reviewed" = TRUE)) THEN TRUE ELSE FALSE END as "needs_review" FROM "test_app_application" WHERE CASE WHEN NOT ("test_app_application"."id" IN (SELECT U2."application_id" FROM "test_app_score" U2 WHERE U2."reviewed" = TRUE)) THEN TRUE ELSE FALSE END = TRUE
I'm not sure how to convert it into a subquery pushdown though.
comment:5 by , 6 years ago
I think this is just a consequence of the way annotate works on a reverse ForeignKey rather than being a bug in the conditional expression.
In [53]: Application.objects.annotate( ...: needs_review=Case( ...: When(~Q(score__reviewed=True), then=V(True)), ...: default=V(False), ...: output_field=BooleanField() ...: ) ...: ) Out[53]: <QuerySet [<Application: Application object (1)>, <Application: Application object (1)>, <Application: Application object (2)>]> In [54]: Application.objects.annotate( ...: needs_review=Case( ...: When(~Q(score__reviewed=True), then=V(True)), ...: default=V(False), ...: output_field=BooleanField() ...: ) ...: ).values_list('needs_review',flat=True) Out[54]: <QuerySet [True, False, True]>
Or more simply:
In [55]: Application.objects.annotate(score_has_been_reviewed=F('score__reviewed ...: ')) Out[55]: <QuerySet [<Application: Application object (1)>, <Application: Application object (1)>, <Application: Application object (2)>]> In [56]: Application.objects.annotate(score_has_been_reviewed=F('score__reviewed ...: ')).values_list('score_has_been_reviewed',flat=True) Out[56]: <QuerySet [False, True, None]>
comment:6 by , 6 years ago
as I can see, the problem is null db objects can use in where clause properly. If your columns is nullable, you have to manage it when you use in select or where clauses. q1 query might be changed like this:
qs1 = Application.objects.annotate( needs_review=Case( When(~Q(Q(score__reviewed=True) | Q(score__reviewed__isnull=False)), then=V(True)), default=V(False), output_field=BooleanField() ) ).filter(needs_review=True)
if we want to make a query like q2, it could be like this:
qs1 = Application.objects.filter( ~Q(id__in=Score.objects.annotate( needs_review = Case( When(Q(reviewed=True) | Q(reviewed__isnull=False), then=F('application_id')), output_field=IntegerField() ) ).values_list('needs_review', flat=True)) )
at this case, we can get rid of the score table's column or join query.
comment:7 by , 6 years ago
Summary: | Conditional expressions and ~Q queries → Negated Q expressions across a nullable relationship are not properly handled by When expressions. |
---|
It's also interesting to see how .filter(~(Q(score__reviewed=True) | Q(score__reviewed=None))
behaves
SELECT "application"."id" FROM "application" WHERE NOT (("application"."id" IN (SELECT U1."application_id" FROM "score" U1 WHERE U1."reviewed" = TRUE) OR "application"."id" IN (SELECT U0."id" FROM "application" U0 LEFT OUTER JOIN "score" U1 ON (U0."id" = U1."application_id") WHERE (U1."reviewed" IS NULL AND U0."id" = ("application"."id")))))
I think the most appropriate solutions at this point is to adjust When
SQL compilation to reuse the Query._add_q
logic. That will make sure the subquery pushdown is performed if required and make sure implementations don't diverge.
This should be favorable over implementing a new algorithm specific to When
given how .filter
is already heavily testing the handling of exclusions across nullable relationships
comment:8 by , 6 years ago
I dont agree with you. With one time left join and without subquery solution is pretty performd and enough. Orm doesnt have to solve Null case clauses but user has to.
Could you possible provide the SQL queries generated by these querysets and which database you a querying against? You can retrieve them by doing
str(queryset.query)
.