Opened 8 years ago
Last modified 7 weeks ago
#29291 new Bug
Negated Q expressions across a nullable relationship are not properly handled by When expressions.
| Reported by: | Bo Marchman | Owned by: | |
|---|---|---|---|
| Component: | Database layer (models, ORM) | Version: | dev |
| Severity: | Normal | Keywords: | |
| Cc: | Triage Stage: | Accepted | |
| Has patch: | no | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description (last modified by )
I've run into an issue with conditional expressions using negated Q objects.
I would expect qs1 and qs2 to be equivalent in the following code:
class Application(models.Model): pass class Score(models.Model): application = models.ForeignKey(Application, on_delete=models.PROTECT) reviewed = models.BooleanField() a1 = Application.objects.create() Score.objects.create(reviewed=False, application=a1) Score.objects.create(reviewed=True, application=a1) a2 = Application.objects.create() qs1 = Application.objects.annotate( needs_review=Case( When(~Q(score__reviewed=True), then=V(True)), default=V(False), output_field=BooleanField() ) ).filter(needs_review=True) qs2 = Application.objects.filter( ~Q(score__reviewed=True) ) print(qs1) # <QuerySet [<Application: Application object (1)>, <Application: Application object (2)>]> print(qs2) # <QuerySet [<Application: Application object (2)>]> assert set(qs1) == set(qs2)
The identical ~Q expression is behaving differently in the context of Case/When and filter. Am I completely missing something and this is expected behavior?
This is using Django 2.0.4.
Change History (11)
comment:1 by , 8 years ago
| Description: | modified (diff) |
|---|
comment:2 by , 8 years ago
comment:3 by , 8 years ago
This is on Postgres 9.6.8.
For qs1:
SELECT "test_app_application"."id",
CASE
WHEN NOT ("test_app_score"."reviewed" = TRUE
AND "test_app_score"."reviewed" IS NOT NULL) THEN TRUE
ELSE FALSE
END AS "needs_review"
FROM "test_app_application"
LEFT OUTER JOIN "test_app_score" ON ("test_app_application"."id" = "test_app_score"."application_id")
WHERE CASE
WHEN NOT ("test_app_score"."reviewed" = TRUE
AND "test_app_score"."reviewed" IS NOT NULL) THEN TRUE
ELSE FALSE
END = TRUE
And qs2:
SELECT "test_app_application"."id"
FROM "test_app_application"
WHERE NOT ("test_app_application"."id" IN
(SELECT U1."application_id"
FROM "test_app_score" U1
WHERE U1."reviewed" = TRUE))
comment:4 by , 8 years ago
| Triage Stage: | Unreviewed → Accepted |
|---|---|
| Version: | 2.0 → master |
I feel like qs2 is correct and that q1 should be using a subquery as well to correctly deal with SQL's boolean logic with regards to NULL.
SELECT "test_app_application"."id", CASE WHEN NOT ("test_app_application"."id" IN (SELECT U1."application_id" FROM "test_app_score" U1 WHERE U1."reviewed" = TRUE)) THEN TRUE ELSE FALSE END as "needs_review" FROM "test_app_application" WHERE CASE WHEN NOT ("test_app_application"."id" IN (SELECT U2."application_id" FROM "test_app_score" U2 WHERE U2."reviewed" = TRUE)) THEN TRUE ELSE FALSE END = TRUE
I'm not sure how to convert it into a subquery pushdown though.
comment:5 by , 7 years ago
I think this is just a consequence of the way annotate works on a reverse ForeignKey rather than being a bug in the conditional expression.
In [53]: Application.objects.annotate(
...: needs_review=Case(
...: When(~Q(score__reviewed=True), then=V(True)),
...: default=V(False),
...: output_field=BooleanField()
...: )
...: )
Out[53]: <QuerySet [<Application: Application object (1)>, <Application: Application object (1)>, <Application: Application object (2)>]>
In [54]: Application.objects.annotate(
...: needs_review=Case(
...: When(~Q(score__reviewed=True), then=V(True)),
...: default=V(False),
...: output_field=BooleanField()
...: )
...: ).values_list('needs_review',flat=True)
Out[54]: <QuerySet [True, False, True]>
Or more simply:
In [55]: Application.objects.annotate(score_has_been_reviewed=F('score__reviewed
...: '))
Out[55]: <QuerySet [<Application: Application object (1)>, <Application: Application object (1)>, <Application: Application object (2)>]>
In [56]: Application.objects.annotate(score_has_been_reviewed=F('score__reviewed
...: ')).values_list('score_has_been_reviewed',flat=True)
Out[56]: <QuerySet [False, True, None]>
comment:6 by , 7 years ago
as I can see, the problem is when we try to give the direct conditions to dbnull objects . If the column is nullable, you have to manage it when you use in select or where clauses. q1 query might be changed like this:
qs1 = Application.objects.annotate(
needs_review=Case(
When(~Q(Q(score__reviewed=True) | Q(score__reviewed__isnull=False)), then=V(True)),
default=V(False),
output_field=BooleanField()
)
).filter(needs_review=True)
if we want to make a query like q2, it could be like this:
qs1 = Application.objects.filter(
~Q(id__in=Score.objects.annotate(
needs_review = Case(
When(Q(reviewed=True) | Q(reviewed__isnull=False), then=F('application_id')),
output_field=IntegerField()
)
).values_list('needs_review', flat=True))
)
at this case, we can get rid of the score table's column or join query. q2 is more convenient than this of course.
comment:7 by , 7 years ago
| Summary: | Conditional expressions and ~Q queries → Negated Q expressions across a nullable relationship are not properly handled by When expressions. |
|---|
It's also interesting to see how .filter(~(Q(score__reviewed=True) | Q(score__reviewed=None)) behaves
SELECT "application"."id" FROM "application" WHERE NOT (("application"."id" IN (SELECT U1."application_id" FROM "score" U1 WHERE U1."reviewed" = TRUE) OR "application"."id" IN (SELECT U0."id" FROM "application" U0 LEFT OUTER JOIN "score" U1 ON (U0."id" = U1."application_id") WHERE (U1."reviewed" IS NULL AND U0."id" = ("application"."id")))))
I think the most appropriate solutions at this point is to adjust When SQL compilation to reuse the Query._add_q logic. That will make sure the subquery pushdown is performed if required and make sure implementations don't diverge.
This should be favorable over implementing a new algorithm specific to When given how .filter is already heavily testing the handling of exclusions across nullable relationships
comment:8 by , 7 years ago
I dont agree with you. With one time left join and without subquery solution is pretty performd and enough. Orm doesnt have to solve Null case clauses but user have to.
comment:9 by , 3 months ago
| Has patch: | set |
|---|
comment:10 by , 3 months ago
| Owner: | changed from to |
|---|---|
| Status: | new → assigned |
comment:11 by , 7 weeks ago
| Has patch: | unset |
|---|---|
| Owner: | removed |
| Status: | assigned → new |
Could you possible provide the SQL queries generated by these querysets and which database you a querying against? You can retrieve them by doing
str(queryset.query).