Opened 2 years ago

Last modified 18 months ago

#29291 new Bug

Negated Q expressions across a nullable relationship are not properly handled by When expressions.

Reported by: Bo Marchman Owned by: nobody
Component: Database layer (models, ORM) Version: master
Severity: Normal Keywords:
Cc: Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by Bo Marchman)

I've run into an issue with conditional expressions using negated Q objects.

I would expect qs1 and qs2 to be equivalent in the following code:

class Application(models.Model):
    pass

class Score(models.Model):
    application = models.ForeignKey(Application, on_delete=models.PROTECT)
    reviewed = models.BooleanField()

a1 = Application.objects.create()
Score.objects.create(reviewed=False, application=a1)
Score.objects.create(reviewed=True, application=a1)

a2 = Application.objects.create()

qs1 = Application.objects.annotate(
    needs_review=Case(
        When(~Q(score__reviewed=True), then=V(True)),
        default=V(False),
        output_field=BooleanField()
    )
).filter(needs_review=True)

qs2 = Application.objects.filter(
    ~Q(score__reviewed=True)
)

print(qs1) # <QuerySet [<Application: Application object (1)>, <Application: Application object (2)>]>
print(qs2) # <QuerySet [<Application: Application object (2)>]>

assert set(qs1) == set(qs2) 

The identical ~Q expression is behaving differently in the context of Case/When and filter. Am I completely missing something and this is expected behavior?

This is using Django 2.0.4.

Change History (8)

comment:1 Changed 2 years ago by Bo Marchman

Description: modified (diff)

comment:2 Changed 2 years ago by Simon Charette

Could you possible provide the SQL queries generated by these querysets and which database you a querying against? You can retrieve them by doing str(queryset.query).

comment:3 Changed 2 years ago by Bo Marchman

This is on Postgres 9.6.8.

For qs1:

SELECT "test_app_application"."id",
       CASE
           WHEN NOT ("test_app_score"."reviewed" = TRUE
                     AND "test_app_score"."reviewed" IS NOT NULL) THEN TRUE
           ELSE FALSE
       END AS "needs_review"
FROM "test_app_application"
LEFT OUTER JOIN "test_app_score" ON ("test_app_application"."id" = "test_app_score"."application_id")
WHERE CASE
          WHEN NOT ("test_app_score"."reviewed" = TRUE
                    AND "test_app_score"."reviewed" IS NOT NULL) THEN TRUE
          ELSE FALSE
      END = TRUE

And qs2:

SELECT "test_app_application"."id"
FROM "test_app_application"
WHERE NOT ("test_app_application"."id" IN
             (SELECT U1."application_id"
              FROM "test_app_score" U1
              WHERE U1."reviewed" = TRUE))

comment:4 Changed 2 years ago by Simon Charette

Triage Stage: UnreviewedAccepted
Version: 2.0master

I feel like qs2 is correct and that q1 should be using a subquery as well to correctly deal with SQL's boolean logic with regards to NULL.

SELECT "test_app_application"."id",
    CASE
          WHEN NOT ("test_app_application"."id" IN
             (SELECT U1."application_id"
              FROM "test_app_score" U1
              WHERE U1."reviewed" = TRUE)) THEN TRUE
          ELSE FALSE
      END as "needs_review"
FROM "test_app_application"
WHERE CASE
          WHEN NOT ("test_app_application"."id" IN
             (SELECT U2."application_id"
              FROM "test_app_score" U2
              WHERE U2."reviewed" = TRUE)) THEN TRUE
          ELSE FALSE
      END = TRUE

I'm not sure how to convert it into a subquery pushdown though.

comment:5 Changed 2 years ago by Robert Sparks

I think this is just a consequence of the way annotate works on a reverse ForeignKey rather than being a bug in the conditional expression.

In [53]: Application.objects.annotate(
    ...:     needs_review=Case(
    ...:         When(~Q(score__reviewed=True), then=V(True)),
    ...:         default=V(False),
    ...:         output_field=BooleanField()
    ...:     )
    ...: )
Out[53]: <QuerySet [<Application: Application object (1)>, <Application: Application object (1)>, <Application: Application object (2)>]>

In [54]: Application.objects.annotate(
    ...:     needs_review=Case(
    ...:         When(~Q(score__reviewed=True), then=V(True)),
    ...:         default=V(False),
    ...:         output_field=BooleanField()
    ...:     )
    ...: ).values_list('needs_review',flat=True)
Out[54]: <QuerySet [True, False, True]>

Or more simply:

In [55]: Application.objects.annotate(score_has_been_reviewed=F('score__reviewed
    ...: '))
Out[55]: <QuerySet [<Application: Application object (1)>, <Application: Application object (1)>, <Application: Application object (2)>]>

In [56]: Application.objects.annotate(score_has_been_reviewed=F('score__reviewed
    ...: ')).values_list('score_has_been_reviewed',flat=True)
Out[56]: <QuerySet [False, True, None]>

comment:6 Changed 18 months ago by Can Sarıgöl

as I can see, the problem is when we try to give the direct conditions to dbnull objects . If the column is nullable, you have to manage it when you use in select or where clauses. q1 query might be changed like this:

qs1 = Application.objects.annotate(
        needs_review=Case(
                When(~Q(Q(score__reviewed=True) | Q(score__reviewed__isnull=False)), then=V(True)),
                default=V(False),
                output_field=BooleanField()
            )
        ).filter(needs_review=True)

if we want to make a query like q2, it could be like this:

qs1 = Application.objects.filter(
            ~Q(id__in=Score.objects.annotate(
                needs_review = Case(
                    When(Q(reviewed=True) | Q(reviewed__isnull=False), then=F('application_id')),
                    output_field=IntegerField()
                )
            ).values_list('needs_review', flat=True))
        )

at this case, we can get rid of the score table's column or join query. q2 is more convenient than this of course.

Last edited 18 months ago by Can Sarıgöl (previous) (diff)

comment:7 Changed 18 months ago by Simon Charette

Summary: Conditional expressions and ~Q queriesNegated Q expressions across a nullable relationship are not properly handled by When expressions.

It's also interesting to see how .filter(~(Q(score__reviewed=True) | Q(score__reviewed=None)) behaves

SELECT "application"."id"
FROM "application"
WHERE NOT (("application"."id" IN
              (SELECT U1."application_id"
               FROM "score" U1
               WHERE U1."reviewed" = TRUE)
            OR "application"."id" IN
              (SELECT U0."id"
               FROM "application" U0
               LEFT OUTER JOIN "score" U1 ON (U0."id" = U1."application_id")
               WHERE (U1."reviewed" IS NULL
                      AND U0."id" = ("application"."id")))))

I think the most appropriate solutions at this point is to adjust When SQL compilation to reuse the Query._add_q logic. That will make sure the subquery pushdown is performed if required and make sure implementations don't diverge.

This should be favorable over implementing a new algorithm specific to When given how .filter is already heavily testing the handling of exclusions across nullable relationships

comment:8 Changed 18 months ago by Can Sarıgöl

I dont agree with you. With one time left join and without subquery solution is pretty performd and enough. Orm doesnt have to solve Null case clauses but user has to.

Last edited 18 months ago by Can Sarıgöl (previous) (diff)
Note: See TracTickets for help on using tickets.
Back to Top