Opened 4 years ago

Last modified 4 years ago

#31922 closed Bug

Changing ordering of Q statements in filter produces different SQL and results. — at Initial Version

Reported by: Chris Bell Owned by: nobody
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords: orm, subquery, q, order
Cc: Simon Charette Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

This issue has been found when combining two Q statements that filter on the same models in subqueries, but with slightly different where clauses.

By combining the two Q in different orders the SQL resulting from the queries is different, one producing the correct result and one not. The SQL that is not correct has had an additional where clause added to the subquery.

Here is an example:

Two simple models (in module = utils)

class Book(BaseModel):
    name = models.CharField(max_length=100, null=True, blank=True)

class BookStatus(BaseModel):
    book = models.ForeignKey(Book, null=False, blank=False, on_delete=models.CASCADE, related_name='statuses')
    status = models.CharField(max_length=100, null=True, blank=True)
    valid_from = models.DateTimeField(db_index=True)
    valid_to = models.DateTimeField(db_index=True, blank=True, null=True, default=None)

We will add some data:

from django.utils import timezone
from utils.models import *
from django.db.models import Q, Subquery
b1 = Book.objects.create(name='Green is the new Blue')
b2 = Book.objects.create(name='Orange is the new Red')
BookStatus.objects.create(book=b1, status='DRAFT', valid_from=timezone.now())
BookStatus.objects.create(book=b1, status='PUBLISHED', valid_from=timezone.now())
BookStatus.objects.create(book=b2, status='DRAFT', valid_from=timezone.now())

Now we want to find all books that are in draft but not published.

q1 = Q(statuses__id__in=Subquery(BookStatus.objects.only('id').filter(status='DRAFT')))
q2 = ~Q(statuses__id__in=Subquery(BookStatus.objects.only('id').filter(status='PUBLISHED')))
qs = Book.objects.only('name').filter(q1 & q2)  # NOTICE Q1 BEFORE Q2
str(qs.query)

The resulting query is:

SELECT "utils_book"."id",
    "utils_book"."name"
FROM "utils_book"
    INNER JOIN "utils_bookstatus" ON ("utils_book"."id" = "utils_bookstatus"."book_id")
WHERE (
        "utils_bookstatus"."id" IN (
            SELECT U0."id"
            FROM "utils_bookstatus" U0
            WHERE U0."status" = 'DRAFT'
        )
        AND NOT (
            "utils_book"."id" IN (
                SELECT V1."book_id"
                FROM "utils_bookstatus" V1
                WHERE (
                        V1."id" IN (
                            SELECT U0."id"
                            FROM "utils_bookstatus" U0
                            WHERE U0."status" = 'PUBLISHED'
                        )
                        AND V1."id" = ("utils_bookstatus"."id")
                    )
            )
        )
    )

You can see the last where clause which joins the inner book status to the outer one.

The result being that both books are returned:

qs

<QuerySet [<Book: Book object (1)>, <Book: Book object (2)>]>

Now, if we run the same query again, but reverse the order of the Q I get a different result:

qs = Book.objects.only('name').filter(q2 & q1)  # NOTICE Q2 BEFORE Q1

str(qs.query)


SELECT "utils_book"."id",
    "utils_book"."name"
FROM "utils_book"
    INNER JOIN "utils_bookstatus" ON ("utils_book"."id" = "utils_bookstatus"."book_id")
WHERE (
        NOT (
            "utils_book"."id" IN (
                SELECT V1."book_id"
                FROM "utils_bookstatus" V1
                WHERE V1."id" IN (
                        SELECT U0."id"
                        FROM "utils_bookstatus" U0
                        WHERE U0."status" = 'PUBLISHED'
                    )
            )
        )
        AND "utils_bookstatus"."id" IN (
            SELECT U0."id"
            FROM "utils_bookstatus" U0
            WHERE U0."status" = 'DRAFT'
        )
    )

The result of this query is correct:

qs

<QuerySet [<Book: Book object (2)>]>

There are better ways of trying to get the answer this query is looking for, but this is a simplification of a more complex scenario and is for demoing the issue.

Unfortunately, I have not been able to work out why this is happening or where in the code the subquery is being amended.

Regards

Chris

Change History (0)

Note: See TracTickets for help on using tickets.
Back to Top