Opened 11 years ago

Closed 11 years ago

#21192 closed Bug (fixed)

QuerySet.exclude(<many-to-many-field>__<lookup-field>=<value>) and QuerySet.exclude(Q(<many-to-many-field>__<lookup-field>=<value>)) generates different queries. And second one generates an incorrect query.

Reported by: Debanshu Kundu Owned by: nobody
Component: Database layer (models, ORM) Version: 1.5
Severity: Normal Keywords: QuerySet, exclude, Q
Cc: Debanshu Kundu Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Suppose we have an application profiles and profiles.models has two models with following definition:

class ModelB(models.Model):
    pass


class ModelA(models.Model):
    b_objects = models.ManyToManyField(ModelB)

Following are some queries possible on these models and SQL generated by the ORM for each query:


1. ModelA.objects.filter(b_objects__id=1)
SQL:

SELECT "profiles_modela"."id"
FROM "profiles_modela" INNER JOIN "profiles_modela_b_objects" ON ("profiles_modela"."id" = "profiles_modela_b_objects"."modela_id")
WHERE "profiles_modela_b_objects"."modelb_id" = 1
LIMIT 21

2. ModelA.objects.filter(Q(b_objects__id=1))
SQL:

SELECT "profiles_modela"."id"
FROM "profiles_modela" INNER JOIN "profiles_modela_b_objects" ON ("profiles_modela"."id" = "profiles_modela_b_objects"."modela_id")
WHERE "profiles_modela_b_objects"."modelb_id" = 1
LIMIT 21

3. ModelA.objects.exclude(~Q(b_objects__id=1))
SQL:

SELECT "profiles_modela"."id"
FROM "profiles_modela"
WHERE NOT (NOT (("profiles_modela"."id" IN (
    SELECT U1."modela_id"
    FROM "profiles_modela_b_objects" U1
    WHERE (U1."modelb_id" = 1  AND U1."modela_id" IS NOT NULL)
) AND "profiles_modela"."id" IS NOT NULL)))
LIMIT 21

4. ModelA.objects.filter(~Q(b_objects__id=1))
SQL:

SELECT "profiles_modela"."id"
FROM "profiles_modela"
WHERE NOT (("profiles_modela"."id" IN (
    SELECT U1."modela_id"
    FROM "profiles_modela_b_objects" U1
    WHERE (U1."modelb_id" = 1  AND U1."modela_id" IS NOT NULL)
) AND "profiles_modela"."id" IS NOT NULL))
LIMIT 21

5. ModelA.objects.exclude(b_objects__id=1)
SQL:

SELECT "profiles_modela"."id"
FROM "profiles_modela"
WHERE NOT (("profiles_modela"."id" IN (
    SELECT U1."modela_id"
    FROM "profiles_modela_b_objects" U1
    WHERE (U1."modelb_id" = 1  AND U1."modela_id" IS NOT NULL)
) AND "profiles_modela"."id" IS NOT NULL))
LIMIT 21

6. ModelA.objects.exclude(Q(b_objects__id=1))
SQL:

SELECT "profiles_modela"."id"
FROM "profiles_modela" INNER JOIN "profiles_modela_b_objects" ON ("profiles_modela"."id" = "profiles_modela_b_objects"."modela_id")
WHERE NOT ("profiles_modela_b_objects"."modelb_id" = 1 )
LIMIT 21

The first three queries (i.e., 1, 2 and 3) are supposed to return all ModelA objects which have an entry of ModelB object with id=1 in the b_objects field. And they all does (Although 3 is not optimal, but that is not the issue here).

The last three queries (i.e., 4, 5, 6) are supposed to return all ModelA objects which do not have an entry of ModelB object with id=1 in the b_objects field. 4 and 5 does, but 6 dose not, as we can see it has an INNER JOIN on the through table of many-to-many field. Instead, if 6 had an LEFT OUTER JOIN, the query would have worked as expected and would have been more optimal than 4 and 5.

Change History (1)

comment:1 by Anssi Kääriäinen, 11 years ago

Resolution: fixed
Status: newclosed

This has been fixed in the upcoming 1.6 and master. Here are the generated queries:

1: SELECT "queries_modela"."id" FROM "queries_modela" INNER JOIN "queries_modela_b_objects" ON ( "queries_modela"."id" = "queries_modela_b_objects"."modela_id" ) WHERE "queries_modela_b_objects"."modelb_id" = 1
2: SELECT "queries_modela"."id" FROM "queries_modela" INNER JOIN "queries_modela_b_objects" ON ( "queries_modela"."id" = "queries_modela_b_objects"."modela_id" ) WHERE "queries_modela_b_objects"."modelb_id" = 1
3: SELECT "queries_modela"."id" FROM "queries_modela" WHERE NOT (NOT ("queries_modela"."id" IN (SELECT U1."modela_id" AS "modela_id" FROM "queries_modela_b_objects" U1 WHERE U1."modelb_id" = 1)))
4: SELECT "queries_modela"."id" FROM "queries_modela" WHERE NOT ("queries_modela"."id" IN (SELECT U1."modela_id" AS "modela_id" FROM "queries_modela_b_objects" U1 WHERE U1."modelb_id" = 1))
5: SELECT "queries_modela"."id" FROM "queries_modela" WHERE NOT ("queries_modela"."id" IN (SELECT U1."modela_id" AS "modela_id" FROM "queries_modela_b_objects" U1 WHERE U1."modelb_id" = 1))
6: SELECT "queries_modela"."id" FROM "queries_modela" WHERE NOT ("queries_modela"."id" IN (SELECT U1."modela_id" AS "modela_id" FROM "queries_modela_b_objects" U1 WHERE U1."modelb_id" = 1))

As you can see queries 4-6 are identical. Query 3 is not identical to 1, it uses double-negate and subquery. I don't remember exactly the reason why this couldn't be fixed. But there are some tests that need subquery to work correctly (queries/test_ticket7096 for one).

Unfortunately there is no way to backpatch these changes to 1.5, the changes required are just too big.

Note: See TracTickets for help on using tickets.
Back to Top