#33929 closed Bug (duplicate)
Field Reference in FilteredRelation Does Not Recognize Previously Defined FilteredRelation
| Reported by: | Matt | Owned by: | nobody |
|---|---|---|---|
| Component: | Database layer (models, ORM) | Version: | 4.1 |
| Severity: | Normal | Keywords: | |
| Cc: | Triage Stage: | Unreviewed | |
| Has patch: | no | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description
I suspect this may be the same root cause as https://code.djangoproject.com/ticket/33766, but the use-case here is different enough I thought I'd log a new ticket.
All this is using Django 4.0 or 4.1, on PostgreSQL. I confess that I have not checked if other DB layers might generate correct SQL.
It appears that I cannot reference one FilteredRelation from another's condition without jumping through some hoops. Starting with the following example models:
from django.db import models
class A(models.Model):
...
class B(models.Model):
a = models.ForeignKey("A", on_delete=models.CASCADE)
complete = models.BooleanField(default=False)
class C(models.Model):
a = models.ForeignKey("A", on_delete=models.CASCADE)
b = models.OneToOneField("B", blank=True, null=True, on_delete=models.CASCADE)
complete = models.BooleanField(default=False)
Now suppose that I want a count of incomplete B, and also incomplete C, but only when related to an incomplete B.
If I were writing SQL myself, I’d write this as:
SELECT COUNT(b.id) as b_count, COUNT(c.id) as c_count FROM a LEFT JOIN b ON b.a_id = a.id AND NOT b.complete LEFT JOIN c ON c.a_id = a.id AND c.b_id = b.id AND NOT c.complete
Now, the below queryset very nearly works:
A.objects.annotate(
binc=FilteredRelation("b", condition=Q(b__complete=False)),
cinc=FilteredRelation("c", condition=Q(c__b=F("binc__pk"), c__complete=False)),
b_count=Count("binc"),
c_count=Count("cinc"),
)
Unfortunately this uses an incorrect table alias into the cinc FilteredRelation, where I tried to reference F("binc__pk"). If I try to execute it, I get
django.db.utils.ProgrammingError: missing FROM-clause entry for table "t4"
LINE 1: ...("a"."id" = cinc."a_id" AND ((cinc."b_id" = (T4."id") A…
There is a workaround: I can force the correct identifier using RawSQL, and use this, which provides correct results:
A.objects.annotate(
binc=FilteredRelation("b", condition=Q(b__complete=False)),
cinc=FilteredRelation("c", condition=Q(c__b=RawSQL("binc.id", ()), c__complete=False)),
b_count=Count("binc"),
c_count=Count("cinc"),
)
Change History (3)
comment:1 by , 3 years ago
comment:2 by , 3 years ago
| Resolution: | → duplicate |
|---|---|
| Status: | new → closed |
Thanks for the report! Agree, this is a duplicate of #33929.
comment:3 by , 3 years ago
I agree with Mariusz and I think this has the exact same root cause as #33766 and detailed in this comment.
The gist of it is that filter relations conditions are resolved very late in the process of SQL compilation (basically when the FROM clause is generated) and thus the resolving of references to filtered relation induced joins alias can be off.
I hope to add some more context after looking this over. It appears as though Matt is correct regarding being related to https://code.djangoproject.com/ticket/33766. The fields are not being properly aliased.
App name in my tests is
interfacewhich I left references of in the console output, but I removed theinterface_prefix from the formatted SQL queries for readability.>>> qs = A.objects.annotate( ... binc=FilteredRelation("b", condition=Q(b__complete=False)), ... cinc=FilteredRelation("c", condition=Q(c__b=F("binc__pk"), c__complete=False)), ... b_count=Count("binc"), ... c_count=Count("cinc"), ... ) >>> qs.query.alias_map {'interface_a': <django.db.models.sql.datastructures.BaseTable object at 0x11095fa60>, 'binc': <django.db.models.sql.datastructures.Join object at 0x11095c6a0>, 'cinc': <django.db.models.sql.datastructures.Join object at 0x11095de10>}Invoking the query compiler adds the naive alias
T4to the query which also later shows up in theQuery.alias_map.>>> str(qs.query) SELECT "a"."id", COUNT(binc."id") AS "b_count", COUNT(cinc."id") AS "c_count" FROM "a" LEFT OUTER JOIN "b" binc ON ("a"."id" = binc."a_id" AND (NOT binc."complete")) LEFT OUTER JOIN "c" cinc ON ("a"."id" = cinc."a_id" AND ((cinc."b_id" = (T4."id") AND NOT cinc."complete"))) GROUP BY "a"."id"Notice the incrementing
Tnon theQuery.alias_mapbut only the last one is referenced in the resulting query:>>> qs.query.alias_map {'interface_a': <django.db.models.sql.datastructures.BaseTable object at 0x11095fa60>, 'binc': <django.db.models.sql.datastructures.Join object at 0x11095c6a0>, 'cinc': <django.db.models.sql.datastructures.Join object at 0x11095de10>, 'T4': <django.db.models.sql.dat astructures.Join object at 0x1077991b0>, 'T5': <django.db.models.sql.datastructures.Join object at 0x10778b400>} >>> str(qs.query) 'SELECT "interface_a"."id", COUNT(binc."id") AS "b_count", COUNT(cinc."id") AS "c_count" FROM "interface_a" LEFT OUTER JOIN "interface_b" binc ON ("interface_a"."id" = binc."a_id" AND (NOT binc."complete")) LEFT OUTER JOIN "interface_c" cinc ON ("interface_a"."id" = cinc."a_id" AND ((cinc."b_id" = (T6."id") AND NOT cinc."complete"))) GROUP BY "interface_a"."id"' >>> str(qs.query) 'SELECT "interface_a"."id", COUNT(binc."id") AS "b_count", COUNT(cinc."id") AS "c_count" FROM "interface_a" LEFT OUTER JOIN "interface_b" binc ON ("interface_a"."id" = binc."a_id" AND (NOT binc."complete")) LEFT OUTER JOIN "interface_c" cinc ON ("interface_a"."id" = cinc."a_id" AND ((cinc."b_id" = (T8."id") AND NOT cinc."complete"))) GROUP BY "interface_a"."id"' >>> qs.query.alias_map {'interface_a': <django.db.models.sql.datastructures.BaseTable object at 0x11095fa60>, 'binc': <django.db.models.sql.datastructures.Join object at 0x11095c6a0>, 'cinc': <django.db.models.sql.datastructures.Join object at 0x11095de10>, 'T4': <django.db.models.sql.dat astructures.Join object at 0x1077991b0>, 'T5': <django.db.models.sql.datastructures.Join object at 0x10778b400>, 'T6': <django.db.models.sql.datastructures.Join object at 0x11099b640>, 'T7': <django.db.models.sql.datastructures.Join object at 0x110834580>, 'T8': <dj ango.db.models.sql.datastructures.Join object at 0x110ad0610>, 'T9': <django.db.models.sql.datastructures.Join object at 0x110ad03a0>} >>> str(qs.query) SELECT "a"."id", COUNT(binc."id") AS "b_count", COUNT(cinc."id") AS "c_count" FROM "a" LEFT OUTER JOIN "b" binc ON ("a"."id" = binc."a_id" AND (NOT binc."complete")) LEFT OUTER JOIN "c" cinc ON ("a"."id" = cinc."a_id" AND ((cinc."b_id" = (T9"id") AND NOT cinc."complete"))) GROUP BY "a"."id"The following works but selects and groups by one additional field
binc_pk. Postgres Query Planner output included. On a larger queryset this could get expensive.A.objects.annotate( binc=FilteredRelation("b", condition=Q(b__complete=False)), binc_pk=F('binc__pk'), # <-- Allows elimination of relying on raw SQL, however... cinc=FilteredRelation("c", condition=Q(c__b=F('binc_pk'), c__complete=False)), b_count=Count("binc"), c_count=Count("cinc"), )SELECT "a"."id", binc."id" AS "binc_pk", /* <-- Ugly */ COUNT(binc."id") AS "b_count", COUNT(cinc."id") AS "c_count" FROM "a" LEFT OUTER JOIN "b" binc ON ("a"."id" = binc."a_id" AND (NOT binc."complete")) LEFT OUTER JOIN "c" cinc ON ("a"."id" = cinc."a_id" AND ((cinc."b_id" = (binc."id") AND NOT cinc."complete"))) GROUP BY "a"."id", binc."id" /* <-- Ugly */Not ideal.
➜ ~ psql -d dj_issue_33929 psql (14.4) Type "help" for help. dj_issue_33929=# EXPLAIN SELECT dj_issue_33929-# "interface_a"."id", dj_issue_33929-# binc."id" AS "binc_pk", dj_issue_33929-# COUNT(binc."id") AS "b_count", dj_issue_33929-# COUNT(cinc."id") AS "c_count" dj_issue_33929-# FROM "interface_a" dj_issue_33929-# LEFT OUTER JOIN "interface_b" binc dj_issue_33929-# ON ("interface_a"."id" = binc."a_id" dj_issue_33929(# AND (NOT binc."complete")) dj_issue_33929-# LEFT OUTER JOIN "interface_c" cinc dj_issue_33929-# ON ("interface_a"."id" = cinc."a_id" dj_issue_33929(# AND ((cinc."b_id" = (binc."id") dj_issue_33929(# AND NOT cinc."complete"))) dj_issue_33929-# GROUP BY "interface_a"."id", dj_issue_33929-# binc."id"; QUERY PLAN --------------------------------------------------------------------------------------- HashAggregate (cost=162.74..185.34 rows=2260 width=32) Group Key: interface_a.id, binc.id -> Hash Left Join (cost=97.80..140.14 rows=2260 width=24) Hash Cond: ((interface_a.id = cinc.a_id) AND (binc.id = cinc.b_id)) -> Hash Right Join (cost=60.85..91.33 rows=2260 width=16) Hash Cond: (binc.a_id = interface_a.id) -> Seq Scan on interface_b binc (cost=0.00..28.10 rows=905 width=16) Filter: (NOT complete) -> Hash (cost=32.60..32.60 rows=2260 width=8) -> Seq Scan on interface_a (cost=0.00..32.60 rows=2260 width=8) -> Hash (cost=25.40..25.40 rows=770 width=24) -> Seq Scan on interface_c cinc (cost=0.00..25.40 rows=770 width=24) Filter: (NOT complete) (13 rows) dj_issue_33929=# EXPLAIN SELECT dj_issue_33929-# "interface_a"."id", dj_issue_33929-# COUNT(binc."id") AS "b_count", dj_issue_33929-# COUNT(cinc."id") AS "c_count" dj_issue_33929-# FROM "interface_a" dj_issue_33929-# LEFT OUTER JOIN "interface_b" binc dj_issue_33929-# ON ("interface_a"."id" = binc."a_id" dj_issue_33929(# AND (NOT binc."complete")) dj_issue_33929-# LEFT OUTER JOIN "interface_c" cinc dj_issue_33929-# ON ("interface_a"."id" = cinc."a_id" dj_issue_33929(# AND ((cinc."b_id" = (binc.id) dj_issue_33929(# AND NOT cinc."complete"))) dj_issue_33929-# GROUP BY "interface_a"."id"; QUERY PLAN --------------------------------------------------------------------------------------- HashAggregate (cost=157.09..179.69 rows=2260 width=24) Group Key: interface_a.id -> Hash Left Join (cost=97.80..140.14 rows=2260 width=24) Hash Cond: ((interface_a.id = cinc.a_id) AND (binc.id = cinc.b_id)) -> Hash Right Join (cost=60.85..91.33 rows=2260 width=16) Hash Cond: (binc.a_id = interface_a.id) -> Seq Scan on interface_b binc (cost=0.00..28.10 rows=905 width=16) Filter: (NOT complete) -> Hash (cost=32.60..32.60 rows=2260 width=8) -> Seq Scan on interface_a (cost=0.00..32.60 rows=2260 width=8) -> Hash (cost=25.40..25.40 rows=770 width=24) -> Seq Scan on interface_c cinc (cost=0.00..25.40 rows=770 width=24) Filter: (NOT complete) (13 rows)Instinctively I looked into the implementation of
OuterReffor ideas, but the implementation is quite contrasting to FilteredRelation.