#34748 closed Bug (fixed)
__in lookup crashes with a subquery containing an unused annotation that uses explicit grouping.
Description ¶
Attached is a small Django project to demonstrate the issue.
Tested on:
Django 4.2.2
OS X 13.4.1
Python 3.9.16
For the Oracle backend:
cx-Oracle 8.3.0 with instantclient 19.8
For the Postgres backend:
psycopg/psycopg-binary 3.1.9
The sample query is in test.py
and corresponding models are included in the attached tarball. A snippet of the query is below:
inner_qs = ( Comment.objects.annotate(question_id=F('choice__question__id')) .values('question_id').annotate(cnt=Count('*')) .values_list('question_id') ) outer_qs = Question.objects.filter(id__in=inner_qs).all() print(outer_qs)
In postgres, this generates a query like so (Also tested and same issue on Oracle):
SELECT "polls_question"."id", "polls_question"."question_text", "polls_question"."pub_date" FROM "polls_question" WHERE "polls_question"."id" IN (SELECT U1."question_id" AS "question_id" FROM "polls_comment" U0 INNER JOIN "polls_choice" U1 ON (U0."choice_id" = U1."id") GROUP BY "polls_choice"."question_id") LIMIT 21
The problem is the GROUP BY clause is not referencing the alias U1
. Instead, it's referencing the table name. This doesn't seem to happen on 3.x, and I believe it's because in 4.x, for this scenario, the group by clause is represented as a reference (Ref
) to the column in the select subquery. Ref
implement a no-op on relabeled_clone, relying on something else to modify the Col
. But this doesn't seem correct because relabeled_clone
is not an in-place change -- it returns a new object. Perhaps it should just do super().relabeled_clone()
instead of a no-op?
I understand this example is a little bit convoluted, since the final values_list
overrides the previous values()
and count aggregation calls, making those useless, but I think this still seems like a bug. I also don't have an example, but suspect Ref::resolve_expression
probably has the same issue where the column which Ref
references may be resolved correctly, but the referenced stored by Ref
won't get resolved.
Change History (9)
by , 20 months ago
Attachment: | mysite.tar.gz added |
---|
comment:1 by , 20 months ago
Given Ref.relabeled_clone
and .get_group_by_cols
are not covered by tests I suspect this might actually be an edge case we never tested for.
Could you possibly try to bisect which commit introduced the regression if it's a regression in 4.2 it would qualify as a release blocker and we could possibly squeeze a corrective for it before 4.2.4 is released on August 1st.
comment:2 by , 20 months ago
Cc: | added |
---|---|
Component: | Uncategorized → Database layer (models, ORM) |
Severity: | Normal → Release blocker |
Summary: | Ref expressions should implement relabeled_clone (and probably resolve_expression, too?) → __in lookup crashes with a subquery containing an unused annotation that uses explicit grouping. |
Triage Stage: | Unreviewed → Accepted |
Type: | Uncategorized → Bug |
Thanks for the report! Regression in dd68af62b2b27ece50d434f6a351877212e15c3f.
I was also able to reproduce this issue without the second values_list()
:
>>> inner_qs = ( Comment.objects.annotate(question_id=F('choice__question__id')) .values('question_id') .annotate(cnt=Count('*')) ) >>> outer_qs = Question.objects.filter(id__in=inner_qs).all() >>> print(outer_qs.query) SELECT "ticket_34738_question"."id", "ticket_34738_question"."question_text", "ticket_34738_question"."pub_date" FROM "ticket_34738_question" WHERE "ticket_34738_question"."id" IN (SELECT U1."question_id" AS "question_id", COUNT(*) AS "cnt" FROM "ticket_34738_comment" U0 INNER JOIN "ticket_34738_choice" U1 ON (U0."choice_id" = U1."id") GROUP BY "ticket_34738_choice"."question_id")
comment:3 by , 20 months ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
comment:4 by , 20 months ago
Has patch: | set |
---|
comment:6 by , 20 months ago
Triage Stage: | Accepted → Ready for checkin |
---|
Sample Django project