Opened 3 months ago

Closed 3 months ago

Last modified 3 months ago

#31568 closed Bug (fixed)

Alias used in aggregate filtering is incorrect.

Reported by: Gagaro Owned by: Simon Charette
Component: Database layer (models, ORM) Version: 3.0
Severity: Release blocker Keywords:
Cc: Simon Charette Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

With the following queryset:

IndicatorValue.objects
.values("freight")
.annotate(
    loading_time=Min("datetime", filter=Q(type=IndicatorValue.TYPE_FREIGHT_CREATED)) - Max("datetime", filter=Q(type=IndicatorValue.TYPE_FREIGHT_COMPLETED)),
    has_top_loading=Exists(OrderItemResult.objects.order_by().filter(order_line__order__freight=OuterRef("freight"), loading_arm__loading_type=LoadingArm.LOADING_TYPE_TOP, ).values('pk')),
    has_bottom_loading=Exists(OrderItemResult.objects.order_by().filter(order_line__order__freight=OuterRef("freight"), loading_arm__loading_type=LoadingArm.LOADING_TYPE_BOTTOM, ).values('pk'))
)
.aggregate(
    top_min=Min("loading_time", filter=Q(has_top_loading=True, has_bottom_loading=False))
)

I get the following SQL generated for the aggregate (notice that both alias used are the same in the SQL, whereas they are not in the queryset):

MIN("loading_time") FILTER (WHERE ("has_top_loading" = false AND "has_top_loading" = true))

The full SQL generated is:

SELECT MIN("loading_time") FILTER (WHERE ("has_top_loading" = false AND "has_top_loading" = true)) FROM (SELECT "indicators_indicatorvalue"."freight_id" AS Col1, (MIN("indicators_indicatorvalue"."datetime") FILTER (WHERE "indicators_indicatorvalue"."type" = \'freight_created\') - MAX("indicators_indicatorvalue"."datetime") FILTER (WHERE "indicators_indicatorvalue"."type" = \'freight_completed\')) AS "loading_time", EXISTS(SELECT U0."id" FROM "orders_orderitemresult" U0 INNER JOIN "loading_terminal_loadingarm" U1 ON (U0."loading_arm_id" = U1."id") INNER JOIN "orders_orderitem" U2 ON (U0."order_line_id" = U2."id") INNER JOIN "orders_order" U3 ON (U2."order_id" = U3."id") WHERE (U1."loading_type" = \'TOP\' AND U3."freight_id" = "indicators_indicatorvalue"."freight_id")) AS "has_top_loading", EXISTS(SELECT U0."id" FROM "orders_orderitemresult" U0 INNER JOIN "loading_terminal_loadingarm" U1 ON (U0."loading_arm_id" = U1."id") INNER JOIN "orders_orderitem" U2 ON (U0."order_line_id" = U2."id") INNER JOIN "orders_order" U3 ON (U2."order_id" = U3."id") WHERE (U1."loading_type" = \'BOTTOM\' AND U3."freight_id" = "indicators_indicatorvalue"."freight_id")) AS "has_bottom_loading" FROM "indicators_indicatorvalue" WHERE "indicators_indicatorvalue"."deleted" IS NULL GROUP BY "indicators_indicatorvalue"."freight_id", "has_top_loading", "has_bottom_loading") subquery

It works fine with Django 2.2 (which does not use alias there if I'm not mistaken).

Change History (8)

comment:1 Changed 3 months ago by Simon Charette

Cc: Simon Charette added
Owner: changed from nobody to Simon Charette
Status: newassigned

Thank you for your report. Did you manage to reproduce against Django 3.0.5 as well? If it's the case could you try reducing your model set and queryset interactions to a minimal that still trigger the issue. That'll help tremendously in reproducing the regression and ensure it gets addressed in a timely manner. Thanks!

comment:2 Changed 3 months ago by felixxm

Severity: NormalRelease blocker
Summary: Alias used in aggregate filtering is incorrectAlias used in aggregate filtering is incorrect.
Triage Stage: UnreviewedAccepted

Thanks for this ticket.

Regression in 691def10a0197d83d2d108bd9043b0916d0f09b4.
Reproduced at 46fe506445666d8097945f0c1e8be11cfd644b28.

comment:3 in reply to:  1 Changed 3 months ago by Gagaro

Replying to Simon Charette:

Thank you for your report. Did you manage to reproduce against Django 3.0.5 as well? If it's the case could you try reducing your model set and queryset interactions to a minimal that still trigger the issue. That'll help tremendously in reproducing the regression and ensure it gets addressed in a timely manner. Thanks!

I tried to reduce it as much as I could in my context without having to try with whole new models. Do you still need me to do that as felixxm apparently was able to reproduce it?

Thanks.

comment:4 Changed 3 months ago by Simon Charette

Submitted a patch but we should also fix Subquery.__eq__ which was broken by 691def10a0197d83d2d108bd9043b0916d0f09b4. Right now Subquery(qs1) == Subquery(qs2).

comment:5 Changed 3 months ago by Mariusz Felisiak <felisiak.mariusz@…>

Resolution: fixed
Status: assignedclosed

In adfbf653:

Fixed #31568 -- Fixed alias reference when aggregating over multiple subqueries.

691def10a0197d83d2d108bd9043b0916d0f09b4 made all Subquery() instances
equal to each other which broke aggregation subquery pushdown which
relied on object equality to determine which alias it should select.

Subquery.eq() will be fixed in an another commit but
Query.rewrite_cols() should haved used object identity from the start.

Refs #30727, #30188.

Thanks Makina Corpus for the report.

comment:6 Changed 3 months ago by Mariusz Felisiak <felisiak.mariusz@…>

In 3913acdb:

[3.1.x] Fixed #31568 -- Fixed alias reference when aggregating over multiple subqueries.

691def10a0197d83d2d108bd9043b0916d0f09b4 made all Subquery() instances
equal to each other which broke aggregation subquery pushdown which
relied on object equality to determine which alias it should select.

Subquery.eq() will be fixed in an another commit but
Query.rewrite_cols() should haved used object identity from the start.

Refs #30727, #30188.

Thanks Makina Corpus for the report.

Backport of adfbf653dc1c1d0e0dacc4ed46602d22ba28b004 from master

comment:7 Changed 3 months ago by Mariusz Felisiak <felisiak.mariusz@…>

In 49bbf657:

[3.0.x] Fixed #31568 -- Fixed alias reference when aggregating over multiple subqueries.

691def10a0197d83d2d108bd9043b0916d0f09b4 made all Subquery() instances
equal to each other which broke aggregation subquery pushdown which
relied on object equality to determine which alias it should select.

Subquery.eq() will be fixed in an another commit but
Query.rewrite_cols() should haved used object identity from the start.

Refs #30727, #30188.

Thanks Makina Corpus for the report.

Backport of adfbf653dc1c1d0e0dacc4ed46602d22ba28b004 from master

comment:8 in reply to:  4 Changed 3 months ago by felixxm

Replying to Simon Charette:

Submitted a patch but we should also fix Subquery.__eq__ which was broken by 691def10a0197d83d2d108bd9043b0916d0f09b4. Right now Subquery(qs1) == Subquery(qs2).

I will create a separate issue for this.

Note: See TracTickets for help on using tickets.
Back to Top