Opened 6 months ago

Closed 6 months ago

Last modified 5 months ago

#34975 closed Bug (fixed)

aggregate() crashes when referencing existing aggregations or window expressions through conditional expressions

Reported by: Sergey Nesterenko Owned by: Simon Charette
Component: Database layer (models, ORM) Version: 4.2
Severity: Release blocker Keywords: QuerySet, Window, Aggregate, F
Cc: Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by Sergey Nesterenko)

I found out that getting refs in QuerySet doesn't work with models.Window when I try to use aggregate method. It happened when I tried to update Django to 4.2.7.

For example:

MyModel.objects.annotate(  
	new_ordering=Window(RowNumber(),  order_by=[F('ordering')])  
).aggregate(  
	new_count=Count('id', filter=Q(ordering=F('new_ordering')))
)

And as a result, it fails with:

Traceback (most recent call last):
  File "/main.py", line 38, in <module>
    result = ordering_query.aggregate(
  File "/python3.10/site-packages/django/db/models/query.py", line 592, in aggregate
    return self.query.chain().get_aggregation(self.db, kwargs)
  File "/python3.10/site-packages/django/db/models/sql/query.py", line 398, in get_aggregation
    aggregate = aggregate_expr.resolve_expression(
  File "/python3.10/site-packages/django/db/models/aggregates.py", line 70, in resolve_expression
    for ref in c.get_refs():
  File "/python3.10/site-packages/django/db/models/expressions.py", line 418, in get_refs
    refs |= expr.get_refs()
  File "/python3.10/site-packages/django/db/models/sql/where.py", line 233, in get_refs
    refs |= child.get_refs()
  File "/python3.10/site-packages/django/db/models/expressions.py", line 418, in get_refs
    refs |= expr.get_refs()
  File "/python3.10/site-packages/django/db/models/expressions.py", line 418, in get_refs
    refs |= expr.get_refs()
AttributeError: 'NoneType' object has no attribute 'get_refs'

It happens because of Window.get_source_expressions return:

def get_source_expressions(self):  
	return [self.source_expression, self.partition_by, self.order_by, self.frame]

and if we don't specify 'frame' (for example) it will be None in the list.

And in models.aggregates.Aggregate.resolve_expression when we try to find all refs https://github.com/django/django/blob/47f9b8dca16b1fdc054b338d81eb080eabad0edf/django/db/models/aggregates.py#L70C30-L70C30 it fails because of we have None in Window.get_source_expressions

def get_refs(self):  
	refs = set()  
	for expr in self.get_source_expressions():  
		refs |= expr.get_refs()  
	return refs

Change History (17)

comment:1 by Sergey Nesterenko, 6 months ago

Description: modified (diff)

comment:2 by Sergey Nesterenko, 6 months ago

Summary: Getting refs is not work properly with models.Window and aggregationGetting refs does not work properly with models.Window and aggregation

comment:3 by Simon Charette, 6 months ago

Triage Stage: UnreviewedAccepted

Another instance of a crash due to mishandling of get_source_expressions() being Expression | None.

comment:4 by Simon Charette, 6 months ago

I'm not sure if something else will break as I don't think we have extensive testing for performing filtered aggregation over a window function but does the following patch helps

diff --git a/django/db/models/expressions.py b/django/db/models/expressions.py
index 3a0c75ebf2..74ae9cab8e 100644
--- a/django/db/models/expressions.py
+++ b/django/db/models/expressions.py
@@ -417,6 +417,8 @@ def replace_expressions(self, replacements):
     def get_refs(self):
         refs = set()
         for expr in self.get_source_expressions():
+            if expr is None:
+                continue
             refs |= expr.get_refs()
         return refs

in reply to:  4 comment:5 by Sergey Nesterenko, 6 months ago

Replying to Simon Charette:

I'm not sure if something else will break as I don't think we have extensive testing for performing filtered aggregation over a window function but does the following patch helps

diff --git a/django/db/models/expressions.py b/django/db/models/expressions.py
index 3a0c75ebf2..74ae9cab8e 100644
--- a/django/db/models/expressions.py
+++ b/django/db/models/expressions.py
@@ -417,6 +417,8 @@ def replace_expressions(self, replacements):
     def get_refs(self):
         refs = set()
         for expr in self.get_source_expressions():
+            if expr is None:
+                continue
             refs |= expr.get_refs()
         return refs

I thought about it, but I wasn't sure it wouldn't break anything, but it might work.

comment:6 by Simon Charette, 6 months ago

Owner: changed from nobody to Simon Charette
Status: newassigned

comment:7 by Simon Charette, 6 months ago

I've got an idea of how to solve this, it's not clear to me if it's a regression as the underlying cause has little to do with window functions.

In the mean time, can you confirm that switching the order of your lookup resolve the issue. That is doing filter=Q(new_ordering=F('ordering')) instead of filter=Q(ordering=F('new_ordering')).

comment:8 by Simon Charette, 6 months ago

Severity: NormalRelease blocker

Confirmed regression in b181cae2e3697b2e53b5b67ac67e59f3b05a6f0d, refs #25307.

comment:9 by Simon Charette, 6 months ago

Has patch: set

comment:10 by Simon Charette, 6 months ago

Summary: Getting refs does not work properly with models.Window and aggregationaggregate() crashes when referencing existing aggregations or window expressions through conditional expressions

in reply to:  7 comment:11 by Sergey Nesterenko, 6 months ago

Replying to Simon Charette:

filter=Q(new_ordering=F('ordering')) instead of filter=Q(ordering=F('new_ordering')).

Yes, it solves my problem. Thankx!

UPD: filter=Q(new_ordering=F('ordering'), ordering__isnull=False)

Last edited 5 months ago by Sergey Nesterenko (previous) (diff)

comment:12 by Mariusz Felisiak, 6 months ago

Triage Stage: AcceptedReady for checkin

comment:13 by Mariusz Felisiak <felisiak.mariusz@…>, 6 months ago

Resolution: fixed
Status: assignedclosed

In 7530cf39:

Fixed #34975 -- Fixed crash of conditional aggregate() over aggregations.

Adjustments made to solve_lookup_type to defer the resolving of
references for summarized aggregates failed to account for similar
requirements for lookup values which can also reference annotations
through Aggregate.filter.

Regression in b181cae2e3697b2e53b5b67ac67e59f3b05a6f0d.

Refs #25307.

Thanks Sergey Nesterenko for the report.

comment:14 by Mariusz Felisiak <felisiak.mariusz@…>, 6 months ago

In 15cb3c26:

Refs #34975 -- Complemented rhs filtering aggregations for in lookup.

While this isn't a regression it's clear that similar logic should be
applied when dealing with lists of expressions passed as a lookup value.

comment:15 by Mariusz Felisiak <felisiak.mariusz@…>, 6 months ago

In 911b161:

Refs #34975 -- Handled optional source expressions in Expression.get_refs().

While no code is directly exercising get_refs in a way that triggers
a crash some expressions such as Window stash None in source_expressions
which can obscure the origin of some bugs.

Handling None values like we do in other source_expression traversing
methods such as .contains_aggregates ensures we don't run into surprises
in the future where get_refs() might be used for a different purpose.

comment:16 by Mariusz Felisiak <felisiak.mariusz@…>, 6 months ago

In 49f1ced8:

[5.0.x] Fixed #34975 -- Fixed crash of conditional aggregate() over aggregations.

Adjustments made to solve_lookup_type to defer the resolving of
references for summarized aggregates failed to account for similar
requirements for lookup values which can also reference annotations
through Aggregate.filter.

Regression in b181cae2e3697b2e53b5b67ac67e59f3b05a6f0d.

Refs #25307.

Thanks Sergey Nesterenko for the report.

Backport of 7530cf3900ab98104edcde69e8a2a415e82b345a from main

comment:17 by Mariusz Felisiak <felisiak.mariusz@…>, 6 months ago

In acf4cee:

[4.2.x] Fixed #34975 -- Fixed crash of conditional aggregate() over aggregations.

Adjustments made to solve_lookup_type to defer the resolving of
references for summarized aggregates failed to account for similar
requirements for lookup values which can also reference annotations
through Aggregate.filter.

Regression in b181cae2e3697b2e53b5b67ac67e59f3b05a6f0d.

Refs #25307.

Thanks Sergey Nesterenko for the report.

Backport of 7530cf3900ab98104edcde69e8a2a415e82b345a from main

Note: See TracTickets for help on using tickets.
Back to Top