Opened 4 years ago
Closed 4 years ago
#31880 closed Bug (fixed)
QuerySet.aggregate() mixes annotated fields names.
Reported by: | Thodoris Sotiropoulos | Owned by: | David Wobrock |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 3.1 |
Severity: | Normal | Keywords: | |
Cc: | David Wobrock, Simon Charette | Triage Stage: | Ready for checkin |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
I have the following model (simplified for opening the issue).
class T(models.Model): id = models.AutoField(primary=True) foo = models.IntegerField()
and I perform the following query on MySQL (simplified for opening the issue)
T.objects.annotate(anon=F('foo')).aggregate(foo=Max(F('anon')), sum=Sum(F('foo')))
this produces the following SQL query
SELECT MAX(`anon`), SUM(`foo`) FROM (SELECT `table`.`foo` AS `anon` FROM `foo`) subquery
that causes the following exception
django.db.utils.OperationalError: (1054, "Unknown column 'foo' in 'field list'")
I would expect django to generate
SELECT MAX(`anon`), SUM(`anon`) FROM (SELECT `table`.`foo` AS `anon` FROM `table`) subquery
Change History (11)
comment:1 by , 4 years ago
Summary: | Incorrect handling of aggregate's keyword name → QuerySet.aggregate() mixes annotated fields names. |
---|---|
Triage Stage: | Unreviewed → Accepted |
Type: | Uncategorized → Bug |
comment:2 by , 4 years ago
Cc: | added |
---|---|
Has patch: | set |
Owner: | changed from | to
Status: | new → assigned |
Hi,
I tried to tackle this ticket by fixing the actual bug.
The PR
It's surely not that an important bug and one could just use a different alias for the aggregation, but I wanted to get to the bottom.
Aggregations are actually also registering annotations, meaning that the annotation alias is overridden, and the subquery is not aware of the field anymore.
I thought of three ways to solve this:
- raise an error when an aggregation is overriding the alias of an annotation (basically, forbid the buggy case) -- easy to implement, but I think it's a pity to forbid something that could work.
- implement the smallest change that will allow this case -- it's the approach implemented in the linked patch.
- handle differently aggregation annotations and simple annotations -- a large change, which would require quite a rewrite of parts of the aggregation logic and would be a lot of work for such a small edge case I guess.
I hope it makes sense to you and I'm open to changing to any other and better approach :)
David
comment:5 by , 4 years ago
Yeah I can see that better now, sorry. I also checked out the patch and see that it doesn't address #31679 as far as I can tell, so better to have two tickets.
comment:6 by , 4 years ago
Cc: | added |
---|
I would expect that the first kwarg (e.g. age_alias=Sum(F('age')
) overrides the previous annotation (e.g. age_alias=F('age')
) and a FieldError
"Cannot compute Avg('age_alias'): 'age_alias' is an aggregate" is raised, like with annotations:
Author.objects.annotate(age_alias=F('age')).annotate(age_alias=Sum(F('age'))).annotate(avg_age=Avg(F('age_alias')))
,Author.objects.annotate(age_alias=F('age')).annotate(age_alias=Sum(F('age')), avg_age=Avg(F('age_alias')))
.
IMO it's not obvious how such querysets should behave. I think we should raise an error if an aggregate()
overrides and refs to the same alias.
comment:7 by , 4 years ago
Has patch: | unset |
---|
To sum up, I think supporting this edge case isn't worth adding extra complexity, we should raise an error. You can always use different aliases.
comment:8 by , 4 years ago
Has patch: | set |
---|
Hi, thanks for taking the time diving into the issue.
I agree that the expected behaviour is not obvious.
My reflection why it could be supported was that:
- it is possible since the aliases are used in two different queries (the outer and the inner) in the case of an aggregation
- You can always use different aliases. it's true, but with a system that automatically generates QuerySets it might not be easy to "just use different aliases".
However, let's go with an error, I agree that it adds complexity for not a lot of added value. There is a fair chance that the first patch missed edge cases.
I pushed a new patch: https://github.com/django/django/pull/13431
Happy to hear what you think about this one :)
comment:9 by , 4 years ago
I think raising an explicit error at the ORM level is the right call for now. Nothing prevents us from eventually fixing this if we can align on an expected behaviour in the future but users will get a clearer error message in the mean time.
comment:10 by , 4 years ago
Triage Stage: | Accepted → Ready for checkin |
---|
I would rather expect:
Nevertheless we should raise a descriptive error or add a column to the
subquery
. Is there any reason to not useSum(F('anon'))
?