Opened 4 years ago

Last modified 18 months ago

#31679 new Bug

Django subtly produces incorrect query when the same keyword appears in both aggregate() and annotate()

Reported by: StefanosChaliasos Owned by: Simon Charette
Component: Database layer (models, ORM) Version: 3.0
Severity: Normal Keywords:
Cc: Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

I have the following query:

  Model.objects.annotate(foo=F('column')).aggregate(foo=Sum(F('foo')))

Initially, I was running this query on SQLite and Django was producing a result (i.e. 0).
When I switched to MySQL this query crushed with the following exception:

  Traceback (most recent call last):
  File "/dir/.env/lib/python3.8/site-packages/django/db/backends/utils.py", line 86, in _execute
    return self.cursor.execute(sql, params)
  File "/dir/.env/lib/python3.8/site-packages/django/db/backends/mysql/base.py", line 74, in execute
    return self.cursor.execute(query, args)
  File "/dir/.env/lib/python3.8/site-packages/MySQLdb/cursors.py", line 209, in execute
    res = self._query(query)
  File "/dir/.env/lib/python3.8/site-packages/MySQLdb/cursors.py", line 315, in _query
    db.query(q)
  File "/dir/.env/lib/python3.8/site-packages/MySQLdb/connections.py", line 239, in query
    _mysql.connection.query(self, query)
MySQLdb._exceptions.OperationalError: (1054, "Unknown column 'foo' in 'field list'")

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "mysql.py", line 15, in <module>
    ret2 = Model.objects.using('mysql').annotate(foo=F('column')).aggregate(foo=Sum(F('foo')))
  File "/dir/.env/lib/python3.8/site-packages/django/db/models/query.py", line 384, in aggregate
    return query.get_aggregation(self.db, kwargs)
  File "/dir/.env/lib/python3.8/site-packages/django/db/models/sql/query.py", line 502, in get_aggregation
    result = compiler.execute_sql(SINGLE)
  File "/dir/.env/lib/python3.8/site-packages/django/db/models/sql/compiler.py", line 1151, in execute_sql
    cursor.execute(sql, params)
  File "/dir/.env/lib/python3.8/site-packages/django/db/backends/utils.py", line 100, in execute
    return super().execute(sql, params)
  File "/dir/.env/lib/python3.8/site-packages/django/db/backends/utils.py", line 68, in execute
    return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
  File "/dir/.env/lib/python3.8/site-packages/django/db/backends/utils.py", line 77, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File "/dir/.env/lib/python3.8/site-packages/django/db/backends/utils.py", line 86, in _execute
    return self.cursor.execute(sql, params)
  File "/dir/.env/lib/python3.8/site-packages/django/db/utils.py", line 90, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/dir/.env/lib/python3.8/site-packages/django/db/backends/utils.py", line 86, in _execute
    return self.cursor.execute(sql, params)
  File "/dir/.env/lib/python3.8/site-packages/django/db/backends/mysql/base.py", line 74, in execute
    return self.cursor.execute(query, args)
  File "/dir/.env/lib/python3.8/site-packages/MySQLdb/cursors.py", line 209, in execute
    res = self._query(query)
  File "/dir/.env/lib/python3.8/site-packages/MySQLdb/cursors.py", line 315, in _query
    db.query(q)
  File "/dir/.env/lib/python3.8/site-packages/MySQLdb/connections.py", line 239, in query
    _mysql.connection.query(self, query)
django.db.utils.OperationalError: (1054, "Unknown column 'foo' in 'field list'")

After examining the SQL query produced by Django, I realized that the query is not correct.
Specifically, Django produced the following SQL query:

  SELECT SUM(`foo`) AS `foo` FROM `model`

Instead of

  SELECT SUM(`foo`) FROM (SELECT `model`.`column` AS `foo` FROM `model`) subquery

The issue appears when a keyword in aggregate function is the same with one of the keywords in annotate function. I initially thought that Django does not have any restriction on the keywords used in aggregate (i.e. a keyword in aggregate can be the same with the name of a column from the inspected model).

If you think that aggregate should not conflict with the annotate, then Django should produce a warning as running this query in SQLite subtly produces an incorrect result.

Change History (17)

comment:1 by Alexandr Tatarinov, 4 years ago

Component: UncategorizedDatabase layer (models, ORM)
Triage Stage: UnreviewedAccepted
Type: UncategorizedBug

Reproduced on PostgreSQL, Django 3.0.7. Not sure, should this work or produce an error.

comment:2 by Jacob Walls, 4 years ago

Easy pickings: set
Owner: changed from nobody to Jacob Walls
Status: newassigned

I would expect this to fail with a ValueError. aggregate() supplies an annotation, and a Query’s annotations are keyed on aliases. Supporting this would take a massive rewrite, it seems to me.

comment:3 by Jacob Walls, 4 years ago

Summary: Django sublty produces incorrect query when the same keyword appears in both aggregate() and annotate()Django subtly produces incorrect query when the same keyword appears in both aggregate() and annotate()

comment:4 by Jacob Walls, 4 years ago

Has patch: set

comment:5 by Mariusz Felisiak, 4 years ago

Easy pickings: unset
Patch needs improvement: set

comment:6 by Jacob Walls, 4 years ago

Has patch: unset
Owner: Jacob Walls removed
Status: assignednew

Unassigning myself because the attached patch was too restrictive.

comment:7 by Herbert Poul, 4 years ago

in reply to:  7 comment:8 by Mariusz Felisiak, 4 years ago

Replying to Herbert Poul:

Is this the same as https://code.djangoproject.com/ticket/31880 (which was fixed by https://github.com/django/django/pull/13431 )?

No it is not a duplicate, see comment.

in reply to:  description comment:9 by Kristijan Mitrovic, 4 years ago

  Model.objects.annotate(foo=F('column')).aggregate(foo=Sum(F('foo')))

So in the example above, where both annotated and aggregated kwargs are named foo, the output should actually contain just the last one, right? The first annotated foo column can just be renamed and then removed from the final result (to avoid confusion).

[Edit] Note that current resulting query might not contain some significant changes within firstly called annotate method as query.annotations dictionary's value is just overridden with the new value for key foo

[Edit2] And then there's an issue about referencing the previous defined foo :( (the one that would have been randomly renamed)

Last edited 4 years ago by Kristijan Mitrovic (previous) (diff)

comment:10 by Jacob Walls, 4 years ago

Patch needs improvement: unset

comment:11 by Simon Charette, 2 years ago

Has patch: set
Last edited 2 years ago by Mariusz Felisiak (previous) (diff)

comment:12 by Simon Charette, 2 years ago

Owner: set to Simon Charette
Status: newassigned

comment:13 by Mariusz Felisiak, 2 years ago

Triage Stage: AcceptedReady for checkin

comment:14 by Mariusz Felisiak <felisiak.mariusz@…>, 2 years ago

Resolution: fixed
Status: assignedclosed

In 1297c0d:

Fixed #31679 -- Delayed annotating aggregations.

By avoiding to annotate aggregations meant to be possibly pushed to an
outer query until their references are resolved it is possible to
aggregate over a query with the same alias.

Even if #34176 is a convoluted case to support, this refactor seems
worth it given the reduction in complexity it brings with regards to
annotation removal when performing a subquery pushdown.

comment:15 by Mariusz Felisiak <felisiak.mariusz@…>, 18 months ago

In 2ee0174:

Refs #34551 -- Fixed QuerySet.aggregate() crash on precending aggregation reference.

Regression in 1297c0d0d76a708017fe196b61a0ab324df76954.

Refs #31679.

comment:16 by Mariusz Felisiak <felisiak.mariusz@…>, 18 months ago

In 57f499e4:

[4.2.x] Refs #34551 -- Fixed QuerySet.aggregate() crash on precending aggregation reference.

Regression in 1297c0d0d76a708017fe196b61a0ab324df76954.

Refs #31679.

Backport of 2ee01747c32a7275a7a1a5f7862acba7db764921 from main

comment:17 by Mariusz Felisiak, 18 months ago

Has patch: unset
Resolution: fixed
Status: closednew
Triage Stage: Ready for checkinAccepted
Note: See TracTickets for help on using tickets.
Back to Top