Opened 6 years ago
Closed 6 years ago
#31136 closed Bug (fixed)
Multiple annotations with Exists() should not group by aliases.
| Reported by: | Sigurd Ljødal | Owned by: | Mariusz Felisiak |
|---|---|---|---|
| Component: | Database layer (models, ORM) | Version: | 3.0 |
| Severity: | Release blocker | Keywords: | |
| Cc: | Simon Charette, jon.dufresne@… | Triage Stage: | Ready for checkin |
| Has patch: | yes | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description
There's a regression in Django 3.0 where an annotation that is not selected is included in the group by section of the query on PostgreSQL. I looks a bit like https://code.djangoproject.com/ticket/31109, but it is still present in Django 3.0.2 (I haven't tested earlier versions yet). I haven't had time to look into where the incorrect group by cause is generated yet.
I have created a test case to reproduce the issue here: https://github.com/django/django/compare/master...ljodal:regression/annotate-groupby-postgresql
It only crashes when running against PostgreSQL. Here's the test case query:
latest_book_pubdate_qs = Book.objects.filter( publisher=OuterRef('pk') ).values_list('id') publisher_qs = Publisher.objects.values('pk').annotate( has_published_books=Exists(latest_book_pubdate_qs), ).filter( has_published_books=True ).annotate( book_count=Count('book') ).values_list('id')
The generated query is this:
SELECT "aggregation_publisher"."id" FROM "aggregation_publisher" LEFT OUTER JOIN "aggregation_book" ON ("aggregation_publisher"."id" = "aggregation_book"."publisher_id") WHERE EXISTS(SELECT U0."id" FROM "aggregation_book" U0 WHERE U0."publisher_id" = "aggregation_publisher"."id") GROUP BY "aggregation_publisher"."id", "has_published_books"
Which crashes because "has_published_books" in the "GROUP BY" clause is not a column on the table:
Traceback (most recent call last):
File "/Users/sigurdljodal/Code/django/tests/aggregation/tests.py", line 1166, in test_aggregation_annotation
print(list(publisher_qs))
File "/Users/sigurdljodal/Code/django/django/db/models/query.py", line 270, in __len__
self._fetch_all()
File "/Users/sigurdljodal/Code/django/django/db/models/query.py", line 1284, in _fetch_all
self._result_cache = list(self._iterable_class(self))
File "/Users/sigurdljodal/Code/django/django/db/models/query.py", line 144, in __iter__
return compiler.results_iter(tuple_expected=True, chunked_fetch=self.chunked_fetch, chunk_size=self.chunk_size)
File "/Users/sigurdljodal/Code/django/django/db/models/sql/compiler.py", line 1085, in results_iter
results = self.execute_sql(MULTI, chunked_fetch=chunked_fetch, chunk_size=chunk_size)
File "/Users/sigurdljodal/Code/django/django/db/models/sql/compiler.py", line 1133, in execute_sql
cursor.execute(sql, params)
File "/Users/sigurdljodal/Code/django/django/db/backends/utils.py", line 66, in execute
return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
File "/Users/sigurdljodal/Code/django/django/db/backends/utils.py", line 75, in _execute_with_wrappers
return executor(sql, params, many, context)
File "/Users/sigurdljodal/Code/django/django/db/backends/utils.py", line 84, in _execute
return self.cursor.execute(sql, params)
File "/Users/sigurdljodal/Code/django/django/db/utils.py", line 90, in __exit__
raise dj_exc_value.with_traceback(traceback) from exc_value
File "/Users/sigurdljodal/Code/django/django/db/backends/utils.py", line 84, in _execute
return self.cursor.execute(sql, params)
django.db.utils.ProgrammingError: column "has_published_books" does not exist
LINE 1: ...her"."id") GROUP BY "aggregation_publisher"."id", "has_publi...
Change History (15)
comment:1 by , 6 years ago
| Cc: | added |
|---|---|
| Severity: | Normal → Release blocker |
| Summary: | Group by with annotation regression on PostgreSQL → Multiple annotations with Exists() should not group by aliases. |
| Triage Stage: | Unreviewed → Accepted |
comment:2 by , 6 years ago
| Owner: | changed from to |
|---|---|
| Status: | new → assigned |
comment:6 by , 6 years ago
| Resolution: | fixed |
|---|---|
| Status: | closed → new |
I found a regression in this fix when I was trying to reproduce #31150, see
diff --git a/tests/aggregation/tests.py b/tests/aggregation/tests.py
index f523b3ca35..0209a9221b 100644
--- a/tests/aggregation/tests.py
+++ b/tests/aggregation/tests.py
@@ -1165,6 +1165,27 @@ class AggregateTestCase(TestCase):
'Sams',
])
+ def test_aggregation_subquery_annotation_distinct(self):
+ books_qs = Book.objects.annotate(
+ first_author_the_same_age=Subquery(
+ Author.objects.filter(
+ age=OuterRef('contact__friends__age'),
+ ).order_by('age').values('id')[:1],
+ )
+ ).filter(
+ publisher=self.p1,
+ first_author_the_same_age__isnull=False,
+ ).annotate(
+ min_age=Min('contact__friends__age'),
+ ).values('name', 'min_age').distinct()
+ self.assertEqual(list(books_qs),[
+ {
+ 'name': 'The Definitive Guide to Django: Web Development Done Right',
+ 'min_age': 29,
+ },
+ {'name': 'Practical Django Projects', 'min_age': 34},
+ ])
+
@skipUnlessDBFeature('supports_subqueries_in_group_by')
def test_group_by_subquery_annotation(self):
"""
comment:8 by , 6 years ago
| Triage Stage: | Accepted → Ready for checkin |
|---|
comment:11 by , 6 years ago
| Resolution: | → fixed |
|---|---|
| Status: | new → closed |
comment:12 by , 6 years ago
| Cc: | added |
|---|
Commit a24686987f21fe6b5339cc13607c13fa906f81a8 causes a regression for my application. I'll try to pin down a minimal example but for now, here is the backtrace:
Traceback (most recent call last):
<MY APP>
File ".../django/db/models/query.py", line 276, in __iter__
self._fetch_all()
File ".../django/db/models/query.py", line 1261, in _fetch_all
self._result_cache = list(self._iterable_class(self))
File ".../django/db/models/query.py", line 184, in __iter__
for row in compiler.results_iter(chunked_fetch=self.chunked_fetch, chunk_size=self.chunk_size):
File ".../django/db/models/sql/compiler.py", line 1096, in results_iter
results = self.execute_sql(MULTI, chunked_fetch=chunked_fetch, chunk_size=chunk_size)
File ".../django/db/models/sql/compiler.py", line 1144, in execute_sql
cursor.execute(sql, params)
File ".../django/db/backends/utils.py", line 68, in execute
return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
File ".../django/db/backends/utils.py", line 77, in _execute_with_wrappers
return executor(sql, params, many, context)
File ".../django/db/backends/utils.py", line 86, in _execute
return self.cursor.execute(sql, params)
File ".../django/db/utils.py", line 90, in __exit__
raise dj_exc_value.with_traceback(traceback) from exc_value
File ".../django/db/backends/utils.py", line 86, in _execute
return self.cursor.execute(sql, params)
django.db.utils.ProgrammingError: column "myapp_mymodel.myfield" must appear in the GROUP BY clause or be used in an aggregate function
comment:13 by , 6 years ago
| Resolution: | fixed |
|---|---|
| Status: | closed → new |
Here is a POC test to demonstrate the regression. Fails with the postgresql backend.
diff --git a/tests/aggregation/models.py b/tests/aggregation/models.py
index fd441fe51d..211a96d97c 100644
--- a/tests/aggregation/models.py
+++ b/tests/aggregation/models.py
@@ -42,3 +42,17 @@ class Store(models.Model):
def __str__(self):
return self.name
+
+
+class ThisOrThat(models.Model):
+ pass
+
+
+class This(models.Model):
+ thisorthat = models.ForeignKey("ThisOrThat", models.CASCADE)
+ created_at = models.DateTimeField(auto_now_add=True)
+
+
+class That(models.Model):
+ thisorthat = models.OneToOneField("ThisOrThat", models.CASCADE)
+ created_at = models.DateTimeField(auto_now_add=True)
diff --git a/tests/aggregation/tests.py b/tests/aggregation/tests.py
index bef415abd4..508b9cf687 100644
--- a/tests/aggregation/tests.py
+++ b/tests/aggregation/tests.py
@@ -1239,3 +1239,20 @@ class AggregateTestCase(TestCase):
contact_publisher__isnull=False,
).annotate(count=Count('authors'))
self.assertSequenceEqual(books_qs, [book])
+
+ def test_regression(self):
+ from .models import ThisOrThat
+ from django.db.models.aggregates import Max
+ from django.db.models.functions import Coalesce
+
+ qs = (
+ ThisOrThat.objects.annotate(
+ Max("this__created_at"),
+ created_at=Coalesce(
+ "this__created_at__max", "that__created_at"
+ ),
+ )
+ .order_by("created_at")
+ .values_list("pk", flat=True)
+ )
+ list(qs)
Running this test results in the following error:
======================================================================
ERROR: test_regression (aggregation.tests.AggregateTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/jon/devel/django/django/db/backends/utils.py", line 86, in _execute
return self.cursor.execute(sql, params)
psycopg2.errors.GroupingError: column "aggregation_that.created_at" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: ...BY COALESCE(MAX("aggregation_this"."created_at"), "aggregati...
^
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/jon/devel/django/tests/aggregation/tests.py", line 1258, in test_regression
list(qs)
File "/home/jon/devel/django/django/db/models/query.py", line 258, in __len__
self._fetch_all()
File "/home/jon/devel/django/django/db/models/query.py", line 1261, in _fetch_all
self._result_cache = list(self._iterable_class(self))
File "/home/jon/devel/django/django/db/models/query.py", line 184, in __iter__
for row in compiler.results_iter(chunked_fetch=self.chunked_fetch, chunk_size=self.chunk_size):
File "/home/jon/devel/django/django/db/models/sql/compiler.py", line 1096, in results_iter
results = self.execute_sql(MULTI, chunked_fetch=chunked_fetch, chunk_size=chunk_size)
File "/home/jon/devel/django/django/db/models/sql/compiler.py", line 1144, in execute_sql
cursor.execute(sql, params)
File "/home/jon/devel/django/django/db/backends/utils.py", line 68, in execute
return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
File "/home/jon/devel/django/django/db/backends/utils.py", line 77, in _execute_with_wrappers
return executor(sql, params, many, context)
File "/home/jon/devel/django/django/db/backends/utils.py", line 86, in _execute
return self.cursor.execute(sql, params)
File "/home/jon/devel/django/django/db/utils.py", line 90, in __exit__
raise dj_exc_value.with_traceback(traceback) from exc_value
File "/home/jon/devel/django/django/db/backends/utils.py", line 86, in _execute
return self.cursor.execute(sql, params)
django.db.utils.ProgrammingError: column "aggregation_that.created_at" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: ...BY COALESCE(MAX("aggregation_this"."created_at"), "aggregati...
comment:14 by , 6 years ago
Jon I think you should create another ticket referring this one instead of reopening this one.
Looks like something is off with contains_aggregate handling in a24686987f21fe6b5339cc13607c13fa906f81a8.
Can you also reproduce the crash by chaining annotate calls? Could you provide the query before and after a24686987f21fe6b5339cc13607c13fa906f81a8.
comment:15 by , 6 years ago
| Resolution: | → fixed |
|---|---|
| Status: | new → closed |
I've create a new ticket #31217 for this regression.
Thanks for this report. As a temporary workaround you can use
Exists()directly inQuerySet.filter()(without an annotation), i.e.Publisher.objects.values('pk').filter( Exists(latest_book_pubdate_qs) ).annotate( book_count=Count('book') ).values_list('id')Regression in fb3f034f1c63160c0ff13c609acd01c18be12f80.
Reproduced at 495d7a1ddf3c8c83d49f099ee3599afbc28306fe.