Opened 10 years ago
Closed 10 years ago
#24233 closed Bug (wontfix)
group-by column construction is broken
Reported by: | megatrump | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 1.8alpha1 |
Severity: | Normal | Keywords: | |
Cc: | Triage Stage: | Accepted | |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
I recently moved to 1.8alpha version from an older version django-e9d1f11 and I started seeing this error. Looks like the way group by is constructed has changed. I am using postgres database.
I could reproduce the behavior with a simple model.
Here is the model that I have.
class everydaycost(models.Model): cost = models.DecimalField(max_digits=20, decimal_places=2) rev = models.DecimalField(max_digits=20, decimal_places=2) ts = models.DateField(db_index=True)
I am trying to get roi (rev/cost) for each day of week.
$ python manage.py shell Python 2.7.5 (default, Mar 9 2014, 22:15:05) [GCC 4.2.1 Compatible Apple LLVM 5.0 (clang-500.0.68)] on darwin Type "help", "copyright", "credits" or "license" for more information. (InteractiveConsole) >>> >>> from testgroupby.models import * >>> from django.db.models import Q, Count, Min, Sum, Func, FloatField >>> qs = everydaycost.objects.all() >>> qs = qs.extra(select={'dow': "TO_CHAR(ts, 'D')"}) >>> qs = qs.values('dow') >>> qs = qs.annotate(sum_cost=Sum('cost')) >>> qs = qs.annotate(sum_rev=Sum('rev')) >>> qs = qs.annotate(roi=Func(A='CAST(Sum("rev") as numeric)', B='CAST(Sum("cost") as numeric)', ... template='ROUND(COALESCE(%(A)s / NULLIF(%(B)s,0), 0), 2)', output_field=FloatField())) >>> print qs DEBUG (0.002) SELECT (TO_CHAR(ts, 'D')) AS "dow", SUM("testgroupby_everydaycost"."cost") AS "sum_cost", SUM("testgroupby_everydaycost"."rev") AS "sum_rev", ROUND(COALESCE(CAST(Sum("rev") as numeric) / NULLIF(CAST(Sum("cost") as numeric),0), 0), 2) AS "roi" FROM "testgroupby_everydaycost" GROUP BY (TO_CHAR(ts, 'D')), ROUND(COALESCE(CAST(Sum("rev") as numeric) / NULLIF(CAST(Sum("cost") as numeric),0), 0), 2) LIMIT 21; args=() Traceback (most recent call last): File "<console>", line 1, in <module> File "/Downloads/Django-1.8a1/django/db/models/query.py", line 139, in __repr__ data = list(self[:REPR_OUTPUT_SIZE + 1]) File "/Downloads/Django-1.8a1/django/db/models/query.py", line 163, in __iter__ self._fetch_all() File "/Downloads/Django-1.8a1/django/db/models/query.py", line 955, in _fetch_all self._result_cache = list(self.iterator()) File "/Downloads/Django-1.8a1/django/db/models/query.py", line 1075, in iterator for row in self.query.get_compiler(self.db).results_iter(): File "/Downloads/Django-1.8a1/django/db/models/sql/compiler.py", line 780, in results_iter results = self.execute_sql(MULTI) File "/Downloads/Django-1.8a1/django/db/models/sql/compiler.py", line 826, in execute_sql cursor.execute(sql, params) File "/Downloads/Django-1.8a1/django/db/backends/utils.py", line 80, in execute return super(CursorDebugWrapper, self).execute(sql, params) File "/Downloads/Django-1.8a1/django/db/backends/utils.py", line 65, in execute return self.cursor.execute(sql, params) File "/Downloads/Django-1.8a1/django/db/utils.py", line 95, in __exit__ six.reraise(dj_exc_type, dj_exc_value, traceback) File "/Downloads/Django-1.8a1/django/db/backends/utils.py", line 65, in execute return self.cursor.execute(sql, params) ProgrammingError: aggregate functions are not allowed in GROUP BY LINE 1: ... GROUP BY (TO_CHAR(ts, 'D')), ROUND(COALESCE(CAST(Sum("rev")... ^ >>> print qs.query SELECT (TO_CHAR(ts, 'D')) AS "dow", SUM("testgroupby_everydaycost"."cost") AS "sum_cost", SUM("testgroupby_everydaycost"."rev") AS "sum_rev", ROUND(COALESCE(CAST(Sum("rev") as numeric) / NULLIF(CAST(Sum("cost") as numeric),0), 0), 2) AS "roi" FROM "testgroupby_everydaycost" GROUP BY (TO_CHAR(ts, 'D')), ROUND(COALESCE(CAST(Sum("rev") as numeric) / NULLIF(CAST(Sum("cost") as numeric),0), 0), 2) >>>
With earlier version, this worked fine for me.
$ python manage.py shell Python 2.7.5 (default, Mar 9 2014, 22:15:05) [GCC 4.2.1 Compatible Apple LLVM 5.0 (clang-500.0.68)] on darwin Type "help", "copyright", "credits" or "license" for more information. (InteractiveConsole) >>> >>> from testgroupby.models import * >>> from django.db.models import Q, Count, Min, Sum, Func, FloatField >>> qs = everydaycost.objects.all() >>> qs = qs.extra(select={'dow': "TO_CHAR(ts, 'D')"}) >>> qs = qs.values('dow') >>> qs = qs.annotate(sum_cost=Sum(' cost')) >>> qs = qs.annotate(sum_rev=Sum('rev')) >>> qs = qs.annotate(roi=Func(A='CAST(Sum("rev") as numeric)', B='CAST(Sum("cost") as numeric)', ... template='ROUND(COALESCE(%(A)s / NULLIF(%(B)s,0), 0), 2)', output_field=FloatField())) >>> print qs DEBUG (0.002) SELECT (TO_CHAR(ts, 'D')) AS "dow", SUM("testgroupby_everydaycost"."cost") AS "sum_cost", SUM("testgroupby_everydaycost"."rev") AS "sum_rev", ROUND(COALESCE(CAST(Sum("rev") as numeric) / NULLIF(CAST(Sum("cost") as numeric),0), 0), 2) AS "roi" FROM "testgroupby_everydaycost" GROUP BY (TO_CHAR(ts, 'D')) LIMIT 21; args=() [] >>> print qs.query SELECT (TO_CHAR(ts, 'D')) AS "dow", SUM("testgroupby_everydaycost"."cost") AS "sum_cost", SUM("testgroupby_everydaycost"."rev") AS "sum_rev", ROUND(COALESCE(CAST(Sum("rev") as numeric) / NULLIF(CAST(Sum("cost") as numeric),0), 0), 2) AS "roi" FROM "testgroupby_everydaycost" GROUP BY (TO_CHAR(ts, 'D')) >>>
Change History (1)
comment:1 by , 10 years ago
Resolution: | → wontfix |
---|---|
Status: | new → closed |
Triage Stage: | Unreviewed → Accepted |
Note:
See TracTickets
for help on using tickets.
Yes, I believe this change was introduced with this PR https://github.com/django/django/pull/3669, but it may have been another. Grouping was pretty broken when using complicated expressions, because the backing columns were being added to the GROUP BY rather than the entire expression. It was lucky you were getting the right results.
Now, the reason this is failing for you now, is because you're effectively writing RawSQL (there's a private expression type for this), without any of the benefits of new expressions. You may as well be writing
.extra()
calls. To fully leverage the power of expressions, you should use the smaller types to build up bigger ones. Expressions know that if they reference an aggregate, then they shouldn't be added to the group by clause. But you've just stuck a raw SUM expression into the roi calculation, and Django can't know that it's now referencing an aggregate.I've replicated (mostly..) your query only using full expressions. I cheated with a couple (using Func(function='..')) rather than building out the types fully, but they'll serve for now. I've broken each part down into logical chunks, and have printed the underlying query at each stage to make this easier to follow.
First, annotate with the day of week. Annotate can do everything extra can do with regards to adding columns to the select list:
Next, add the aggregates we're interested in:
Now, the ROI calculation. Note, this could be improved by using Case/When expressions
And a quick Case/When example, to show how you could avoid doing division by NULL or 0:
I've probably put a little too much detail above, but hopefully it'll help you solve your problem in a supported way. I'm going to close this as wontfix, because the types aren't being used in the right way. If the documentation for the expressions or function types needs to be improved, please reopen with some comments on where the clarification is required, and I'll work on updating them to make it clearer.