Opened 4 years ago

Closed 4 years ago

Last modified 4 years ago

#32006 closed Bug (wontfix)

Aggregation generates wrong SQL query on PostgreSQL partionned tables.

Reported by: Konstantin Popov Owned by: nobody
Component: Database layer (models, ORM) Version: 3.1
Severity: Normal Keywords: QuerySet values annotate GROUP BY
Cc: Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

I use values and annotate on QuerySet to make query for my report.
When I add 'id' or 'pk' field to another fields in values it becomes the only field in GROUP BY clause.
This, obviously, generates an invalid SQL query.

str(Currency.objects.filter(status=3).values('sifr','guid').annotate(cnt=Count('*')).query)

'SELECT "rk7data_currency"."sifr", "rk7data_currency"."guid", COUNT(*) AS "cnt" FROM "rk7data_currency" WHERE "rk7data_currency"."status" = 3 GROUP BY "rk7data_currency"."sifr", "rk7data_currency"."guid"'

str(Currency.objects.filter(status=3).values('sifr','guid','id').annotate(cnt=Count('*')).query)

'SELECT "rk7data_currency"."sifr", "rk7data_currency"."guid", "rk7data_currency"."id", COUNT(*) AS "cnt" FROM "rk7data_currency" WHERE "rk7data_currency"."status" = 3 GROUP BY "rk7data_currency"."id"'

I have checked this on several models with the same result.
I have discovered this on django 2.2.10. Upgrade to 2.2.16 and even to 3.1 does not solve the problem.

Change History (5)

comment:1 by Mariusz Felisiak, 4 years ago

Resolution: invalid
Status: newclosed
Summary: QuerySet with values with annotate generates wrong SQL queryQuerySet with values with annotate generates wrong SQL query on PostgreSQL.

PostgreSQL allows grouping by selected PKs, so optimized query works properly, see #19259.

comment:2 by Konstantin Popov, 4 years ago

But this does not work properly! I receive error from postgresql. It says other fields should be in GROUP BY.
This example query is simplified, the real query includes several joined tables.

SELECT "rk7data_globalshift"."shiftdate" AS "date", 
 "rk7data_restaurant"."name" AS "rest", "rk7data_printcheck"."check_number" 
 AS "chck", "rk7data_printcheck"."moment" AS "check_mom", 
 "rk7data_printcheck"."id" AS "check_id", "rk7data_printcheck"."period" AS 
 "check_period", COUNT("rk7data_printcheck"."check_number") AS "qnt", 
 SUM("rk7data_printcheck"."binded_sum") AS "pay" FROM "rk7data_printcheck" 
 INNER JOIN "rk7data_order" ON ("rk7data_printcheck"."order_id" = 
 "rk7data_order"."id" AND "rk7data_printcheck"."period" = 
 "rk7data_order"."period") INNER JOIN "rk7data_globalshift" ON 
 ("rk7data_order"."shift_id" = "rk7data_globalshift"."id") INNER JOIN 
 "rk7data_midserver" ON ("rk7data_globalshift"."midserver_id" = 
 "rk7data_midserver"."id") INNER JOIN "rk7data_restaurant" ON 
 ("rk7data_midserver"."restaurant_id" = "rk7data_restaurant"."id") WHERE 
 ("rk7data_printcheck"."deleted" = %s AND "rk7data_printcheck"."isbill" = %s 
 AND "rk7data_order"."shift_id" IN (SELECT U0."id" FROM "rk7data_globalshift" 
 U0 WHERE U0."shiftdate" BETWEEN %s AND %s) AND "rk7data_printcheck"."period" 
 IN (%s, %s, %s)) GROUP BY "rk7data_globalshift"."shiftdate", 
 "rk7data_restaurant"."name", "rk7data_printcheck"."id" ORDER BY "rest" ASC, 
 "date" ASC, "chck" ASC

And the error message is

ERROR: column "rk7data_printcheck.check_number" must appear in GROUP BY clause or used in aggregate function 
LINE 1: ... AS "date", "rk7data_restaurant". "Name" AS "rest", "rk7data_p ...

I have found workaround for this, I have wrapped 'id' field with Coalesce function, but I don't think it is good solution.

Will you reopen this issue now?

comment:3 by Simon Charette, 4 years ago

Please provide a reduced set of model that results in the must appear in GROUP BY clause or used in aggregate function exception in order to help diagnose your issue.

Is rk7data_printcheck a view or table? If it's a view you'll likely need to mark it as Meta.managed = False.

Something seems also odd with this query, why is "rk7data_printcheck"."check_number" both aliased an counted against via COUNT("rk7data_printcheck"."check_number").

comment:4 by Konstantin Popov, 4 years ago

I just now realized what is causing the problem.

I use declarative partitioning in some tables, so the primary key must include the partitioning key and has two fields, 'id' and 'period'.

To use such tables with Django, I developed my own versions of the CreateModel and AddField migration operations, as well as my own version of the ForeignKey field.

However, the Django Model class code still considers its primary key to be a single 'id' field.
I have already discovered a number of side effects of this solution. Now there is another one.

It may be best to leave this ticket closed. Please accept my apologies for the needless concern.

I am ready to provide an abbreviated version of my code here if anyone is interested.

comment:5 by Simon Charette, 4 years ago

Resolution: invalidwontfix
Summary: QuerySet with values with annotate generates wrong SQL query on PostgreSQL.Aggregation generates wrong SQL query on PostgreSQL partionned tables.

I have already discovered a number of side effects of this solution. Now there is another one. It may be best to leave this ticket closed. Please accept my apologies for the needless concern. I am ready to provide an abbreviated version of my code here if anyone is interested

Thanks for the following report. The reason why I was asking what kind of table rk7data_printcheck was is that the group by selected table primary key optimization is known to break for views, foreign tables, and any other table-like objects that PostgreSQL cannot introspect enough to determine this optimization is safe.

The optimization does make a huge difference when aggregating over tables containing columns costly to group so we decided to keep it around even with the aforementioned caveats. We did add a per-model hook to disable though if that can be of any help to you #28107.

Last edited 4 years ago by Simon Charette (previous) (diff)
Note: See TracTickets for help on using tickets.
Back to Top