Opened 4 years ago

Closed 17 months ago

Last modified 4 months ago

#31331 closed Bug (fixed)

GROUP BY optimization doesn't work properly when ONLY_FULL_GROUP_BY mode is enabled on MySQL.

Reported by: Mariusz Felisiak Owned by: Simon Charette
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords: ONLY_FULL_GROUP_BY MySQL MariaDB
Cc: Simon Charette, Adam Johnson, Charlie Denton Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

GROUP BY optimization doesn't work properly when ONLY_FULL_GROUP_BY mode is enabled on MySQL/MariaDB (default on MySQL 5.7+) and can cause:

(1055, "Expression #1 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'TABLE.COLUMN' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by")

See tests.aggregation.tests.AggregateTestCase.test_aggregation_subquery_annotation_multivalued().

Change History (17)

comment:1 by Simon Charette, 4 years ago

Cc: Simon Charette added

comment:2 by Mariusz Felisiak <felisiak.mariusz@…>, 4 years ago

In a19505eb:

Refs #31331 -- Added DatabaseWrapper.sql_mode to MySQL.

comment:3 by Mariusz Felisiak <felisiak.mariusz@…>, 4 years ago

In aee0beb:

Refs #31331 -- Checked ONLY_FULL_GROUP_BY mode in AggregateTestCase.test_aggregation_subquery_annotation_multivalued().

comment:4 by Adam Johnson, 4 years ago

Cc: Adam Johnson added

comment:5 by Charlie Denton, 3 years ago

Cc: Charlie Denton added

comment:6 by Simon Charette, 3 years ago

For anyone interested in tackling this, proper support for ONLY_FULL_GROUP_BY could likely be added by adjusting the MySQL's group by collapsing feature to take the presence of this flag into account.

Last edited 3 years ago by Simon Charette (previous) (diff)

comment:7 by Jordan Bae, 21 months ago

Owner: changed from nobody to Jordan Bae
Status: newassigned

I will look into this and make fix! Thanks!

comment:8 by Simon Charette, 19 months ago

Jordan, if you're still working on this issue the solution is likely to have the allows_group_by_pk feature flag return "ONLY_FULL_GROUP_BY" not in self.connection.sql_mode instead of always True for the MySQL backend and then revert aee0bebc2faf9c6de8211b05d5f1281dc016084f.

comment:9 by Jordan Bae, 19 months ago

Thanks for giving advise. I will refer your advise.

comment:10 by Jordan Bae, 19 months ago

Hi, Mariusz Felisiak.

I don't understand fully GROUP BY optimization meaning.
Do you mean when you add column which doesn't include on group by into select list, if it is functionally dependent, it work well?

ex)

SELECT
	id,
	name,
	count(*)
FROM
	aggregation_book
GROUP BY
	id

when i check query with tests.aggregation.tests.AggregateTestCase.test_aggregation_subquery_annotation_multivalued()., author_qs create below query.

SELECT
	`aggregation_author`.`id`,
	`aggregation_author`.`name`,
	`aggregation_author`.`age`,
	`aggregation_author`.`rating`,
	(
		SELECT
			U0. `id`
		FROM
			`aggregation_author` U0
			INNER JOIN `aggregation_book_authors` U1 ON (U0. `id` = U1. `author_id`)
			INNER JOIN `aggregation_book` U2 ON (U1. `book_id` = U2. `id`)
		WHERE (U2. `name` = (`aggregation_book`.`name`)
			AND U0. `id` = (`aggregation_author`.`id`))
	) AS `subquery_id`,
 	COUNT(`aggregation_book_authors`.`book_id`) AS `count`
FROM
	`aggregation_author`
	LEFT OUTER JOIN `aggregation_book_authors` ON (`aggregation_author`.`id` = `aggregation_book_authors`.`author_id`)
	LEFT OUTER JOIN `aggregation_book` ON (`aggregation_book_authors`.`book_id` = `aggregation_book`.`id`)
GROUP BY
	`aggregation_author`.`id`
ORDER BY
	NULL

And MySQL return error with below comments.

Query 1 ERROR: Expression #5 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'djangosample.aggregation_book.name' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by

It is because of aggregation_book's column.
if we changed query like below, it works.

SELECT
	`aggregation_author`.`id`,
	`aggregation_author`.`name`,
	`aggregation_author`.`age`,
	`aggregation_author`.`rating`,
	(
		SELECT
			U0. `id`
		FROM
			`aggregation_author` U0
			INNER JOIN `aggregation_book_authors` U1 ON (U0. `id` = U1. `author_id`)
			INNER JOIN `aggregation_book` U2 ON (U1. `book_id` = U2. `id`)
		WHERE (U0. `id` = (`aggregation_author`.`id`))
	) AS `subquery_id`,
 	COUNT(`aggregation_book_authors`.`book_id`) AS `count`
FROM
	`aggregation_author`
	LEFT OUTER JOIN `aggregation_book_authors` ON (`aggregation_author`.`id` = `aggregation_book_authors`.`author_id`)
	LEFT OUTER JOIN `aggregation_book` ON (`aggregation_book_authors`.`book_id` = `aggregation_book`.`id`)
GROUP BY
	`aggregation_author`.`id`
ORDER BY
	NULL

I think if we exclude book_name filter, it will work with MySQ, too. or we can add new testcases.

And i didn't understand why Simon said allows_group_by_pk feature flag return "ONLY_FULL_GROUP_BY" not in self.connection.sql_mode instead of always True. I think we can use allows_group_by_pk feature when there is no other table's column in the select list and having and order by.

comment:11 by Jordan Bae, 19 months ago

And i checked in the postgresql it generate query like below.

SELECT
	"aggregation_author"."id",
	"aggregation_author"."name",
	"aggregation_author"."age",
	"aggregation_author"."rating",
	(
		SELECT
			U0. "id"
		FROM
			"aggregation_author" U0
			INNER JOIN "aggregation_book_authors" U1 ON (U0. "id" = U1. "author_id")
			INNER JOIN "aggregation_book" U2 ON (U1. "book_id" = U2. "id")
		WHERE (U2. "name" = ("aggregation_book"."name")
			AND U0. "id" = ("aggregation_author"."id"))) AS "subquery_id", COUNT("aggregation_book_authors"."book_id") AS "count"
FROM
	"aggregation_author"
	LEFT OUTER JOIN "aggregation_book_authors" ON ("aggregation_author"."id" = "aggregation_book_authors"."author_id")
	LEFT OUTER JOIN "aggregation_book" ON ("aggregation_book_authors"."book_id" = "aggregation_book"."id")
GROUP BY
	"aggregation_author"."id",
	(
		SELECT
			U0. "id"
		FROM
			"aggregation_author" U0
			INNER JOIN "aggregation_book_authors" U1 ON (U0. "id" = U1. "author_id")
			INNER JOIN "aggregation_book" U2 ON (U1. "book_id" = U2. "id")
		WHERE (U2. "name" = ("aggregation_book"."name")
			AND U0. "id" = ("aggregation_author"."id")))

And in mysql, The same error occurs when subquery is added to group by like postgresql.

SELECT
	`aggregation_author`.`id`,
	`aggregation_author`.`name`,
	`aggregation_author`.`age`,
	`aggregation_author`.`rating`,
	(
		SELECT
			U0. `id`
		FROM
			`aggregation_author` U0
			INNER JOIN `aggregation_book_authors` U1 ON (U0. `id` = U1. `author_id`)
			INNER JOIN `aggregation_book` U2 ON (U1. `book_id` = U2. `id`)
		WHERE (U2. `name` = (`aggregation_book`.`name`)
			AND U0. `id` = (`aggregation_author`.`id`))
	) AS `subquery_id`,
 	COUNT(`aggregation_book_authors`.`book_id`) AS `count`
FROM
	`aggregation_author`
	LEFT OUTER JOIN `aggregation_book_authors` ON (`aggregation_author`.`id` = `aggregation_book_authors`.`author_id`)
	LEFT OUTER JOIN `aggregation_book` ON (`aggregation_book_authors`.`book_id` = `aggregation_book`.`id`)
GROUP BY
	`aggregation_author`.`id`,
	(
		SELECT
			U0. `id`
		FROM
			`aggregation_author` U0
			INNER JOIN `aggregation_book_authors` U1 ON (U0. `id` = U1. `author_id`)
			INNER JOIN `aggregation_book` U2 ON (U1. `book_id` = U2. `id`)
		WHERE (U2. `name` = (`aggregation_book`.`name`)
			AND U0. `id` = (`aggregation_author`.`id`))
	)

Query 1 ERROR: Expression #5 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'djangosample.aggregation_book.name' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by

I think it's some kind issue on mysql logic for functionally dependent check.

comment:12 by Simon Charette, 17 months ago

Has patch: set
Owner: changed from Jordan Bae to Simon Charette

Jordan, I ended up working on a solution to the issue in an attempt to make the cut for the 4.2 release.

Hopefully the proposed MR helps you understand the issues you might have run into.

comment:13 by Mariusz Felisiak, 17 months ago

Triage Stage: AcceptedReady for checkin

comment:14 by Mariusz Felisiak <felisiak.mariusz@…>, 17 months ago

Resolution: fixed
Status: assignedclosed

In 041551d7:

Fixed #31331 -- Switched MySQL to group by selected primary keys.

MySQL 5.7.15 supports group by functional dependences so there is no
need to special case group by main table primary key anymore and
special case the ONLY_FULL_GROUP_BY sql mode.

comment:15 by Mariusz Felisiak <felisiak.mariusz@…>, 4 months ago

In a652f07:

Fixed #34978, Refs #31331 -- Added backward incompatibility note about raw aggregations on MySQL.

Thanks Matthew Somerville for the report.

comment:16 by Natalia <124304+nessita@…>, 4 months ago

In cdb14cc1:

[4.2.x] Fixed #34978, Refs #31331 -- Added backward incompatibility note about raw aggregations on MySQL.

Thanks Matthew Somerville for the report.

Backport of a652f0759651dd7103ed04336ef85dc410f680c1 from main

comment:17 by Natalia <124304+nessita@…>, 4 months ago

In cbd1e91:

[5.0.x] Fixed #34978, Refs #31331 -- Added backward incompatibility note about raw aggregations on MySQL.

Thanks Matthew Somerville for the report.

Backport of a652f0759651dd7103ed04336ef85dc410f680c1 from main

Note: See TracTickets for help on using tickets.
Back to Top