﻿id	summary	reporter	owner	description	type	status	component	version	severity	resolution	keywords	cc	stage	has_patch	needs_docs	needs_tests	needs_better_patch	easy	ui_ux
29416	Undesired subquery added to the GROUP BY clause	Antoine Pinsard	Mariusz Felisiak	"I am facing an issue while upgrading from Django 1.11 to Django 2.0.

I have a complex query interacting with a legacy MySQL database, which I simplified below to highlight the issue:


{{{
>>> from user.models import Sponsor
>>> from django.db.models import ExpressionWrapper, Count, DecimalField
>>> from django.db.models.expressions import RawSQL
>>> nb_reports = RawSQL(""SELECT COUNT(*) FROM pro_moderation WHERE objType='sponsor' AND objId=ala_sponsor.sponId AND state=2"", [])
>>> str(Sponsor.objects.all().annotate(report_rate=ExpressionWrapper(nb_reports / Count('deliveries'), output_field=DecimalField())).order_by('-report_rate').query)
}}}

This code, in **Django 1.11.9,** gives me the following query:

{{{
SELECT `ala_sponsor`.`sponId`, [...], ((SELECT COUNT(*) FROM pro_moderation WHERE objType='sponsor' AND objId=ala_sponsor.sponId AND state=2) / COUNT(`ala_sponsor_need`.`asnId`)) AS `report_rate`
FROM `ala_sponsor` LEFT OUTER JOIN `ala_sponsor_need` ON (`ala_sponsor`.`sponId` = `ala_sponsor_need`.`asnSponId`)
GROUP BY `ala_sponsor`.`sponId` ORDER BY `report_rate` DESC
}}}

This is the expected behavior and it works well.

However, in **Django 2.0.5**, the same code gives me this query:

{{{
SELECT `ala_sponsor`.`sponId`, [...], ((SELECT COUNT(*) FROM pro_moderation WHERE objType='sponsor' AND objId=ala_sponsor.sponId AND state=2) / COUNT(`ala_sponsor_need`.`asnId`)) AS `report_rate`
FROM `ala_sponsor` LEFT OUTER JOIN `ala_sponsor_need` ON (`ala_sponsor`.`sponId` = `ala_sponsor_need`.`asnSponId`)
GROUP BY `ala_sponsor`.`sponId`, (SELECT COUNT(*) FROM pro_moderation WHERE objType='sponsor' AND objId=ala_sponsor.sponId AND state=2)
ORDER BY `report_rate` DESC
}}}

As you can see, the ORM appended the subquery `(SELECT COUNT(*) FROM pro_moderation WHERE objType='sponsor' AND objId=ala_sponsor.sponId AND state=2)` to the GROUP BY clause. Which is wrong, and takes forever to execute.

-----
-----

I tried to play with `.values('id')` or such as I usually do when I get unexpected GROUP BY. I spent an afternoon on it but there's no way I could get rid of this undesired group by clause. The `order_by` is not to blame either. Here is another example of what I tried:

{{{
str(Sponsor.objects.all().values('id').annotate(report_rate=ExpressionWrapper(nb_reports / Count('deliveries'), output_field=DecimalField())).order_by().query)
}}}

Which gives:

{{{
SELECT `ala_sponsor`.`sponId`, ((SELECT COUNT(*) FROM pro_moderation WHERE objType='sponsor' AND objId=ala_sponsor.sponId AND state=2) / COUNT(`ala_sponsor_need`.`asnId`)) AS `report_rate`
FROM `ala_sponsor` LEFT OUTER JOIN `ala_sponsor_need` ON (`ala_sponsor`.`sponId` = `ala_sponsor_need`.`asnSponId`)
GROUP BY `ala_sponsor`.`sponId`, (SELECT COUNT(*) FROM pro_moderation WHERE objType='sponsor' AND objId=ala_sponsor.sponId AND state=2)
ORDER BY NULL
}}}

Also note that this is the `annotate(report_rate=ExpressionWrapper(nb_reports / Count('deliveries'), output_field=DecimalField()))` that causes this issue. If I only do `annotate(nb_reports=nb_reports)` or `annotate(nb_deliveries=COUNT('deliveries'))` there is no additional GROUP BY clause generated.

{{{
In [40]: str(Sponsor.objects.all().values('id').annotate(nb_reports=nb_reports).order_by().query)
Out[40]: ""SELECT `ala_sponsor`.`sponId`, (SELECT COUNT(*) FROM pro_moderation WHERE objType='sponsor' AND objId=ala_sponsor.sponId AND state=2) AS `nb_reports` FROM `ala_sponsor`""

In [41]: str(Sponsor.objects.all().values('id').annotate(nb_deliveries=Count('deliveries')).order_by().query)
Out[41]: 'SELECT `ala_sponsor`.`sponId`, COUNT(`ala_sponsor_need`.`asnId`) AS `nb_deliveries` FROM `ala_sponsor` LEFT OUTER JOIN `ala_sponsor_need` ON (`ala_sponsor`.`sponId` = `ala_sponsor_need`.`asnSponId`) GROUP BY `ala_sponsor`.`sponId` ORDER BY NULL'

In [42]: str(Sponsor.objects.all().values('id').annotate(nb_reports=nb_reports, nb_deliveries=Count('deliveries')).order_by().query)
Out[42]: ""SELECT `ala_sponsor`.`sponId`, COUNT(`ala_sponsor_need`.`asnId`) AS `nb_deliveries`, (SELECT COUNT(*) FROM pro_moderation WHERE objType='sponsor' AND objId=ala_sponsor.sponId AND state=2) AS `nb_reports` FROM `ala_sponsor` LEFT OUTER JOIN `ala_sponsor_need` ON (`ala_sponsor`.`sponId` = `ala_sponsor_need`.`asnSponId`) GROUP BY `ala_sponsor`.`sponId`, (SELECT COUNT(*) FROM pro_moderation WHERE objType='sponsor' AND objId=ala_sponsor.sponId AND state=2) ORDER BY NULL""
}}}"	Bug	closed	Database layer (models, ORM)	2.0	Release blocker	fixed	groupby, subquery	Mariusz Felisiak	Accepted	1	0	0	0	0	0
