﻿id	summary	reporter	owner	description	type	status	component	version	severity	resolution	keywords	cc	stage	has_patch	needs_docs	needs_tests	needs_better_patch	easy	ui_ux
22278	Excessive GROUP BY when using the Count annotation	Mark Jones	nobody	"The following code illustrates a group by that seems computationally expensive since it is grouping by a large number of fields when it could be grouping by ID.

The generated SQL looks like this:
{{{
SELECT ""foo_employee"".""id"", ""foo_employee"".""last_name"", ""foo_employee"".""first_name"", ""foo_employee"".""another_attribute"", COUNT(""foo_shift"".""id"") AS ""overlapping_events"" 
FROM ""foo_employee"" LEFT OUTER JOIN ""foo_shift"" ON ( ""foo_employee"".""id"" = ""foo_shift"".""employee_id"" ) 
WHERE (""foo_shift"".""end_at"" <= 2014-03-16 08:28:43.040428-07:00  AND ""foo_shift"".""start_at"" >= 2014-03-16 08:28:43.040407-07:00 ) 
GROUP BY ""foo_employee"".""id"", ""foo_employee"".""last_name"", ""foo_employee"".""first_name"", ""foo_employee"".""another_attribute"" 
ORDER BY ""overlapping_events"" ASC, ""foo_employee"".""last_name"" ASC, ""foo_employee"".""first_name"" ASC
}}}

Everything after foo_employee.id in the GROUP BY isn't needed.  Now the sql statement optimizer might be smart enough to figure this out, but then again, it might not....  Seems like if the pk is in the GROUP BY list, the rest of it isn't needed.

{{{
import datetime
from django.db import models

class Employee(models.Model):
    last_name = models.CharField(max_length=64)
    first_name = models.CharField(max_length=64)
    another_attribute = models.CharField(max_length=64)

class Shift(models.Model):
    start_at = models.DateTimeField()
    end_at = models.DateTimeField()
    employee = models.ForeignKey(Employee, related_name=""shift_set"")


start_at = datetime.datetime.now()
end_at = datetime.datetime.now()
qs = Employee.objects.filter(shift_set__start_at__gte=start_at,
                             shift_set__end_at__lte=end_at
                             ).annotate(overlapping_events=models.Count('shift_set'))
print qs.order_by('overlapping_events', 'last_name', 'first_name').query
}}}

Changing the {{{ Count('shift_set') }}} to {{{ Count('id') }}} has no effect on the group by."	Uncategorized	closed	Uncategorized	1.6	Normal	invalid		Shai Berger	Unreviewed	0	0	0	0	0	0
