﻿id	summary	reporter	owner	description	type	status	component	version	severity	resolution	keywords	cc	stage	has_patch	needs_docs	needs_tests	needs_better_patch	easy	ui_ux
23557	Prevent silent extension of explicit GROUP BY when using order_by	Brian May	ontowhee	"Hello,

With the following db model:

{{{
@python_2_unicode_compatible
class CPUJob(models.Model):
    account = models.ForeignKey(Account, blank=True, null=True)
    username = models.CharField(max_length=50, blank=True, null=True)
    project = models.ForeignKey(Project, null=True, blank=True)
    machine = models.ForeignKey(Machine, blank=True, null=True)
    date = models.DateField(db_index=True, blank=True, null=True)
    queue = models.ForeignKey(Queue, blank=True, null=True)
    cpu_usage = models.BigIntegerField(blank=True, null=True)
    mem = models.BigIntegerField(blank=True, null=True)
    vmem = models.BigIntegerField(blank=True, null=True)
    ctime = models.DateTimeField(blank=True, null=True)
    qtime = models.DateTimeField(blank=True, null=True)
    etime = models.DateTimeField(blank=True, null=True)
    start = models.DateTimeField(blank=True, null=True)
    act_wall_time = models.BigIntegerField(blank=True, null=True)
    est_wall_time = models.BigIntegerField(blank=True, null=True)
    jobid = models.CharField(max_length=50, blank=True, null=True, unique=True)
    cores = models.BigIntegerField(blank=True, null=True)
    list_mem = models.BigIntegerField(blank=True, null=True)
    list_pmem = models.BigIntegerField(blank=True, null=True)
    list_vmem = models.BigIntegerField(blank=True, null=True)
    list_pvmem = models.BigIntegerField(blank=True, null=True)
    exit_status = models.BigIntegerField(blank=True, null=True)
    jobname = models.CharField(max_length=256, blank=True, null=True)
    software = models.ManyToManyField(SoftwareVersion, blank=True, null=True)

    class Meta:                                                                  
        ordering = ['-date']                                                     
        db_table = 'cpu_job'                                                     
}}}

The following on mysql produces a good result:

{{{
q = CPUJob.objects.values('project').annotate(usage=Sum('cpu_usage'), jobs=Count('id'))
print q.query
}}}

of

{{{
SELECT `cpu_job`.`project_id`, SUM(`cpu_job`.`cpu_usage`) AS `usage`, COUNT(`cpu_job`.`id`) AS `jobs` FROM `cpu_job` GROUP BY `cpu_job`.`project_id` ORDER BY `cpu_job`.`date` DESC
}}}

However on Postgresql, with the same data, I get the following query:

{{{
SELECT ""cpu_job"".""project_id"", SUM(""cpu_job"".""cpu_usage"") AS ""usage"", COUNT(""cpu_job"".""id"") AS ""jobs"" FROM ""cpu_job"" GROUP BY ""cpu_job"".""project_id"", ""cpu_job"".""date"" ORDER BY ""cpu_job"".""date"" DESC
}}}

Note additional term ""cpu_job"".""date"" in the GROUP BY. I did not ask for it, it got put there.

I suspect the problem is the sort order on the table. The SQL seems to be fine on mysql (really?????), but appears to be invalid on Postgresql (this actually makes more sense to me). So Django appears to be silently adjusting the request to make it valid on Postgresql.

{{{
karaage=> SELECT ""cpu_job"".""project_id"", SUM(""cpu_job"".""cpu_usage"") AS ""usage"", COUNT(""cpu_job"".""id"") AS ""jobs"" FROM ""cpu_job"" GROUP BY ""cpu_job"".""project_id"" ORDER BY ""cpu_job"".""date"" DESC
;
ERROR:  column ""cpu_job.date"" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: ...cpu_job"" GROUP BY ""cpu_job"".""project_id"" ORDER BY ""cpu_job""....
}}}

I would much rather Django gives an error under Postgresql (and maybe even MYSQL too) rather then silently changing the query, and giving different results to what I expected.

(for this particular query, I don't need sorting, and had not noticed it would cause problems - specifying order_by() seems to solve this problem).

Thanks"	Bug	assigned	Database layer (models, ORM)	1.7	Normal			Ryan Cheley	Accepted	1	0	0	1	0	0
