Opened 10 years ago

Closed 9 years ago

#24044 closed Cleanup/optimization (needsinfo)

filtering by annotation executes sql function twice

Reported by: Tim-Erwin Owned by: nobody
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords: annotate filter
Cc: josh.smeaton@… Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

I have two models A and B connected by a ManyToOne relation (A has many Bs). When I filter A by the number of related Bs

A.objects.annotate(myCount=Count('bs')).filter(myCount__gt=2)

Django creates a (my)SQL query like so:

SELECT `a`.`id`, `a`.`name`, COUNT(`b`.`id`) AS `myCount` FROM `a` LEFT OUTER JOIN `b` ON ( `a`.`id` = `b`.`a_id` ) GROUP BY `a`.`id` HAVING COUNT(`b`.`id`) > 2 ORDER BY NULL LIMIT 21;

In principle this works, but COUNT is executed twice instead of using the alias 'myCount'. This might not matter in this particular example but I have a custom SQL function that I want to use the same way. It is much more costly and thus the query execution time is basically doubled.

I cannot see any scenario where executing the function twice is of any benefit. I'd suggest to fix this so the alias is used in the HAVING clause.

Change History (4)

comment:1 by Tim Graham, 10 years ago

Triage Stage: UnreviewedAccepted
Type: BugCleanup/optimization
Version: 1.7master

Not sure how difficult this might be to implement or if there are problems with doing so that I'm not seeing. If not, we could document the limitation.

comment:2 by Josh Smeaton, 10 years ago

Referring to aliases in the having clause is only supported by MySQL, and only if the ONLY_FULL_GROUP_BY option is not enabled (so it's supported by default). It is not standard SQL.

To support the SQL syntax of referring to aliases, Django would have to detect (or document) whether or not ONLY_FULL_GROUP_BY is active, and then only generate this kind of SQL for MySQL backend with ONLY_FULL_GROUP_BY disabled.

Are you sure that there is a performance difference between using the alias or using the function again? I would be extremely surprised if the query with the alias wasn't rewritten by the query optimiser to the standard structure anyway. I'm finding it very difficult to determine whether or not there is a performance difference without doing some tests - google-fu isn't turning anything up.

If we can determine a performance difference, then this ticket will be worth following up. If there is no performance difference though, we should close as wontfix, because the style of SQL used shouldn't matter in this case.

Tim-Erwin, are you able to share any concrete performance figures or execution plans?

comment:3 by Josh Smeaton, 10 years ago

Cc: josh.smeaton@… added

comment:4 by Tim Graham, 9 years ago

Resolution: needsinfo
Status: newclosed

Closing as needsinfo in absence of follow up.

Note: See TracTickets for help on using tickets.
Back to Top