#25136 closed Cleanup/optimization (fixed)
Add an example with Count('X', distinct=True) to the queries topic guide
Reported by: | Michał Domański | Owned by: | Caio Ariede |
---|---|---|---|
Component: | Documentation | Version: | 1.8 |
Severity: | Normal | Keywords: | |
Cc: | Triage Stage: | Accepted | |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | yes | UI/UX: | no |
Description
The problem is the difference between the results of count() on entries relation and num_entries if I also want an annotation for alerts relation
reports = Report.objects.filter( user=request.user).annotate(num_entries=Count('entries'), num_alerts=Count('alerts')) print report.num_entries, report.entries.count()
Yet, this problem did not happen if there was only 1 argument to annotate.
What's happening? Well, both entries and alerts are FK relations, thus they result in this two left outer joins. The SQL is something like this:
SELECT `coverage_reports`.* COUNT(`alerts_alert`.`id`) AS `num_alerts`, COUNT(`coverage_reports_entry`.`id`) AS `num_entries` FROM `coverage_reports_coveragereport` LEFT OUTER JOIN `alerts_alert` ON ( `coverage_reports_coveragereport`.`id` = `alerts_alert`.`coverage_report_id` ) LEFT OUTER JOIN `coverage_reports_entry` ON (`coverage_reports_coveragereport`.`id` = `coverage_reports_entry`.`coverage_report_id` ) WHERE `coverage_reports_coveragereport`.`user_id` = 1 GROUP BY `coverage_reports_coveragereport`.`id` ;
This is problematic as joins will duplicate parent records if more than one child record is associated to it. This is what can inflate values from aggregate functions.
The fix to this particular query was such:
SELECT reports.id, alerts.num_alerts, entries.num_entries FROM coverage_reports_coveragereport AS reports LEFT JOIN (SELECT coverage_report_id, COUNT(*) AS num_alerts FROM alerts_alert GROUP BY coverage_report_id) AS alerts ON reports.id = alerts.coverage_report_id LEFT JOIN (SELECT coverage_report_id, COUNT(*) AS num_entries FROM coverage_reports_entry GROUP BY coverage_report_id) AS entries ON reports.id = entries.coverage_report_id WHERE reports.user_id = 1 GROUP BY reports.id;
Change History (10)
comment:1 by , 9 years ago
comment:3 by , 9 years ago
Component: | Database layer (models, ORM) → Documentation |
---|---|
Easy pickings: | set |
Summary: | Multiple annotations result in wrong results with MySQL backend → Add an example with Count('X', distinct=True) to the queries topic guide |
Triage Stage: | Unreviewed → Accepted |
Type: | Bug → Cleanup/optimization |
comment:4 by , 9 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
Does adding distinct=True to the the
Count
s resolve the issue? If so, we could probably add an example totopics/db/aggregation
about this.