#27577 closed Bug (invalid)
Django annotated query returns wrong results
Reported by: | Mahmood Khan | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 1.10 |
Severity: | Normal | Keywords: | annotated query |
Cc: | Triage Stage: | Unreviewed | |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
I have three related models listed below and I would like to run an aggregate query to give me the total number of tasks completed by person but only those tasks that are in a specific folder.
class Person(models.Model): firstName = models.CharField(max_length=100, null=True, blank=True) lastName = models.CharField(max_length=100, null=True, blank=True) class Folder(models.Model): title = models.CharField(max_length=254, null=True, blank=True) assignees = models.ManyToManyField(Person, related_name="projects") completedDate = models.DateTimeField(blank=True, null=True) class Task(models.Model): title = models.CharField(max_length=254, null=True, blank=True) assignees = models.ManyToManyField(Person, related_name="tasks") folders = models.ManyToManyField(Folder, related_name="tasks")
And here is my attempted query:
tasks = Contact.objects.filter(tasks__folders__id='I58343DS89ASDF').distinct().filter(tasks__completedDate__gte='2016-10-01 00:00:00').annotate(total=Count('tasks'), name=F('firstName')).values('total', 'name')
This is giving me an incorrect count. So I inspected the SQL statement that is generated by this Django query. The SQL statement is:
SELECT DISTINCT COUNT(`wrike_task_assignees`.`task_id`) AS `total`, `wrike_contact`.`firstName` AS `name` FROM `wrike_contact` INNER JOIN `wrike_task_assignees` ON (`wrike_contact`.`id` = `wrike_task_assignees`.`contact_id`) INNER JOIN `wrike_task` ON (`wrike_task_assignees`.`task_id` = `wrike_task`.`id`) INNER JOIN `wrike_task_folders` ON (`wrike_task`.`id` = `wrike_task_folders`.`task_id`) INNER JOIN `wrike_task_assignees` T6 ON (`wrike_contact`.`id` = T6.`contact_id`) INNER JOIN `wrike_task` T7 ON (T6.`task_id` = T7.`id`) WHERE (`wrike_task_folders`.`folder_id` = I58343DS89ASDF AND T7.`completedDate` >= 2016-10-01 00:00:00) GROUP BY `wrike_contact`.`id` ORDER BY NULL
And a nicely formatted version is:
SELECT DISTINCT COUNT(ta.task_id) AS `total`, c.firstName AS `name` FROM wrike_contact c INNER JOIN wrike_task_assignees ta ON (c.id = ta.contact_id) INNER JOIN wrike_task t ON (ta.task_id = t.id) INNER JOIN wrike_task_folders tf ON (t.id = tf.task_id) INNER JOIN wrike_task_assignees T6 ON (c.id = T6.contact_id) INNER JOIN wrike_task T7 ON (ta.task_id = T7.id) WHERE (tf.folder_id = 'I58343DS89ASDF' AND t.completedDate >= '2016-10-01 00:00:00') GROUP BY c.id ORDER BY NULL
Notice that a couple INNER JOIN are repeated twice -- not sure why.
If I remove those duplicate INNER JOIN and then run the query, the results are correct. Here is the version of SQL statement in which I removed the duplicate INNER JOIN segments:
SELECT DISTINCT c.firstName AS name, COUNT(ta.task_id) AS `total` FROM wrike_contact c INNER JOIN wrike_task_assignees ta ON (c.id = ta.contact_id) INNER JOIN wrike_task t ON (ta.task_id = t.id) INNER JOIN wrike_task_folders tf ON (t.id = tf.task_id) WHERE (tf.folder_id = 'I58343DS89ASDF' AND t.completedDate >= '2016-10-01 00:00:00') GROUP BY c.id ORDER BY name;
I have searched the forum and couldn't find that this issue is already reported. So I reporting it here.
Change History (1)
comment:1 by , 8 years ago
Resolution: | → invalid |
---|---|
Status: | new → closed |
Hi Mahmoud,
Thanks for filing this report with all the details. The behavior you're seeing may not be completely intuitive, but it is done this way on purpose and it is documented: https://docs.djangoproject.com/en/1.10/topics/db/queries/#spanning-multi-valued-relationships
Hope this helps,
Shai.