#27577 closed Bug (invalid)
Django annotated query returns wrong results
| Reported by: | Mahmood Khan | Owned by: | nobody |
|---|---|---|---|
| Component: | Database layer (models, ORM) | Version: | 1.10 |
| Severity: | Normal | Keywords: | annotated query |
| Cc: | Triage Stage: | Unreviewed | |
| Has patch: | no | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description
I have three related models listed below and I would like to run an aggregate query to give me the total number of tasks completed by person but only those tasks that are in a specific folder.
class Person(models.Model):
firstName = models.CharField(max_length=100, null=True, blank=True)
lastName = models.CharField(max_length=100, null=True, blank=True)
class Folder(models.Model):
title = models.CharField(max_length=254, null=True, blank=True)
assignees = models.ManyToManyField(Person, related_name="projects")
completedDate = models.DateTimeField(blank=True, null=True)
class Task(models.Model):
title = models.CharField(max_length=254, null=True, blank=True)
assignees = models.ManyToManyField(Person, related_name="tasks")
folders = models.ManyToManyField(Folder, related_name="tasks")
And here is my attempted query:
tasks = Contact.objects.filter(tasks__folders__id='I58343DS89ASDF').distinct().filter(tasks__completedDate__gte='2016-10-01 00:00:00').annotate(total=Count('tasks'), name=F('firstName')).values('total', 'name')
This is giving me an incorrect count. So I inspected the SQL statement that is generated by this Django query. The SQL statement is:
SELECT DISTINCT COUNT(`wrike_task_assignees`.`task_id`) AS `total`, `wrike_contact`.`firstName` AS `name` FROM `wrike_contact` INNER JOIN `wrike_task_assignees` ON (`wrike_contact`.`id` = `wrike_task_assignees`.`contact_id`) INNER JOIN `wrike_task` ON (`wrike_task_assignees`.`task_id` = `wrike_task`.`id`) INNER JOIN `wrike_task_folders` ON (`wrike_task`.`id` = `wrike_task_folders`.`task_id`) INNER JOIN `wrike_task_assignees` T6 ON (`wrike_contact`.`id` = T6.`contact_id`) INNER JOIN `wrike_task` T7 ON (T6.`task_id` = T7.`id`) WHERE (`wrike_task_folders`.`folder_id` = I58343DS89ASDF AND T7.`completedDate` >= 2016-10-01 00:00:00) GROUP BY `wrike_contact`.`id` ORDER BY NULL
And a nicely formatted version is:
SELECT DISTINCT COUNT(ta.task_id) AS `total`, c.firstName AS `name`
FROM wrike_contact c INNER JOIN wrike_task_assignees ta ON (c.id = ta.contact_id)
INNER JOIN wrike_task t ON (ta.task_id = t.id)
INNER JOIN wrike_task_folders tf ON (t.id = tf.task_id)
INNER JOIN wrike_task_assignees T6 ON (c.id = T6.contact_id)
INNER JOIN wrike_task T7 ON (ta.task_id = T7.id)
WHERE (tf.folder_id = 'I58343DS89ASDF'
AND t.completedDate >= '2016-10-01 00:00:00')
GROUP BY c.id
ORDER BY NULL
Notice that a couple INNER JOIN are repeated twice -- not sure why.
If I remove those duplicate INNER JOIN and then run the query, the results are correct. Here is the version of SQL statement in which I removed the duplicate INNER JOIN segments:
SELECT DISTINCT c.firstName AS name, COUNT(ta.task_id) AS `total`
FROM wrike_contact c
INNER JOIN wrike_task_assignees ta ON (c.id = ta.contact_id)
INNER JOIN wrike_task t ON (ta.task_id = t.id)
INNER JOIN wrike_task_folders tf ON (t.id = tf.task_id)
WHERE (tf.folder_id = 'I58343DS89ASDF'
AND t.completedDate >= '2016-10-01 00:00:00')
GROUP BY c.id
ORDER BY name;
I have searched the forum and couldn't find that this issue is already reported. So I reporting it here.
Change History (1)
comment:1 by , 9 years ago
| Resolution: | → invalid |
|---|---|
| Status: | new → closed |
Hi Mahmoud,
Thanks for filing this report with all the details. The behavior you're seeing may not be completely intuitive, but it is done this way on purpose and it is documented: https://docs.djangoproject.com/en/1.10/topics/db/queries/#spanning-multi-valued-relationships
Hope this helps,
Shai.