Opened 10 years ago
Closed 5 years ago
#26368 closed Bug (duplicate)
Order of &-ing Q objects affects results in edge case
| Reported by: | Floris den Hengst | Owned by: | nobody |
|---|---|---|---|
| Component: | Database layer (models, ORM) | Version: | 1.9 |
| Severity: | Normal | Keywords: | Query Q order |
| Cc: | Can Sarıgöl | Triage Stage: | Accepted |
| Has patch: | no | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description
Consider this models.py, where every employee has an IdealJob but some Jobs have no Employee (describing vacant Positions):
from django.db import models # Create your models here. class Employee(models.Model): age = models.IntegerField(null=True) class Position(models.Model): pass class Job(models.Model): employee = models.ForeignKey(Employee, null=True) position = models.ForeignKey(Position) class IdealJob(models.Model): employee = models.ForeignKey(Employee) position = models.ForeignKey(Position)
It is possible to filter the IdealJobs for specific Employees in the following way:
not_in_age_range = ~Q(position__job__employee__age__range=(0,100)) has_job = Q(employee__job__id__isnull=False) IdealJob.objects.filter(not_in_age_range & has_job)
or
IdealJob.objects.filter(has_job & not_in_age_range)
When the database contains a Job with no Employee as in [1], the order in which the Q objects are combined matters when one of the Q objects is negated.
Tested with Postgres and SQlite backend:
>>> # from some_app.models import *
>>> from django.db.models import *
>>> from django.db import connection
>>> not_in_age_range = ~Q(position__job__employee__age__range=(0,100))
>>>
>>> has_job = Q(employee__job__id__isnull=False)
>>>
>>> c_1 = IdealJob.objects.filter(has_job & not_in_age_range).count()
>>> c_2 = IdealJob.objects.filter(not_in_age_range & has_job).count()
>>>
>>> # these differ!!!
>>> print(c_1, c_2)
3 2
>>>
>>> print(connection.queries[-1]['sql'])
SELECT COUNT(*) AS "__count" FROM "negated_query_idealjob" INNER JOIN "negated_query_employee" ON ("negated_query_idealjob"."employee_id" = "negated_query_employee"."id") INNER JOIN "negated_query_job" ON ("negated_query_employee"."id" = "negated_query_job"."employee_id") WHERE (NOT ("negated_query_idealjob"."position_id" IN (SELECT U2."position_id" AS Col1 FROM "negated_query_job" U2 INNER JOIN "negated_query_employee" U3 ON (U2."employee_id" = U3."id") WHERE U3."age" BETWEEN 0 AND 100)) AND "negated_query_job"."id" IS NOT NULL)
>>> print(connection.queries[-2]['sql'])
SELECT COUNT(*) AS "__count" FROM "negated_query_idealjob" INNER JOIN "negated_query_employee" ON ("negated_query_idealjob"."employee_id" = "negated_query_employee"."id") INNER JOIN "negated_query_job" ON ("negated_query_employee"."id" = "negated_query_job"."employee_id") WHERE ("negated_query_job"."id" IS NOT NULL AND NOT ("negated_query_idealjob"."position_id" IN (SELECT U2."position_id" AS Col1 FROM "negated_query_job" U2 INNER JOIN "negated_query_employee" U3 ON (U2."employee_id" = U3."id") WHERE (U3."age" BETWEEN 0 AND 100 AND U2."id" = ("negated_query_job"."id")))))
>>>
>>> in_age_range = Q(position__job__employee__age__range=(0,100))
>>> c_1 = IdealJob.objects.filter(has_job & in_age_range).count()
>>> c_2 = IdealJob.objects.filter(in_age_range & has_job).count()
>>>
>>> # these do not differ
>>> print(c_1, c_2)
1 1
>>> print(connection.queries[-1]['sql'])
SELECT COUNT(*) AS "__count" FROM "negated_query_idealjob" INNER JOIN "negated_query_position" ON ("negated_query_idealjob"."position_id" = "negated_query_position"."id") INNER JOIN "negated_query_job" ON ("negated_query_position"."id" = "negated_query_job"."position_id") INNER JOIN "negated_query_employee" ON ("negated_query_job"."employee_id" = "negated_query_employee"."id") INNER JOIN "negated_query_employee" T5 ON ("negated_query_idealjob"."employee_id" = T5."id") INNER JOIN "negated_query_job" T6 ON (T5."id" = T6."employee_id") WHERE ("negated_query_employee"."age" BETWEEN 0 AND 100 AND T6."id" IS NOT NULL)
>>> print(connection.queries[-2]['sql'])
SELECT COUNT(*) AS "__count" FROM "negated_query_idealjob" INNER JOIN "negated_query_employee" ON ("negated_query_idealjob"."employee_id" = "negated_query_employee"."id") INNER JOIN "negated_query_job" ON ("negated_query_employee"."id" = "negated_query_job"."employee_id") INNER JOIN "negated_query_position" ON ("negated_query_idealjob"."position_id" = "negated_query_position"."id") INNER JOIN "negated_query_job" T5 ON ("negated_query_position"."id" = T5."position_id") INNER JOIN "negated_query_employee" T6 ON (T5."employee_id" = T6."id") WHERE ("negated_query_job"."id" IS NOT NULL AND T6."age" BETWEEN 0 AND 100)
[1]: minimal test case
from negated_query.models import * emp_1 = Employee(age=3) emp_1.save() emp_2 = Employee() emp_2.save() pos_1 = Position() pos_1.save() pos_2 = Position() pos_2.save() pos_3 = Position() pos_3.save() job_0 = Job(employee=None, position=pos_1) job_0.save() job_12 = Job(employee=emp_1, position=pos_2) job_12.save() job_13 = Job(employee=emp_1, position=pos_3) job_13.save() job_22 = Job(employee=emp_2, position=pos_2) job_22.save() ideal_job = IdealJob(position=pos_1, employee=emp_1) ideal_job.save() ideal_job_2 = IdealJob(position=pos_2, employee=emp_2) ideal_job_2.save()
Change History (5)
comment:1 by , 10 years ago
| Component: | Uncategorized → Database layer (models, ORM) |
|---|---|
| Type: | Uncategorized → Bug |
comment:2 by , 10 years ago
| Triage Stage: | Unreviewed → Accepted |
|---|
comment:3 by , 6 years ago
| Cc: | added |
|---|
Hi, I'm working on this issue. to fix it and understand deeply. as far as I understand:), the problem is here. When we add a combine q node and this q contains negated=True clauses, this ~Qs are associated with other Q. I couldn't find why but if I change the code like this: self.children.insert(0, data), I can see the query is fixed.
I'm not sure what the expected behavior is. Fixing this could be tricky and/or break backwards-compatibility. If so, maybe we can document the reason for the discrepancy.