Opened 21 months ago

Last modified 21 months ago

#26368 new Bug

Order of &-ing Q objects affects results in edge case

Reported by: Floris den Hengst Owned by: nobody
Component: Database layer (models, ORM) Version: 1.9
Severity: Normal Keywords: Query Q order
Cc: Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Consider this models.py, where every employee has an IdealJob but some Jobs have no Employee (describing vacant Positions):

from django.db import models

# Create your models here.
class Employee(models.Model):
  age = models.IntegerField(null=True)

class Position(models.Model):
  pass

class Job(models.Model):
  employee = models.ForeignKey(Employee, null=True)
  position = models.ForeignKey(Position)

class IdealJob(models.Model):
  employee = models.ForeignKey(Employee)
  position = models.ForeignKey(Position)

It is possible to filter the IdealJobs for specific Employees in the following way:

not_in_age_range = ~Q(position__job__employee__age__range=(0,100))

has_job = Q(employee__job__id__isnull=False)

IdealJob.objects.filter(not_in_age_range & has_job)

or

IdealJob.objects.filter(has_job & not_in_age_range)

When the database contains a Job with no Employee as in [1], the order in which the Q objects are combined matters when one of the Q objects is negated.

Tested with Postgres and SQlite backend:

>>> # from some_app.models import *
>>> from django.db.models import *
>>> from django.db import connection
>>> not_in_age_range = ~Q(position__job__employee__age__range=(0,100))
>>> 
>>> has_job = Q(employee__job__id__isnull=False)
>>> 
>>> c_1 = IdealJob.objects.filter(has_job & not_in_age_range).count()
>>> c_2 = IdealJob.objects.filter(not_in_age_range & has_job).count()
>>> 
>>> # these differ!!!
>>> print(c_1, c_2)
3 2
>>> 
>>> print(connection.queries[-1]['sql'])
SELECT COUNT(*) AS "__count" FROM "negated_query_idealjob" INNER JOIN "negated_query_employee" ON ("negated_query_idealjob"."employee_id" = "negated_query_employee"."id") INNER JOIN "negated_query_job" ON ("negated_query_employee"."id" = "negated_query_job"."employee_id") WHERE (NOT ("negated_query_idealjob"."position_id" IN (SELECT U2."position_id" AS Col1 FROM "negated_query_job" U2 INNER JOIN "negated_query_employee" U3 ON (U2."employee_id" = U3."id") WHERE U3."age" BETWEEN 0 AND 100)) AND "negated_query_job"."id" IS NOT NULL)
>>> print(connection.queries[-2]['sql'])
SELECT COUNT(*) AS "__count" FROM "negated_query_idealjob" INNER JOIN "negated_query_employee" ON ("negated_query_idealjob"."employee_id" = "negated_query_employee"."id") INNER JOIN "negated_query_job" ON ("negated_query_employee"."id" = "negated_query_job"."employee_id") WHERE ("negated_query_job"."id" IS NOT NULL AND NOT ("negated_query_idealjob"."position_id" IN (SELECT U2."position_id" AS Col1 FROM "negated_query_job" U2 INNER JOIN "negated_query_employee" U3 ON (U2."employee_id" = U3."id") WHERE (U3."age" BETWEEN 0 AND 100 AND U2."id" = ("negated_query_job"."id")))))
>>>
>>> in_age_range = Q(position__job__employee__age__range=(0,100))
>>> c_1 = IdealJob.objects.filter(has_job & in_age_range).count()
>>> c_2 = IdealJob.objects.filter(in_age_range & has_job).count()
>>>
>>> # these do not differ
>>> print(c_1, c_2)
1 1
>>> print(connection.queries[-1]['sql'])
SELECT COUNT(*) AS "__count" FROM "negated_query_idealjob" INNER JOIN "negated_query_position" ON ("negated_query_idealjob"."position_id" = "negated_query_position"."id") INNER JOIN "negated_query_job" ON ("negated_query_position"."id" = "negated_query_job"."position_id") INNER JOIN "negated_query_employee" ON ("negated_query_job"."employee_id" = "negated_query_employee"."id") INNER JOIN "negated_query_employee" T5 ON ("negated_query_idealjob"."employee_id" = T5."id") INNER JOIN "negated_query_job" T6 ON (T5."id" = T6."employee_id") WHERE ("negated_query_employee"."age" BETWEEN 0 AND 100 AND T6."id" IS NOT NULL)
>>> print(connection.queries[-2]['sql'])
SELECT COUNT(*) AS "__count" FROM "negated_query_idealjob" INNER JOIN "negated_query_employee" ON ("negated_query_idealjob"."employee_id" = "negated_query_employee"."id") INNER JOIN "negated_query_job" ON ("negated_query_employee"."id" = "negated_query_job"."employee_id") INNER JOIN "negated_query_position" ON ("negated_query_idealjob"."position_id" = "negated_query_position"."id") INNER JOIN "negated_query_job" T5 ON ("negated_query_position"."id" = T5."position_id") INNER JOIN "negated_query_employee" T6 ON (T5."employee_id" = T6."id") WHERE ("negated_query_job"."id" IS NOT NULL AND T6."age" BETWEEN 0 AND 100)

[1]: minimal test case

from negated_query.models import *

emp_1 = Employee(age=3)
emp_1.save()

emp_2 = Employee()
emp_2.save()

pos_1 = Position()
pos_1.save()

pos_2 = Position()
pos_2.save()

pos_3 = Position()
pos_3.save()

job_0 = Job(employee=None, position=pos_1)
job_0.save()

job_12 = Job(employee=emp_1, position=pos_2)
job_12.save()

job_13 = Job(employee=emp_1, position=pos_3)
job_13.save()

job_22 = Job(employee=emp_2, position=pos_2)
job_22.save()

ideal_job = IdealJob(position=pos_1, employee=emp_1)
ideal_job.save()

ideal_job_2 = IdealJob(position=pos_2, employee=emp_2)
ideal_job_2.save()

Change History (2)

comment:1 Changed 21 months ago by Tim Graham

Component: UncategorizedDatabase layer (models, ORM)
Type: UncategorizedBug

I'm not sure what the expected behavior is. Fixing this could be tricky and/or break backwards-compatibility. If so, maybe we can document the reason for the discrepancy.

comment:2 Changed 21 months ago by Tim Graham

Triage Stage: UnreviewedAccepted
Note: See TracTickets for help on using tickets.
Back to Top