Opened 13 years ago

Last modified 4 years ago

#17276 closed Cleanup/optimization

Slow anti-join query against Postgres — at Initial Version

Reported by: dmitry@… Owned by: nobody
Component: Database layer (models, ORM) Version: 1.3
Severity: Normal Keywords: orm, postgres, join, anti-join
Cc: Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

For background on this ticket please see the following two discussions:

Basically, in a many-to-many mappings between models Student and Course, if I want to find all instances of Students that aren't registered for classes, I would issue the following Django query:

Student.objects.filter(courseisnull=True)

Django translates this into the following query:

SELECT "student"."id", "student"."name"

FROM "student"

LEFT OUTER JOIN "course_students"

ON ("student"."id" = "course_students"."student_id")

LEFT OUTER JOIN "course"

ON ("course_students"."course_id" = "course"."id")

WHERE "course"."id" IS NULL

The problem is that the way the WHERE clause is generated is very inefficient (at least when used with Postgres). Changing WHERE to "course_students"."student_id" IS NOT NULL yields orders of magnitude improved query plan. Here's the difference I'm seeing on real data:

Django way: (cost=1072479.36..6256437.79 rows=1 width=145)
Hand-crafted: (cost=1518.71..1533.35 rows=1 width=145)

I'm attaching a sample project with the model already set up. To see the generated SQL query, simply run "python manage.py anti-join."

Change History (1)

by dmitry@…, 13 years ago

Attachment: AntiJoin.rar added

sample project

Note: See TracTickets for help on using tickets.
Back to Top