id summary reporter owner description type status component version severity resolution keywords cc stage has_patch needs_docs needs_tests needs_better_patch easy ui_ux 17276 Slow anti-join query against Postgres dmitry@… nobody "For background on this ticket please see the following two discussions: * http://stackoverflow.com/questions/8190474/postgres-slow-outer-query-when-using-a-primary-key * https://groups.google.com/group/django-users/browse_thread/thread/7d13f2d8748b4f9f Basically, in a many-to-many mappings between models Student and Course, if I want to find all instances of Students that aren't registered for classes, I would issue the following Django query: {{{ Student.objects.filter(course__isnull=True) }}} Django translates this into the following query: {{{ SELECT ""student"".""id"", ""student"".""name"" FROM ""student"" LEFT OUTER JOIN ""course_students"" ON (""student"".""id"" = ""course_students"".""student_id"") LEFT OUTER JOIN ""course"" ON (""course_students"".""course_id"" = ""course"".""id"") WHERE ""course"".""id"" IS NULL }}} The problem is that the way the WHERE clause is generated is very inefficient (at least when used with Postgres). Changing WHERE to ""course_students"".""student_id"" IS NOT NULL yields orders of magnitude improved query plan. Here's the difference I'm seeing on real data: * Django way: (cost=1072479.36..6256437.79 rows=1 width=145) * Hand-crafted: (cost=1518.71..1533.35 rows=1 width=145) I'm attaching a sample project with the model already set up. To see the generated SQL query, simply run ""python manage.py anti-join.""" Cleanup/optimization closed Database layer (models, ORM) 1.3 Normal fixed orm, postgres, join, anti-join Accepted 0 0 0 0 0 0