#26194 closed Uncategorized (duplicate)
WHERE EXISTS / WHERE NOT EXISTS clause without using QuerySet.extra
Reported by: | john-parton | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 1.9 |
Severity: | Normal | Keywords: | QuerySet.extra |
Cc: | Triage Stage: | Unreviewed | |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
I initially posted this to the Django users group here: https://groups.google.com/forum/#!topic/django-users/nBySOT3C27E
Greetings!
I'm trying to refactor a query to avoid using QuerySet.extra (as per the recommendation here: https://docs.djangoproject.com/en/1.9/ref/models/querysets/#extra)
Here's a simplified version of my code:
# testapp.models class Parent(models.Model): pass class Child(models.Model): parent = models.ForeignKey('testapp.Parent')
This is the query I am attempting to replace:
Parent.objects.extra(where=[""" NOT EXISTS (SELECT 1 FROM testapp_child WHERE testapp_child.parent_id = testapp_parent.id) """])
This is extremely similar to, but not the same as
Parent.objects.exclude(id__in=Child.objects.all().values('parent_id'))
Both queries should return the same rows, but the biggest difference is that PostgreSQL's query planner produces completely different plans. The performance difference can be huge, even for a relatively modest data set. (I believe my data set has something like 270k "parent" instances and 80k "child" instances.)
I tried searching and couldn't find a solution to this exact problem.
Thanks for taking the time to read through all of this!
Change History (4)
comment:1 by , 9 years ago
Resolution: | → invalid |
---|---|
Status: | new → closed |
comment:2 by , 9 years ago
Oh, sorry, I missed the mention that you did use our support channels; still, the reason for closing this ticket is exactly described in that link.
comment:3 by , 9 years ago
Resolution: | invalid → duplicate |
---|
I'm not sure the query can be considered equivalent given the LEFT JOIN
will return duplicate rows for parent with multiple children.
Parent.objects.filter(child=None).distinct()
should return the same results but I wouldn't be surprised if it performed worse than the original EXISTS
query.
I'll re-close as duplicate of #25789 instead.
comment:4 by , 9 years ago
Thanks for looking into this.
Implementing a feature like .filter(models.Exists(somerel__col=val))
as mentioned by akaariai in reference to #25789 would fix my issue.
I tried Parent.objects.filter(child=None).distinct()
and it resulted in a query that was about 3 times as slow as the NOT EXISTS
query. This is definitely better than my attempt with the subquery (which was about 2000 times as slow), but it's still irksome that I can't easily compose the query using the ORM.
I couldn't find #25789 because I was looking for tickets with the "QuerySet.extra" keyword. Can we add that keyword to #25789?
Consider using the reverse relation:
It should create a left join, which should perform comparably with the
EXISTS
query.Also, please consider using our support channels next time.