#26194 closed Uncategorized (duplicate)
WHERE EXISTS / WHERE NOT EXISTS clause without using QuerySet.extra
| Reported by: | john-parton | Owned by: | nobody |
|---|---|---|---|
| Component: | Database layer (models, ORM) | Version: | 1.9 |
| Severity: | Normal | Keywords: | QuerySet.extra |
| Cc: | Triage Stage: | Unreviewed | |
| Has patch: | no | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description
I initially posted this to the Django users group here: https://groups.google.com/forum/#!topic/django-users/nBySOT3C27E
Greetings!
I'm trying to refactor a query to avoid using QuerySet.extra (as per the recommendation here: https://docs.djangoproject.com/en/1.9/ref/models/querysets/#extra)
Here's a simplified version of my code:
# testapp.models
class Parent(models.Model):
pass
class Child(models.Model):
parent = models.ForeignKey('testapp.Parent')
This is the query I am attempting to replace:
Parent.objects.extra(where=["""
NOT EXISTS (SELECT 1 FROM testapp_child WHERE testapp_child.parent_id = testapp_parent.id)
"""])
This is extremely similar to, but not the same as
Parent.objects.exclude(id__in=Child.objects.all().values('parent_id'))
Both queries should return the same rows, but the biggest difference is that PostgreSQL's query planner produces completely different plans. The performance difference can be huge, even for a relatively modest data set. (I believe my data set has something like 270k "parent" instances and 80k "child" instances.)
I tried searching and couldn't find a solution to this exact problem.
Thanks for taking the time to read through all of this!
Change History (4)
comment:1 by , 10 years ago
| Resolution: | → invalid |
|---|---|
| Status: | new → closed |
comment:2 by , 10 years ago
Oh, sorry, I missed the mention that you did use our support channels; still, the reason for closing this ticket is exactly described in that link.
comment:3 by , 10 years ago
| Resolution: | invalid → duplicate |
|---|
I'm not sure the query can be considered equivalent given the LEFT JOIN will return duplicate rows for parent with multiple children.
Parent.objects.filter(child=None).distinct() should return the same results but I wouldn't be surprised if it performed worse than the original EXISTS query.
I'll re-close as duplicate of #25789 instead.
comment:4 by , 10 years ago
Thanks for looking into this.
Implementing a feature like .filter(models.Exists(somerel__col=val)) as mentioned by akaariai in reference to #25789 would fix my issue.
I tried Parent.objects.filter(child=None).distinct() and it resulted in a query that was about 3 times as slow as the NOT EXISTS query. This is definitely better than my attempt with the subquery (which was about 2000 times as slow), but it's still irksome that I can't easily compose the query using the ORM.
I couldn't find #25789 because I was looking for tickets with the "QuerySet.extra" keyword. Can we add that keyword to #25789?
Consider using the reverse relation:
It should create a left join, which should perform comparably with the
EXISTSquery.Also, please consider using our support channels next time.