Opened 8 years ago

Closed 8 years ago

Last modified 8 years ago

#26172 closed Cleanup/optimization (wontfix)

Don't query the database when filtering an already empty queryset

Reported by: David Seddon Owned by: nobody
Component: Database layer (models, ORM) Version: 1.9
Severity: Normal Keywords: orm, queryset
Cc: Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

If you have a queryset that has already been evaluated as empty, and you filter (or exclude) it, Django will hit the database again, even though the result will always be an empty queryset.

>>> queryset = MyModel.objects.filter(my_field='foo')
>>> bool(queryset)
# Hits the database - running bool will cache it
False
>>> queryset
[]
>>> queryset.filter(another_field='bar')
# Hits the database
[]
>>> queryset.exclude(another_field='bar')
# Hits the database
[]

The ORM should check to see if the queryset is empty before running any further filters / excludes pointlessly.

Change History (4)

comment:1 by Anssi Kääriäinen, 8 years ago

Resolution: wontfix
Status: newclosed

I don't see this as possible. The reason is that the ORM can't know if the queryset will be empty when the filter() clause is applied. Consider:

Foo.objects.all().delete()
qs = Foo.objects.filter(pk__gte=0)
bool(qs)
Foo.objects.create(pk=1)
qs = qs.filter(pk=1)  # There is a match even if Foo.objects.filter(pk__gte=0) didn't match anything!

comment:2 by David Seddon, 8 years ago

Hmm, I see what you mean, but couldn't you make the same argument for any queryset caching? Consider:

>>> Foo.objects.all().delete()
>>> qs = Foo.objects.all()
>>> bool(qs)
False
>>> Foo.objects.create(pk=1)
>>> qs
# The cached results will be fetched
[]

I guess it depends on what you want the caching strategy to be. On the one hand, returning an empty cached queryset is simply using the cache that's available, which might be the right thing to do in some circumstances. On the other, perhaps whenever filter() or exclude() are applied, it should always invalidate the cache.

comment:3 by Anssi Kääriäinen, 8 years ago

The main reason why we use the caching approach we currently use is that we have done it in this way always.

If we started from scratch, it might be we would opt for an explicit approach - queries are only executed when qs.execute() is called. That way the user would know exactly when the cache is created and cleared. But we are extremely unlikely to change any of this now...

comment:4 by David Seddon, 8 years ago

Fair enough. After all, I guess there is nothing to stop someone making a custom queryset with this behaviour.

Thanks for considering the ticket anyway, it's useful to know the reasoning behind it.

Note: See TracTickets for help on using tickets.
Back to Top