Opened 10 years ago

Closed 10 years ago

#3541 closed (fixed)

queryset.count() does not use _result_cache

Reported by: webograph@… Owned by: Adrian Holovaty
Component: Database layer (models, ORM) Version: master
Severity: Keywords:
Cc: Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: UI/UX:

Description

when using .count() on an already executed queryset, the query is evaluated again instead of just checking the length of the _result_cache.

it can be worked around by using len() on queries known to have been executed, in which case len() performs better then .count().

Attachments (2)

cache_count.patch (1.0 KB) - added by Chris Beaven 10 years ago.
faster_count.patch (764 bytes) - added by Chris Beaven 10 years ago.

Download all attachments as: .zip

Change History (9)

Changed 10 years ago by Chris Beaven

Attachment: cache_count.patch added

comment:1 Changed 10 years ago by Chris Beaven

Has patch: set
Needs documentation: unset
Needs tests: unset
Patch needs improvement: unset
Triage Stage: UnreviewedReady for checkin

Bump back if you think it needs tests, but it's working fine for me and the code is pretty straight-forward.

comment:2 Changed 10 years ago by Benjamin Slavin

Triage Stage: Ready for checkinDesign decision needed

-0 from me.

If you want to know the number of objects, I'd recommend sticking to len(QS). Otherwise, there's no guarantee that the result from .count() is the number of objects in the QS you have.

There are ways to fix this, but I think that the current method is preferable.

Here's an example of the problem:

>>> a = Foo.objects.all()
>>> b = Foo.objects.all()
>>> a
[<Foo1>, <Foo2>]
>>> a.count()
2L
>>> b.count()
2L
......Something is added to the DB......
>>> b
[<Foo1>, <Foo2>, <Foo3>]
>>> b.count()
2L
>>> len(b)
3

I'm changing this to 'design decision needed'.

comment:3 Changed 10 years ago by Chris Beaven

Triage Stage: Design decision neededReady for checkin

You're right that caching the count is probably overly aggressive. Here's a patch which doesn't cache but simply does what the original ticket suggests.

b.count() should always match len(_result_cache) now.

Changed 10 years ago by Chris Beaven

Attachment: faster_count.patch added

comment:4 Changed 10 years ago by Chris Beaven

Behaviour now:

>>> a = Foo.objects.all()
>>> b = Foo.objects.all()
>>> a
[<Foo1>, <Foo2>]
>>> a.count()
2L
>>> b.count()
2L

Something is added to the DB.
>>> b
[<Foo1>, <Foo2>, <Foo3>]
>>> b.count()
3
>>> len(b)
3

There is a change from current functionality with this patch: a is still cached so a.count() will now be 2 whereas previously it would use another SQL query and return 3L.

comment:5 Changed 10 years ago by Benjamin Slavin

I think that this is better than the original patch. I still personally prefer just using len(QS), but I change my vote to +0.

My only remaining concern if this patch is introduced is the varied data types of the return value... It may be worth putting a typecast in there so the return type is predictably 'long'.

comment:6 Changed 10 years ago by Chris Beaven

Supposedly, the type .count() returns depends on the database so it's not predictable anyway.

comment:7 Changed 10 years ago by Jacob

Resolution: fixed
Status: newclosed

(In [4561]) Fixed #3541: queryset.count() now respects the queryset cache.

Note: See TracTickets for help on using tickets.
Back to Top