Code

Opened 7 years ago

Closed 7 years ago

#3541 closed (fixed)

queryset.count() does not use _result_cache

Reported by: webograph@… Owned by: adrian
Component: Database layer (models, ORM) Version: master
Severity: Keywords:
Cc: Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: UI/UX:

Description

when using .count() on an already executed queryset, the query is evaluated again instead of just checking the length of the _result_cache.

it can be worked around by using len() on queries known to have been executed, in which case len() performs better then .count().

Attachments (2)

cache_count.patch (1.0 KB) - added by SmileyChris 7 years ago.
faster_count.patch (764 bytes) - added by SmileyChris 7 years ago.

Download all attachments as: .zip

Change History (9)

Changed 7 years ago by SmileyChris

comment:1 Changed 7 years ago by SmileyChris

  • Has patch set
  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Triage Stage changed from Unreviewed to Ready for checkin

Bump back if you think it needs tests, but it's working fine for me and the code is pretty straight-forward.

comment:2 Changed 7 years ago by Benjamin Slavin

  • Triage Stage changed from Ready for checkin to Design decision needed

-0 from me.

If you want to know the number of objects, I'd recommend sticking to len(QS). Otherwise, there's no guarantee that the result from .count() is the number of objects in the QS you have.

There are ways to fix this, but I think that the current method is preferable.

Here's an example of the problem:

>>> a = Foo.objects.all()
>>> b = Foo.objects.all()
>>> a
[<Foo1>, <Foo2>]
>>> a.count()
2L
>>> b.count()
2L
......Something is added to the DB......
>>> b
[<Foo1>, <Foo2>, <Foo3>]
>>> b.count()
2L
>>> len(b)
3

I'm changing this to 'design decision needed'.

comment:3 Changed 7 years ago by SmileyChris

  • Triage Stage changed from Design decision needed to Ready for checkin

You're right that caching the count is probably overly aggressive. Here's a patch which doesn't cache but simply does what the original ticket suggests.

b.count() should always match len(_result_cache) now.

Changed 7 years ago by SmileyChris

comment:4 Changed 7 years ago by SmileyChris

Behaviour now:

>>> a = Foo.objects.all()
>>> b = Foo.objects.all()
>>> a
[<Foo1>, <Foo2>]
>>> a.count()
2L
>>> b.count()
2L

Something is added to the DB.
>>> b
[<Foo1>, <Foo2>, <Foo3>]
>>> b.count()
3
>>> len(b)
3

There is a change from current functionality with this patch: a is still cached so a.count() will now be 2 whereas previously it would use another SQL query and return 3L.

comment:5 Changed 7 years ago by Benjamin Slavin

I think that this is better than the original patch. I still personally prefer just using len(QS), but I change my vote to +0.

My only remaining concern if this patch is introduced is the varied data types of the return value... It may be worth putting a typecast in there so the return type is predictably 'long'.

comment:6 Changed 7 years ago by SmileyChris

Supposedly, the type .count() returns depends on the database so it's not predictable anyway.

comment:7 Changed 7 years ago by jacob

  • Resolution set to fixed
  • Status changed from new to closed

(In [4561]) Fixed #3541: queryset.count() now respects the queryset cache.

Add Comment

Modify Ticket

Change Properties
<Author field>
Action
as closed
as The resolution will be set. Next status will be 'closed'
The resolution will be deleted. Next status will be 'new'
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.