Opened 19 years ago

Closed 18 years ago

Last modified 16 years ago

#2256 closed defect (fixed)

queryset.count() does not work correctly with sliced querysets

Reported by: eth_anton@… Owned by: Adrian Holovaty
Component: Database layer (models, ORM) Version:
Severity: normal Keywords: queryset count limit
Cc: Malcolm Tredinnick Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Makes for counter-intuitive behavior when slicing...

From an interactive shell session:

>>> all = NewsItem.objects.all()
>>> all.count()
11L
>>> all[:5].count()
11L
>>> set = NewsItem.objects.all()[:5]
>>> set.count()
11L
>>>

Attachments (4)

queryset_count.patch (1.5 KB ) - added by Chris Beaven 18 years ago.
queryset_count_doc.patch (560 bytes ) - added by Chris Beaven 18 years ago.
Documentation if it's needed
queryset_count.2.patch (1.9 KB ) - added by Chris Beaven 18 years ago.
with tests
queryset_count.3.patch (2.0 KB ) - added by Chris Beaven 18 years ago.

Download all attachments as: .zip

Change History (23)

comment:1 by James Bennett, 19 years ago

Hm.

We're definitely explicitly dropping the limit and offset if they're present, but there's no comment to explain why. When I get a little time, I'll try the test suite with the offending lines commented out (lines 183 and 184 in django/db/models/query.py).

comment:2 by Malcolm Tredinnick, 19 years ago

We should either be raising an error in these cases or behaving as we are now.

Count() returns a single value and it doesn't make sense to apply limit and offset to a singleton list like that (although it works. A non-zero offset is going to lead to disappointment in the number of results you get back, though). It's almost certainly not what the caller intended and it's not going to work as a way to count how many things were returned after the limit and offset are applied.

LIMIT/OFFSET are applied after the result set is determined (and in this case it consists of one item, so it's pointless).

comment:3 by cheap_hydrocodone , 18 years ago

Cc: "" added
Component: Database wrapper
Keywords: "" added
Owner: changed from Adrian Holovaty to anonymous
priority: normal
Severity: normal
Type: defect

comment:4 by Collin Grady <cgrady@…>, 18 years ago

Cc: "" removed
Component: Database wrapper
Keywords: queryset count limit added; "" removed
Owner: changed from anonymous to Adrian Holovaty
priority: normal
Severity: blocker
Type: defect
Version: SVN

fixing spambot edits

comment:5 by Collin Grady <cgrady@…>, 18 years ago

Severity: blockernormal

woops, fixing invalid fix

comment:6 by Chris Beaven, 18 years ago

Component: Database wrapperAdmin interface
priority: normalhighest
Severity: normalblocker
Version: SVN

Malcom: .count() is used to determine proper page length for the paginator so it may be what the caller intended.

My opinion is that .count() should only return up to the maximum length that the sliced queryset could return.

comment:7 by Chris Beaven, 18 years ago

Component: Admin interfaceDatabase wrapper
priority: highestnormal
Severity: blockernormal

oops, refreshed after Collin's changes and reverted his fixes :P

comment:8 by Chris Beaven, 18 years ago

Triage Stage: UnreviewedDesign decision needed

Even if we don't change the existing behaviour, existing code should be commented as to why we are doing it the way we are.

comment:9 by Collin Grady <cgrady@…>, 18 years ago

On a whim I checked this using raw SQL and a LIMIT didn't affect the output of COUNT, so maybe that's why the code currently drops it - not much point adding it if the db won't respect it...

Though if there was a way to apply the limit first, and tell sql to count those without killing performance, it would definitely be more intuitive.

comment:10 by Chris Beaven, 18 years ago

Thanks for checking that Collin.

If we do still want count to take into account the LIMIT/OFFSET, then it should be easy enough to do some basic maths to return the correct value, don't you think?

by Chris Beaven, 18 years ago

Attachment: queryset_count.patch added

by Chris Beaven, 18 years ago

Attachment: queryset_count_doc.patch added

Documentation if it's needed

comment:11 by Chris Beaven, 18 years ago

Has patch: set
Triage Stage: Design decision neededReady for checkin

Malcom's comments aren't making much sense to me - no-one was talking about a singleton list.

It seems pretty clear to me that the limit/offset should be respected for a count, and my patch is sound so I'm marking as ready for checkin.

by Chris Beaven, 18 years ago

Attachment: queryset_count.2.patch added

with tests

comment:12 by Chris Beaven, 18 years ago

Summary: queryset.count() drops LIMIT/OFFSET informationqueryset.count() does not work with sliced querysets

I left the doc patch separate, because I'm not sure we need to explicitly document this change in behaviour in the DB api - the old behaviour was never implied.

comment:13 by Chris Beaven, 18 years ago

Summary: queryset.count() does not work with sliced querysetsqueryset.count() does not work correctly with sliced querysets

comment:14 by Malcolm Tredinnick, 18 years ago

Patch needs improvement: set
Triage Stage: Ready for checkinAccepted

My earlier comment was that count() and limit/offset do not mix at the SQL level. You have worked around that by manually applying those parameters inside the Python code. I think this is counter-intuitive, since it's now no longer SQL-intuitive or Python-intuitive (slicing the single count() result changes the result?! We implemented limit/offset as slices because it was analogous -- you were slicing a sequence of results). But I can see the flip side, too, so I'm not going to veto this.

One problem with the patch is

>>> articles[8:100].count()
-1

(using the same test data setup). You need to make line 224 be something like count = max(0, count - offset). Fix that and add a test and we're good to go, I think.

comment:15 by Chris Beaven, 18 years ago

Maybe I'm not following, but you can't slice a count() result as it does not return a QuerySet.
It seems pretty Python-intuitive that len(articles[:5]) == articles[:5].count().

Thanks for picking up that bug, I'll fix now.

by Chris Beaven, 18 years ago

Attachment: queryset_count.3.patch added

comment:16 by Chris Beaven, 18 years ago

Patch needs improvement: unset
Triage Stage: AcceptedReady for checkin

comment:17 by Malcolm Tredinnick, 18 years ago

Resolution: fixed
Status: newclosed

(In [4488]) #fixed #2256 -- Made count() interact with slicing on QuerySets. Patch from
SmileyChris.

comment:18 by redduck666, 16 years ago

Cc: Malcolm Tredinnick added

In [1]: from django.contrib.auth.models import User

In [2]: User.objects.all()[:0].count()
Out[2]: 4296

In [3]: len(User.objects.all()[:0])
Out[3]: 0

is this considered normal?

comment:19 by Karen Tracey, 16 years ago

Alex has opened #10202 for this.

Note: See TracTickets for help on using tickets.
Back to Top