Opened 10 years ago

Closed 10 years ago

Last modified 8 years ago

#2256 closed defect (fixed)

queryset.count() does not work correctly with sliced querysets

Reported by: eth_anton@… Owned by: Adrian Holovaty
Component: Database layer (models, ORM) Version:
Severity: normal Keywords: queryset count limit
Cc: Malcolm Tredinnick Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: UI/UX:

Description

Makes for counter-intuitive behavior when slicing...

From an interactive shell session:

>>> all = NewsItem.objects.all()
>>> all.count()
11L
>>> all[:5].count()
11L
>>> set = NewsItem.objects.all()[:5]
>>> set.count()
11L
>>>

Attachments (4)

queryset_count.patch (1.5 KB) - added by Chris Beaven 10 years ago.
queryset_count_doc.patch (560 bytes) - added by Chris Beaven 10 years ago.
Documentation if it's needed
queryset_count.2.patch (1.9 KB) - added by Chris Beaven 10 years ago.
with tests
queryset_count.3.patch (2.0 KB) - added by Chris Beaven 10 years ago.

Download all attachments as: .zip

Change History (23)

comment:1 Changed 10 years ago by James Bennett

Hm.

We're definitely explicitly dropping the limit and offset if they're present, but there's no comment to explain why. When I get a little time, I'll try the test suite with the offending lines commented out (lines 183 and 184 in django/db/models/query.py).

comment:2 Changed 10 years ago by Malcolm Tredinnick

We should either be raising an error in these cases or behaving as we are now.

Count() returns a single value and it doesn't make sense to apply limit and offset to a singleton list like that (although it works. A non-zero offset is going to lead to disappointment in the number of results you get back, though). It's almost certainly not what the caller intended and it's not going to work as a way to count how many things were returned after the limit and offset are applied.

LIMIT/OFFSET are applied after the result set is determined (and in this case it consists of one item, so it's pointless).

comment:3 Changed 10 years ago by cheap_hydrocodone

Cc: "" added
Component: Database wrapper
Keywords: "" added
Owner: changed from Adrian Holovaty to anonymous
priority: normal
Severity: normal
Type: defect

comment:4 Changed 10 years ago by Collin Grady <cgrady@…>

Cc: "" removed
Component: Database wrapper
Keywords: queryset count limit added; "" removed
Owner: changed from anonymous to Adrian Holovaty
priority: normal
Severity: blocker
Type: defect
Version: SVN

fixing spambot edits

comment:5 Changed 10 years ago by Collin Grady <cgrady@…>

Severity: blockernormal

woops, fixing invalid fix

comment:6 Changed 10 years ago by Chris Beaven

Component: Database wrapperAdmin interface
priority: normalhighest
Severity: normalblocker
Version: SVN

Malcom: .count() is used to determine proper page length for the paginator so it may be what the caller intended.

My opinion is that .count() should only return up to the maximum length that the sliced queryset could return.

comment:7 Changed 10 years ago by Chris Beaven

Component: Admin interfaceDatabase wrapper
priority: highestnormal
Severity: blockernormal

oops, refreshed after Collin's changes and reverted his fixes :P

comment:8 Changed 10 years ago by Chris Beaven

Triage Stage: UnreviewedDesign decision needed

Even if we don't change the existing behaviour, existing code should be commented as to why we are doing it the way we are.

comment:9 Changed 10 years ago by Collin Grady <cgrady@…>

On a whim I checked this using raw SQL and a LIMIT didn't affect the output of COUNT, so maybe that's why the code currently drops it - not much point adding it if the db won't respect it...

Though if there was a way to apply the limit first, and tell sql to count those without killing performance, it would definitely be more intuitive.

comment:10 Changed 10 years ago by Chris Beaven

Thanks for checking that Collin.

If we do still want count to take into account the LIMIT/OFFSET, then it should be easy enough to do some basic maths to return the correct value, don't you think?

Changed 10 years ago by Chris Beaven

Attachment: queryset_count.patch added

Changed 10 years ago by Chris Beaven

Attachment: queryset_count_doc.patch added

Documentation if it's needed

comment:11 Changed 10 years ago by Chris Beaven

Has patch: set
Triage Stage: Design decision neededReady for checkin

Malcom's comments aren't making much sense to me - no-one was talking about a singleton list.

It seems pretty clear to me that the limit/offset should be respected for a count, and my patch is sound so I'm marking as ready for checkin.

Changed 10 years ago by Chris Beaven

Attachment: queryset_count.2.patch added

with tests

comment:12 Changed 10 years ago by Chris Beaven

Summary: queryset.count() drops LIMIT/OFFSET informationqueryset.count() does not work with sliced querysets

I left the doc patch separate, because I'm not sure we need to explicitly document this change in behaviour in the DB api - the old behaviour was never implied.

comment:13 Changed 10 years ago by Chris Beaven

Summary: queryset.count() does not work with sliced querysetsqueryset.count() does not work correctly with sliced querysets

comment:14 Changed 10 years ago by Malcolm Tredinnick

Patch needs improvement: set
Triage Stage: Ready for checkinAccepted

My earlier comment was that count() and limit/offset do not mix at the SQL level. You have worked around that by manually applying those parameters inside the Python code. I think this is counter-intuitive, since it's now no longer SQL-intuitive or Python-intuitive (slicing the single count() result changes the result?! We implemented limit/offset as slices because it was analogous -- you were slicing a sequence of results). But I can see the flip side, too, so I'm not going to veto this.

One problem with the patch is

>>> articles[8:100].count()
-1

(using the same test data setup). You need to make line 224 be something like count = max(0, count - offset). Fix that and add a test and we're good to go, I think.

comment:15 Changed 10 years ago by Chris Beaven

Maybe I'm not following, but you can't slice a count() result as it does not return a QuerySet.
It seems pretty Python-intuitive that len(articles[:5]) == articles[:5].count().

Thanks for picking up that bug, I'll fix now.

Changed 10 years ago by Chris Beaven

Attachment: queryset_count.3.patch added

comment:16 Changed 10 years ago by Chris Beaven

Patch needs improvement: unset
Triage Stage: AcceptedReady for checkin

comment:17 Changed 10 years ago by Malcolm Tredinnick

Resolution: fixed
Status: newclosed

(In [4488]) #fixed #2256 -- Made count() interact with slicing on QuerySets. Patch from
SmileyChris.

comment:18 Changed 8 years ago by redduck666

Cc: Malcolm Tredinnick added

In [1]: from django.contrib.auth.models import User

In [2]: User.objects.all()[:0].count()
Out[2]: 4296

In [3]: len(User.objects.all()[:0])
Out[3]: 0

is this considered normal?

comment:19 Changed 8 years ago by Karen Tracey

Alex has opened #10202 for this.

Note: See TracTickets for help on using tickets.
Back to Top