Django

Code

Ticket #2256 (closed: fixed)

Opened 2 years ago

Last modified 2 years ago

queryset.count() does not work correctly with sliced querysets

Reported by: eth_anton@yahoo.com Assigned to: adrian
Milestone: Component: Database layer (models, ORM)
Version: Keywords: queryset count limit
Cc: Triage Stage: Ready for checkin
Has patch: 1 Needs documentation: 0
Needs tests: 0 Patch needs improvement: 0

Description

Makes for counter-intuitive behavior when slicing...

From an interactive shell session:

>>> all = NewsItem.objects.all()
>>> all.count()
11L
>>> all[:5].count()
11L
>>> set = NewsItem.objects.all()[:5]
>>> set.count()
11L
>>>

Attachments

queryset_count.patch (1.5 kB) - added by SmileyChris on 02/11/07 17:04:00.
queryset_count_doc.patch (0.5 kB) - added by SmileyChris on 02/11/07 17:04:42.
Documentation if it's needed
queryset_count.2.patch (1.9 kB) - added by SmileyChris on 02/11/07 17:16:26.
with tests
queryset_count.3.patch (2.0 kB) - added by SmileyChris on 02/11/07 17:46:38.

Change History

06/28/06 15:35:13 changed by ubernostrum

Hm.

We're definitely explicitly dropping the limit and offset if they're present, but there's no comment to explain why. When I get a little time, I'll try the test suite with the offending lines commented out (lines 183 and 184 in django/db/models/query.py).

07/19/06 01:09:29 changed by mtredinnick

We should either be raising an error in these cases or behaving as we are now.

Count() returns a single value and it doesn't make sense to apply limit and offset to a singleton list like that (although it works. A non-zero offset is going to lead to disappointment in the number of results you get back, though). It's almost certainly not what the caller intended and it's not going to work as a way to count how many things were returned after the limit and offset are applied.

LIMIT/OFFSET are applied after the result set is determined (and in this case it consists of one item, so it's pointless).

11/06/06 08:35:45 changed by cheap_hydrocodone

  • severity deleted.
  • cc set to "".
  • component deleted.
  • priority deleted.
  • owner changed from adrian to anonymous.
  • keywords set to "".
  • type deleted.

11/09/06 15:51:40 changed by Collin Grady <cgrady@the-magi.us>

  • severity set to blocker.
  • cc deleted.
  • component set to Database wrapper.
  • priority set to normal.
  • owner changed from anonymous to adrian.
  • version set to SVN.
  • keywords changed from "" to queryset count limit.
  • type set to defect.

fixing spambot edits

11/09/06 15:52:10 changed by Collin Grady <cgrady@the-magi.us>

  • severity changed from blocker to normal.

woops, fixing invalid fix

11/09/06 15:54:53 changed by SmileyChris

  • priority changed from normal to highest.
  • version deleted.
  • component changed from Database wrapper to Admin interface.
  • severity changed from normal to blocker.

Malcom: .count() is used to determine proper page length for the paginator so it may be what the caller intended.

My opinion is that .count() should only return up to the maximum length that the sliced queryset could return.

11/09/06 15:55:59 changed by SmileyChris

  • priority changed from highest to normal.
  • component changed from Admin interface to Database wrapper.
  • severity changed from blocker to normal.

oops, refreshed after Collin's changes and reverted his fixes :P

02/08/07 22:08:21 changed by SmileyChris

  • stage changed from Unreviewed to Design decision needed.

Even if we don't change the existing behaviour, existing code should be commented as to why we are doing it the way we are.

02/08/07 22:17:50 changed by Collin Grady <cgrady@the-magi.us>

On a whim I checked this using raw SQL and a LIMIT didn't affect the output of COUNT, so maybe that's why the code currently drops it - not much point adding it if the db won't respect it...

Though if there was a way to apply the limit first, and tell sql to count those without killing performance, it would definitely be more intuitive.

02/08/07 23:35:54 changed by SmileyChris

Thanks for checking that Collin.

If we do still want count to take into account the LIMIT/OFFSET, then it should be easy enough to do some basic maths to return the correct value, don't you think?

02/11/07 17:04:00 changed by SmileyChris

  • attachment queryset_count.patch added.

02/11/07 17:04:42 changed by SmileyChris

  • attachment queryset_count_doc.patch added.

Documentation if it's needed

02/11/07 17:16:00 changed by SmileyChris

  • has_patch set to 1.
  • stage changed from Design decision needed to Ready for checkin.

Malcom's comments aren't making much sense to me - no-one was talking about a singleton list.

It seems pretty clear to me that the limit/offset should be respected for a count, and my patch is sound so I'm marking as ready for checkin.

02/11/07 17:16:26 changed by SmileyChris

  • attachment queryset_count.2.patch added.

with tests

02/11/07 17:22:40 changed by SmileyChris

  • summary changed from queryset.count() drops LIMIT/OFFSET information to queryset.count() does not work with sliced querysets.

I left the doc patch separate, because I'm not sure we need to explicitly document this change in behaviour in the DB api - the old behaviour was never implied.

02/11/07 17:23:00 changed by SmileyChris

  • summary changed from queryset.count() does not work with sliced querysets to queryset.count() does not work correctly with sliced querysets.

02/11/07 17:35:24 changed by mtredinnick

  • needs_better_patch set to 1.
  • stage changed from Ready for checkin to Accepted.

My earlier comment was that count() and limit/offset do not mix at the SQL level. You have worked around that by manually applying those parameters inside the Python code. I think this is counter-intuitive, since it's now no longer SQL-intuitive or Python-intuitive (slicing the single count() result changes the result?! We implemented limit/offset as slices because it was analogous -- you were slicing a sequence of results). But I can see the flip side, too, so I'm not going to veto this.

One problem with the patch is

>>> articles[8:100].count()
-1

(using the same test data setup). You need to make line 224 be something like count = max(0, count - offset). Fix that and add a test and we're good to go, I think.

02/11/07 17:44:05 changed by SmileyChris

Maybe I'm not following, but you can't slice a count() result as it does not return a QuerySet. It seems pretty Python-intuitive that len(articles[:5]) == articles[:5].count().

Thanks for picking up that bug, I'll fix now.

02/11/07 17:46:38 changed by SmileyChris

  • attachment queryset_count.3.patch added.

02/11/07 17:47:55 changed by SmileyChris

  • needs_better_patch deleted.
  • stage changed from Accepted to Ready for checkin.

02/11/07 18:16:17 changed by mtredinnick

  • status changed from new to closed.
  • resolution set to fixed.

(In [4488]) #fixed #2256 -- Made count() interact with slicing on QuerySets?. Patch from SmileyChris?.


Add/Change #2256 (queryset.count() does not work correctly with sliced querysets)




Change Properties
Action