Code

Opened 6 years ago

Last modified 2 months ago

#8408 assigned New feature

Add a new meta option: don't do count(*) in admin

Reported by: lidaobing Owned by: oinopion
Component: contrib.admin Version: master
Severity: Normal Keywords:
Cc: toke, ludo, marcoberi@…, djfische@…, boxm@…, kmike84@…, slav0nic0@…, dan.fairs@… Triage Stage: Accepted
Has patch: yes Needs documentation: yes
Needs tests: yes Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Hello,

everytime when I use Admin to view a table, the Admin count the rows in the table. if the database is mysql+innodb, this step is very slow, how about add an option in Meta to prevent this, such as

class A(models.Model):
    class Meta:
        do_not_count = True

thanks

Attachments (1)

paginator-patch.diff (13.2 KB) - added by boxm 3 years ago.
Patch admin classes to add UncountedPaginator

Download all attachments as: .zip

Change History (37)

comment:1 Changed 6 years ago by mtredinnick

  • milestone set to post-1.0
  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Triage Stage changed from Unreviewed to Design decision needed

I can't see that this is likely to happen. Firstly, it would be an admin option, not a meta option. Secondly, even with tens of thousands of rows in a test table here, the count takes far less than a second, so compared to the time for collecting the display data, rendering it and returning the page, it isn't that large a time slice.

How many rows are in this table you are seeing the problem with?

Right now I'm -1 on this as being such an edge case as to be not worth an option. If counting the rows in a table is too slow to be used, then there are going to be plenty of other options querying that table as well.

comment:2 Changed 6 years ago by lidaobing

we have tens of millions rows in this table. and we want to use django Admin as a customizable viewer.

comment:3 Changed 6 years ago by adrian

  • Triage Stage changed from Design decision needed to Accepted

I like this optional optimization for people who don't want the admin to do a "select count(*)" on every changelist view, but I'm not sure whether we could implement it without throwing out some valuable parts of the changelist (mostly, the pagination). So, I'm marking this as "accepted," with the caveat that we need to look at what parts of the changelist will be thrown out.

comment:4 Changed 5 years ago by soroosh

This can be done without losing any functionality by setting a limit for count like this:
SELECT count(*) from (select id from HUGE_TABLE LIMIT 10000) as b;
Then if the raws count is more than for example 10000, SQL server stops sequential scanning through the table and paginatior can list pages around current page but not the last page. On my system postgreql takes 60 seconds to return count of a table with 11,000,000 raws and this is suboptimal to wait a minute for each admin page to load.

comment:5 Changed 5 years ago by anonymous

  • milestone post-1.0 deleted

Milestone post-1.0 deleted

comment:6 Changed 5 years ago by mrts

  • milestone set to 1.2

Having problems with large datasets in admin myself (not sure if count() is the only reason though), proposing this for 1.2.

comment:7 Changed 5 years ago by gonz

  • Cc gonz added

comment:8 Changed 5 years ago by toke

  • Cc toke added

comment:9 Changed 4 years ago by ludo

  • Cc ludo added

comment:10 Changed 4 years ago by stephanhoyer

The count(*) is essential for displaying the paginator. Without knowing the amount of entities of a table, there can be any well working paginator.
One solution could be to display only "prev/next"-buttons and input field for page.

But i think the effort of fixing this is much higher then the benefit

So -1 for this request.

comment:11 Changed 4 years ago by anonymous

aside from the count problem, LIMIT item_count OFFSET 25000 (maybe slower numbers too already) is damn slow in MySQL ;)

comment:12 Changed 4 years ago by ubernostrum

  • milestone 1.2 deleted

1.2 is feature-frozen, moving this feature request off the milestone.

comment:13 Changed 4 years ago by marcob

See #13643 for an ugly-monkey-patching workaround.

comment:14 Changed 4 years ago by marcob

  • Cc marcoberi@… added

comment:15 Changed 3 years ago by davidfischer

  • Cc djfische@… added

comment:16 follow-up: Changed 3 years ago by kmike

I think this can be marked as 'wontfix' because custom paginators can be used in admin now:
http://docs.djangoproject.com/en/dev/ref/contrib/admin/#django.contrib.admin.ModelAdmin.paginator

comment:17 in reply to: ↑ 16 ; follow-up: Changed 3 years ago by lrekucki

  • Resolution set to wontfix
  • Status changed from new to closed

Replying to kmike:

I think this can be marked as 'wontfix' because custom paginators can be used in admin now:
http://docs.djangoproject.com/en/dev/ref/contrib/admin/#django.contrib.admin.ModelAdmin.paginator

Agreed :) Here's a link to a sample Paginator that doesn't use count

comment:18 in reply to: ↑ 17 Changed 3 years ago by boxm

  • Cc boxm@… added
  • Easy pickings unset
  • Resolution wontfix deleted
  • Severity set to Normal
  • Status changed from closed to reopened
  • Type set to Uncategorized
  • Version changed from 0.96 to SVN

Replying to lrekucki:

Replying to kmike:

I think this can be marked as 'wontfix' because custom paginators can be used in admin now:
http://docs.djangoproject.com/en/dev/ref/contrib/admin/#django.contrib.admin.ModelAdmin.paginator

Agreed :) Here's a link to a sample Paginator that doesn't use count

Reopening, as the resolution simply doesn't work. The example paginator given by kmike simply doesn't work as a ModelAdmin paginator, because the {% pagination %} tag and the ChangeList class (among others) assume you can count the number of items being paginated. See my comment on #4065 for the gory details.

There are scenarios where you don't know how many results are being paginated, but can still do useful pagination links:

  • With InnoDB tables in MySQL, count(*) is frighteningly expensive, so while in principle you can count, in practice you can't
  • With NoSQL databases it can be impossible to do a count, but you can still paginate by range queries

A proposed design for such a paginator would be to display (on page 12):

1 2 3 ..... 10 11 12 ... Next

which is better than just next/prev.

I'll see if I can create acceptable patches.

Changed 3 years ago by boxm

Patch admin classes to add UncountedPaginator

comment:19 Changed 3 years ago by boxm

  • Has patch set
  • Needs documentation set
  • Needs tests set
  • Triage Stage changed from Accepted to Design decision needed

OK, here's an attempt at a patch that fixes this.

I've added a new UncountedPaginator to pagination.py, and updated the admin code to check whether the paginator has the appropriate attributes before using them, falling back on alternative implementations if not.

Before I create test & doc updates, would like some feedback on whether this patch is going about things the right way and is something that might be accepted.

comment:20 Changed 3 years ago by julien

  • Type changed from Uncategorized to New feature
  • UI/UX unset

comment:21 Changed 3 years ago by charles@…

This is not just MySQL. PostgreSQL has a separate query for cheap, inexact counts (with statistics collected during the most recent ANALYZE) -- but exact counts are expensive; I have a table where a SELECT COUNT(*) is typically running upward of 50 seconds.

Perhaps on PostgreSQL we could run a query like select reltuples from pg_class where relname=?;, and use the traditional pagination system if the result is under 20,000 or so, or a newer one otherwise?

See also http://wiki.postgresql.org/wiki/Slow_Counting

comment:22 Changed 2 years ago by kmike

  • Cc kmike84@… added

comment:23 Changed 2 years ago by adamnelson

  • Triage Stage changed from Design decision needed to Accepted

Switching back to 'accepted' - I think the switch to DDN was in error and simply sent this back into the icebox even though it was previously accepted by a core dev and has a draft patch.

comment:24 Changed 2 years ago by anonymous

The issue is not restricted to pagination. Seems like the admin interface calls count(*) at several other places.

comment:25 Changed 2 years ago by gonz

  • Cc gonz removed

comment:26 Changed 2 years ago by nova77

Apparently count(*) is called on the whole table even if you filter your view. For instance if you add "?idin=555,666", you'll have two select count(*): one with the "where" clause and a plain one which runs on the whole table.

comment:27 Changed 2 years ago by akaariai

Can you track down where the unrestricted qs.count() is issued, and if there is a reason for it?

comment:28 Changed 2 years ago by aaugustin

The admin displays the number of matching objects and the total number of objects.

comment:29 Changed 22 months ago by slav0nic

  • Cc slav0nic0@… added

comment:30 Changed 15 months ago by danfairs

  • Cc dan.fairs@… added

comment:31 Changed 15 months ago by bb@…

+1 because I ran into this problem today.

Using PostgreSQL and django 1.4 with a table that contains about 15 million rows. executing "select count(*) from mytable" in the psql console takes about 40 seconds, even with an index on the primary key. This leads to timeouts and huge lag.

I'm going to try MySQL to see if this is any better.

comment:32 Changed 13 months ago by aaugustin

  • Status changed from reopened to new

comment:33 Changed 11 months ago by craig.labenz@…

Using a table of 130M+ rows and thus obviously running into the same problem. The COUNT(*) query takes literally 10+ minutes to run in InnoDB-land.

comment:34 Changed 6 months ago by jonathan_livni

When django is hosted on Heroku, there's a time limit of 30 seconds for any web request. So instead of admin pages taking a very long time to load, you get admin pages which never load.

http://s2.postimg.org/onfy59xuh/Capture.jpg

comment:35 Changed 2 months ago by oinopion

  • Owner changed from nobody to oinopion
  • Status changed from new to assigned
  • Summary changed from add a new meta option: don't do count(*) in admin to Add a new meta option: don't do count(*) in admin

comment:36 Changed 2 months ago by oinopion

I'm working on this feature in my branch here: https://github.com/oinopion/django/compare/nocountpagination

Add Comment

Modify Ticket

Change Properties
<Author field>
Action
as assigned
The owner will be changed from oinopion to anonymous. Next status will be 'assigned'
The ticket will be disowned. Next status will be 'new'
as The resolution will be set. Next status will be 'closed'
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.