Opened 5 years ago

Last modified 4 years ago

#14131 new Cleanup/optimization

The pagination module should have some limit, or a warning should be given in the documentation

Reported by: mlissner Owned by: nobody
Component: Core (Other) Version: master
Severity: Normal Keywords:
Cc: Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Perhaps I'm off base on this, but I was just analyzing my slow queries log, and I noticed that the pagination module produces some incredibly slow queries in MySQL when high page numbers are requested. This just took down my server, so I was thinking about possible solutions.

The pagination module is using the OFFSET MySQL clause, which, as I understand it, must iterate over every row up to the OFFSET number in order to work.

The two solutions I thought of for this were:

  1. Add something to the documentation that indicates how this might produce very slow queries.
  1. Add code to the pagination module to either:
  1. Solve the problem with smarter MySQL queries (not sure if/how this is possible, sorry); or
  2. Allow a max_pages attribute, which throws some kind of error when the max_page number is exceeded. It could even have a sane default, keeping people within normal limits.

Admittedly, 2b is the solution I'm coding up right now, but it would be nice if Django enforced, or at least hinted at, a solution to this problem.

I've also (FWIW) tested pagination on Google, Yahoo, and Bing, and all three limit their results (Bing: page 20, Google: page 100, and Yahoo: page 100).

Change History (5)

comment:1 Changed 5 years ago by PaulM

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Triage Stage changed from Unreviewed to Accepted

I agree that this should be documented as a potential gotcha.

LIMIT/OFFSET is slow for the reasons you mentioned.

There are a number of ways around this problem. One approach is to index a table and use WHERE/LIMIT rather than LIMIT/OFFSET. I'm not sure where in Django this sort of optimization belongs. On the one hand, it's a problem shared by all queryset slicing operations that produce a LIMIT/OFFSET statement. On the other, it requires specific knowledge about the contents of the table (and an index), so it might be easier to pass the clue in to a custom bit here in Pagination. Or possibly I'm completely off base here.

More discussion of the same issue here:
http://stackoverflow.com/questions/1243952/how-can-i-speed-up-a-mysql-query-with-a-large-offset-in-the-limit-clause

and slides 12 to 14 dicuss the WHERE/LIMIT approach here:
http://www.slideshare.net/Eweaver/efficient-pagination-using-mysql

comment:2 Changed 4 years ago by julien

  • Component changed from Contrib apps to Core framework

comment:3 Changed 4 years ago by julien

  • Severity set to Normal
  • Type set to Cleanup/optimization

comment:4 Changed 3 years ago by aaugustin

  • UI/UX unset

Change UI/UX from NULL to False.

comment:5 Changed 3 years ago by aaugustin

  • Easy pickings unset

Change Easy pickings from NULL to False.

Note: See TracTickets for help on using tickets.
Back to Top