Opened 6 years ago

Closed 4 months ago

#14131 closed Cleanup/optimization (fixed)

The pagination module should have some limit, or a warning should be given in the documentation

Reported by: mlissner Owned by: winsmith
Component: Documentation Version: master
Severity: Normal Keywords:
Cc: winsmith@… Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Perhaps I'm off base on this, but I was just analyzing my slow queries log, and I noticed that the pagination module produces some incredibly slow queries in MySQL when high page numbers are requested. This just took down my server, so I was thinking about possible solutions.

The pagination module is using the OFFSET MySQL clause, which, as I understand it, must iterate over every row up to the OFFSET number in order to work.

The two solutions I thought of for this were:

  1. Add something to the documentation that indicates how this might produce very slow queries.
  1. Add code to the pagination module to either:
  1. Solve the problem with smarter MySQL queries (not sure if/how this is possible, sorry); or
  2. Allow a max_pages attribute, which throws some kind of error when the max_page number is exceeded. It could even have a sane default, keeping people within normal limits.

Admittedly, 2b is the solution I'm coding up right now, but it would be nice if Django enforced, or at least hinted at, a solution to this problem.

I've also (FWIW) tested pagination on Google, Yahoo, and Bing, and all three limit their results (Bing: page 20, Google: page 100, and Yahoo: page 100).

Change History (15)

comment:1 Changed 6 years ago by PaulM

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Triage Stage changed from Unreviewed to Accepted

I agree that this should be documented as a potential gotcha.

LIMIT/OFFSET is slow for the reasons you mentioned.

There are a number of ways around this problem. One approach is to index a table and use WHERE/LIMIT rather than LIMIT/OFFSET. I'm not sure where in Django this sort of optimization belongs. On the one hand, it's a problem shared by all queryset slicing operations that produce a LIMIT/OFFSET statement. On the other, it requires specific knowledge about the contents of the table (and an index), so it might be easier to pass the clue in to a custom bit here in Pagination. Or possibly I'm completely off base here.

More discussion of the same issue here:
http://stackoverflow.com/questions/1243952/how-can-i-speed-up-a-mysql-query-with-a-large-offset-in-the-limit-clause

and slides 12 to 14 dicuss the WHERE/LIMIT approach here:
http://www.slideshare.net/Eweaver/efficient-pagination-using-mysql

comment:2 Changed 5 years ago by julien

  • Component changed from Contrib apps to Core framework

comment:3 Changed 5 years ago by julien

  • Severity set to Normal
  • Type set to Cleanup/optimization

comment:4 Changed 4 years ago by aaugustin

  • UI/UX unset

Change UI/UX from NULL to False.

comment:5 Changed 4 years ago by aaugustin

  • Easy pickings unset

Change Easy pickings from NULL to False.

comment:6 Changed 12 months ago by timgraham

  • Component changed from Core (Other) to Documentation

comment:7 Changed 4 months ago by winsmith

  • Cc winsmith@… added
  • Owner changed from nobody to winsmith
  • Status changed from new to assigned

comment:9 Changed 4 months ago by Erik Romijn <eromijn@…>

  • Resolution set to fixed
  • Status changed from assigned to closed

In 55c843f2:

Fixed #14131 -- Added note to docs about Pagination and large Querysets

comment:10 Changed 4 months ago by Erik Romijn <eromijn@…>

In be1ac00:

[1.9.x] Fixed #14131 -- Added note to docs about Pagination and large Querysets

Backport of 55c843f2ef702b4ebcd024920d4193bdf4c3fe07 from master.

comment:11 Changed 4 months ago by timgraham

  • Has patch unset
  • Resolution fixed deleted
  • Status changed from closed to new

Reopening per my comment on the PR that was missed before the merge: "I think you should at least describe the reason so that someone knowledgeable about their database can ascertain whether or not the warning applies to them. "

comment:12 Changed 4 months ago by winsmith

  • Has patch set

Thanks for your feedback, I added some more information.

New PR: https://github.com/django/django/pull/6422

comment:13 Changed 4 months ago by Tim Graham <timograham@…>

In eed658d7:

Refs #14131 -- Documented why paginating large QuerySets may be slow.

comment:14 Changed 4 months ago by Tim Graham <timograham@…>

In e043b85b:

[1.9.x] Refs #14131 -- Documented why paginating large QuerySets may be slow.

Backport of eed658d7c4dda695976c6845346b166960957eba from master

comment:15 Changed 4 months ago by timgraham

  • Resolution set to fixed
  • Status changed from new to closed
Note: See TracTickets for help on using tickets.
Back to Top