Opened 8 years ago

Closed 4 years ago

#7760 closed Cleanup/optimization (wontfix)

Queries on large primary tables with limit/offset clauses are slow

Reported by: henrybaxter Owned by: nobody
Component: Database layer (models, ORM) Version: master
Severity: Normal Keywords: database, admin, slow, query
Cc: Triage Stage: Someday/Maybe
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no


When django's db layer needs to select_related() on a large initial table with a number of joins and it has a limit or offset clause, it generates slow queries.

SELECT * FROM big_table
INNER JOIN t1 ON big_table.t1_id
INNER JOIN tn ON big_table.tn_id

This query will join the whole table (even if it has a 1 million records) and THEN do the limit. This is WAY faster:


This type of query change does not work when there is a WHERE clause that references a product of the JOIN though. If the WHERE clause only references the primary table, then it still works great.

I have submitted a patch to help demonstrate the problem - it only works when there are no WHERE clauses at all (which is almost pointless), but it works for my particular purpose (pagination in the admin interface with no filters or searching)

I would love to hear from another developer how to integrate this functionality (or if it can actually be done without changes to the django codebase)

Attachments (1)

subquery_patch (2.9 KB) - added by henrybaxter 8 years ago.
Subquery LIMIT and OFFSET

Download all attachments as: .zip

Change History (9)

Changed 8 years ago by henrybaxter

Attachment: subquery_patch added

Subquery LIMIT and OFFSET

comment:1 Changed 8 years ago by Malcolm Tredinnick

Needs documentation: unset
Needs tests: unset
Patch needs improvement: unset
Triage Stage: UnreviewedDesign decision needed

At the moment, I suspect this is just not worth it, since there are so many cases where it's not going to be a useful transform and it's a whole separate code path to have to maintain. If it's a real performance issue for a particular use-case, then the correct solution is "don't do that". Doing limits and offsets is already a performance hit. Doing a slightly better related filter and/or using values() is going to be the solution in most cases.

Leaving open for a bit because I don't want to make a rush judgement, but I'm uninspired about the proposal at the moment.

comment:2 Changed 8 years ago by henrybaxter

Thank you for looking at it, I really appreciate it. I actually think I agree though, it's just not that useful. It requires careful manipulation for ORDER BY as well as WHERE as I mentioned.

Sorry for the ill-conceived idea :)

By the way the 'use case' was the Django admin. I still think somehow Django admin should handle this situation better:

class Thing(models.Model):

me = models.CharField(max_length=100)
foreign = models.ForeignKey(Foreign)
foreign2 = models.ForeignKey(Foreign2)

class Admin:

list_display = ('me', 'foreign', 'foreign2')

where 'Thing' has > 10000 entries, foreign tables need not be too large and you are looking at the admin changelist for it

On a decent machine with no load the resulting query will take several seconds to process

In face I have no idea why MySQL does not optimize the query, since this is really the database's territory...

Thanks again for your response!

comment:3 Changed 8 years ago by Piotr Lewandowski <django@…>

Component: UncategorizedDatabase wrapper

comment:4 Changed 6 years ago by Malcolm Tredinnick

Triage Stage: Design decision neededSomeday/Maybe

My opinion hasn't changed much over the intervening years on this one, although if it was magically easily possible, I would take the patch. Let's call it someday/maybe to give people something to look it. Should somebody come along with a great patch to do this that didn't make the normal SQL construction path slower, there would be no reason to keep it out.

comment:5 Changed 5 years ago by Luke Plant

Severity: Normal
Type: Cleanup/optimization

comment:6 Changed 5 years ago by Aymeric Augustin

UI/UX: unset

Change UI/UX from NULL to False.

comment:7 Changed 5 years ago by Aymeric Augustin

Easy pickings: unset

Change Easy pickings from NULL to False.

comment:8 Changed 4 years ago by Anssi Kääriäinen

Resolution: wontfix
Status: newclosed

I think this can be closed. First, to me it seems a good DB would be able to optimize such queries, second, I am afraid supporting this will be hard. I wonder if fixing this ticket would cause performance problems for other queries.

If we want to support this kind of query, then I think a separate API for doing subqueries would be the way to go.

Note: See TracTickets for help on using tickets.
Back to Top