document that order_by('?') is a huge performance issue
|Reported by:||GomoX <gomo@…>||Owned by:||mboersma|
|Component:||Database layer (models, ORM)||Version:|
|Cc:||gomo@…||Triage Stage:||Ready for checkin|
|Has patch:||yes||Needs documentation:||no|
|Needs tests:||no||Patch needs improvement:||no|
order_by('?') generates an SQL query of the "ORDER BY RAND() LIMIT" type, which performs terribly: the database has to compute a random value for every row in the table before it can return even one.
As things stand, I think at the very least a warning should be added to http://www.djangoproject.com/documentation/db-api/#order-by-fields.
That page happily states that you can use the method for obtaining a random row, but in a real scenario that is a very bad idea, and should be avoided at all costs.
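To make the complaint concrete, here is a minimal sketch of the query shape in question, run against an in-memory SQLite database with the standard sqlite3 module rather than through Django. The item table and the helper names are hypothetical, made up for illustration; SQLite spells the function RANDOM() where MySQL uses RAND(). Printing the query plan shows how the backend intends to execute the statement; the exact wording of the plan varies between SQLite versions.

```python
import sqlite3

# The same statement is used for the real query and for its plan.
RANDOM_SORT_SQL = "SELECT id, name FROM item ORDER BY RANDOM() LIMIT 1"


def random_row_by_sort(conn):
    """Pick one row with an ORDER BY RANDOM() LIMIT 1 query."""
    return conn.execute(RANDOM_SORT_SQL).fetchone()


def query_plan(conn):
    """Return the human-readable steps of SQLite's plan for that query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + RANDOM_SORT_SQL).fetchall()
    # The description is the last column in both old and new SQLite versions.
    return [row[-1] for row in rows]


# Hypothetical table with 1000 rows, ids 1..1000 and names item-0..item-999.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO item (name) VALUES (?)",
    [(f"item-{i}",) for i in range(1000)],
)

print(random_row_by_sort(conn))
for step in query_plan(conn):
    print(step)
```

The work done for this query grows with the size of the table, not with the number of rows requested.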
As a more useful approach, an option could perhaps be added to a model's Meta class for tables you plan to grab random rows from. It could set up the tables, columns, or constraints needed to extract a random row without such a big performance hit. If you use order_by('?') on a model with this Meta setting, the enhancement would be transparent. Whether and how this improvement could be implemented is open for discussion, and is probably database dependent. The page I linked above has some discussion on the topic.
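For comparison, one commonly used workaround that needs no schema changes is to count the rows and fetch a single row at a random offset. This is not the Meta-based enhancement proposed above, only a rough sketch of a different technique, again written against plain sqlite3 with a hypothetical item table and a hypothetical helper name. It avoids sorting on a random value, although many databases still have to step over the skipped rows to reach the offset, so it is not free on large tables either.

```python
import random
import sqlite3


def random_row_by_offset(conn, rng=random):
    """Pick one row via COUNT(*) plus a random OFFSET (hypothetical helper)."""
    (count,) = conn.execute("SELECT COUNT(*) FROM item").fetchone()
    if count == 0:
        return None  # nothing to pick from
    offset = rng.randrange(count)  # uniform integer in [0, count)
    # Order by the primary key so the offset refers to a stable ordering.
    return conn.execute(
        "SELECT id, name FROM item ORDER BY id LIMIT 1 OFFSET ?",
        (offset,),
    ).fetchone()


# Hypothetical table with 1000 rows, ids 1..1000 and names item-0..item-999.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO item (name) VALUES (?)",
    [(f"item-{i}",) for i in range(1000)],
)

print(random_row_by_offset(conn))
```

In Django terms the rough equivalent would be a count() call followed by indexing the queryset at a random position. The two queries are not atomic, so a row deleted in between can leave the offset pointing past the end of the table.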
Change History (4)
comment:1 Changed 7 years ago by Simon G. <dev@…>
- Needs documentation unset
- Needs tests unset
- Patch needs improvement unset
- Summary changed from order_by('?') is a huge performance issue to document that order_by('?') is a huge performance issue
- Triage Stage changed from Unreviewed to Accepted
Changed 7 years ago by mboersma
comment:2 Changed 7 years ago by mboersma
- Has patch set
- Owner changed from nobody to mboersma
- Status changed from new to assigned
- Triage Stage changed from Accepted to Ready for checkin