In the admin, results can be omitted due to pagination and inadequate ordering clauses
Reported by: lukeplant | Owned by: julien
Has patch: yes | Needs documentation: no
Needs tests: no | Patch needs improvement: no
In the admin, pagination is achieved using LIMIT/OFFSET. However, if the ordering specified in the SQL does not totally define an order of rows, LIMIT/OFFSET is not deterministic with respect to exactly which rows are returned.
For example, using django.contrib.auth, suppose you have hundreds of users with is_staff == False. Then User.objects.order_by('is_staff')[0:20] must return users with is_staff == False (since these sort first), but it can return any of them. If you then ask for User.objects.order_by('is_staff')[20:40], the database is perfectly free to give you the same 20 as the first time, since by the same token it can return any of them.
This means that if you are in the admin, looking at the User list and you have ordered by is_staff, paging through the results is not guaranteed to show you all rows - it could duplicate some and omit others.
I have observed this behaviour in the wild with Postgres. The client in question was extremely confused as to why, when sorting by a certain boolean field, some results disappeared, and I was stumped for a while. I also confirmed from a Django shell that a series of consecutive 'slices' of a QuerySet can return some rows twice and some not at all.
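The effect can be reproduced without a database. The following is a minimal sketch in plain Python, not the admin's code: fetch_page and paginate_pks are hypothetical helpers, and the tie-breaking rule (flip the order of tied rows on every other page) is an assumption chosen only because it is a legal but unhelpful way for a backend to resolve ties.

```python
def fetch_page(rows, order_key, offset, limit):
    """Simulate ORDER BY ... LIMIT/OFFSET on a backend that breaks ties arbitrarily."""
    # Assumed tie-breaking rule: flip the order of tied rows on every other page.
    # Any order of tied rows is a valid answer to the query.
    reverse_ties = (offset // limit) % 2 == 1
    presorted = sorted(rows, key=lambda r: r["pk"], reverse=reverse_ties)
    # sorted() is stable, so rows that tie on order_key keep the pre-sort order.
    ordered = sorted(presorted, key=order_key)
    return ordered[offset:offset + limit]


def paginate_pks(rows, order_key, page_size):
    """Walk every page in turn and collect the primary keys that were returned."""
    seen = []
    for offset in range(0, len(rows), page_size):
        seen.extend(r["pk"] for r in fetch_page(rows, order_key, offset, page_size))
    return seen


# Forty users, none of them staff, so every row ties on is_staff.
users = [{"pk": pk, "is_staff": False} for pk in range(1, 41)]

# Ordering by is_staff alone: the second page repeats the first page's rows.
ambiguous = paginate_pks(users, lambda r: r["is_staff"], 20)

# Ordering by (is_staff, pk): every row appears exactly once.
total = paginate_pks(users, lambda r: (r["is_staff"], r["pk"]), 20)

print(sorted(set(ambiguous)))  # only pks 1..20 ever appear
print(total == list(range(1, 41)))
```

With this tie-breaking rule, paging by is_staff alone shows users 1 to 20 twice and never shows users 21 to 40, while adding pk to the sort key makes the pages partition the full set.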
We can argue that the database is being completely correct in what it is returning, because the question asked is a silly question - you can't deterministically take items X to Y of a set that doesn't have a strict total order. But I don't think we can argue that this is not a bug in the admin - the admin should not be asking the database silly questions.
This problem applies to any situation where the ordering does not totally define the order of results returned. It's especially serious for booleans, because at least half your dataset will share a value.
One possible solution would be to add an ordering by 'pk' to whatever is explicitly chosen - so you would have User.objects.order_by('is_staff', 'pk'). I don't know what performance impact this would have, however.
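A rough sketch of that idea, assuming the ordering is available as a list of field-name strings: with_pk_tiebreaker is a hypothetical helper, not an existing Django function and not necessarily how the admin should implement the fix. It only recognises the 'pk' alias, so a model whose ordering already names its primary key field explicitly (for example 'id') would still get 'pk' appended.

```python
def with_pk_tiebreaker(ordering):
    """Return a copy of `ordering` with 'pk' appended unless 'pk' is already present."""
    ordering = list(ordering)
    # A leading '-' means descending order; strip it before comparing names.
    if any(field.lstrip("-") == "pk" for field in ordering):
        return ordering
    return ordering + ["pk"]


print(with_pk_tiebreaker(["is_staff"]))         # ['is_staff', 'pk']
print(with_pk_tiebreaker(["is_staff", "-pk"]))  # ['is_staff', '-pk']

# Intended use with a queryset (requires a configured Django project):
#   User.objects.order_by(*with_pk_tiebreaker(["is_staff"]))
```

Because the primary key is unique, appending it makes the ordering total, so consecutive LIMIT/OFFSET slices can no longer overlap or skip rows. The performance cost of the extra sort column is not measured here.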
Change History (26)
comment:6 Changed 3 years ago by julien
- Has patch set
comment:17 Changed 3 years ago by akaariai
- Cc anssi.kaariainen@… added