Opened 8 years ago
Closed 8 years ago
#27193 closed New feature (fixed)
ORDER BY clause not included in subqueries using select_for_update()
Reported by: | sqwishy | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 1.9 |
Severity: | Normal | Keywords: | |
Cc: | Shai Berger | Triage Stage: | Accepted |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
This was tested using Django 1.9.5. I believe the code below fails to generate the correct SQL. I explicitly set an ordering in the subquery but it is not expressed in the generated SQL. This alters the behaviour of the SELECT statement when the FOR UPDATE clause is used. My use case is to lock rows in a particular order with the goal of preventing deadlocks.
Example code:
with transaction.atomic(): print(User.objects.filter(id__in=User.objects.order_by('id').select_for_update()))
PostgreSQL logs:
2016-09-08 02:59:43 ACWST [30398-20] testsystem@testsystem LOG: statement: BEGIN 2016-09-08 02:59:43 ACWST [30398-21] testsystem@testsystem LOG: statement: SELECT "testapp_user"."id", "testapp_user"."password", ... FROM "testapp_user" WHERE "testapp_user"."id" IN (SELECT "testapp_user"."id" FROM "testapp_user" FOR UPDATE) LIMIT 21 2016-09-08 02:59:43 ACWST [30398-22] testsystem@testsystem LOG: statement: COMMIT
It looks like the order_by is being cleared on the queryset object here https://github.com/django/django/blob/master/django/db/models/sql/compiler.py#L476
I can verify this as it looks like ORDER BY does show up when I include distinct() in the subquery - which isn't a suitable workaround as DISTINCT is not compatible with FOR UPDATE - and by slicing the subquery as to set an offset or a limit.
For now, an acceptable workaround seems to make the SELECT FOR UPDATE in a separate query instead of a subquery. That at least seems to accomplish the goal I am trying to achieve.
Change History (6)
comment:1 by , 8 years ago
Resolution: | → needsinfo |
---|---|
Status: | new → closed |
comment:2 by , 8 years ago
Resolution: | needsinfo |
---|---|
Status: | closed → new |
The documentation at the following link may suggest that rows are locked while a query is running rather than all at once atomically; see particularly the second paragraph under "13.2.1. Read Committed Isolation Level" https://www.postgresql.org/docs/current/static/transaction-iso.html#XACT-READ-COMMITTED
This is a bit of code where adding the order by seems to prevent deadlocks.
from django.db import transaction from django.contrib.auth.models import Group from django.db.models import F def setup(): for i in range(20): Group.objects.get_or_create(name='test_27193_%i' % i) def test(): for _ in range(20): with transaction.atomic(): Group.objects.filter( id__in=Group.objects.filter(name__startswith='test_27193').select_for_update() ).update(name=F('name')) #Group.objects.filter( # id__in=Group.objects.filter(name__startswith='test_27193').select_for_update().order_by('id')[1:] #).update(name=F('name')) setup() test()
I have that saved to a file aptly named script.py
and run it 20 times at once (I'm sure there is a less ugly way to do this):
seq 20 | parallel --ungroup -j 20 "echo {}; ./manage.py shell --plain < script.py"
Doing this with the script as shown, exceptions are raised in python about deadlocks and PostgreSQL logs about it. If you comment out that first query and uncomment the second one everything runs fine. The slice there is just to convince django not to reset the order-by value in the queryset.
comment:3 by , 8 years ago
Well, I don't quite see it in the doc you linked, but it's mentioned in this thread and your code example is convincing. I stand corrected.
comment:4 by , 8 years ago
Cc: | added |
---|---|
Component: | Uncategorized → Database layer (models, ORM) |
Triage Stage: | Unreviewed → Accepted |
Type: | Uncategorized → New feature |
That said, 1.9 and even 1.10 are past the phase where this can be put in them. Accepting tentatively, assuming it can be reproduced in master (I don't have time to check right now)
Hi,
You said:
Can you please point to some documentation, or example code with execution, that shows this idea may work? As far as I know, the locking is atomic -- it is done for all rows matching the criteria together, and thus the order-by clause on the inner select is indeed meaningless and can be dropped.
If and when you show this, please re-open the ticket.