Opened 3 years ago
Closed 3 years ago
#33759 closed Cleanup/optimization (fixed)
Using subquery to filter a model delete generates a sub-optimal SQL
| Reported by: | Christofer Bertonha | Owned by: | Aman Pandey |
|---|---|---|---|
| Component: | Database layer (models, ORM) | Version: | 4.0 |
| Severity: | Normal | Keywords: | |
| Cc: | Simon Charette | Triage Stage: | Ready for checkin |
| Has patch: | yes | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description (last modified by )
Taking in consideration the app of tutorial for demonstrate the issue.
using postgresql as db.
Executing this following code:
subquery = Comment.objects.filter(created_at__lt=datetime(2022, 6, 1))[:1000] Comment.objects.filter(id__in=subquery).delete()
Generates the following SQL:
DELETE FROM "comment" WHERE "comment"."id" IN (SELECT V0."id" FROM "comment" V0 WHERE V0."id" IN (SELECT U0."id" FROM "comment" U0 WHERE U0."created_at" < '2022-06-01T00:00:00+00:00'::timestamptz LIMIT 1000))
The inner select V0 is completely unnecessary
If a simple select with the same subquery is executed. It generates expected SQL query.
subquery = Comment.objects.filter(created_at__lt=datetime(2022, 6, 1))[:1000] comments = list( Comment.objects..filter(id__in=subquery) )
SELECT * FROM "comment" WHERE "comment"."id" IN (SELECT U0."id" FROM "comment" U0 WHERE U0."created_at" < '2022-06-01T00:00:00+00:00'::timestamptz LIMIT 1000),
Change History (11)
comment:1 by , 3 years ago
| Description: | modified (diff) |
|---|
comment:2 by , 3 years ago
| Description: | modified (diff) |
|---|
comment:3 by , 3 years ago
| Summary: | Using subquery to filter a model delete creates sub-optimal SQL → Using subquery to filter a model delete generates a sub-optimal SQL |
|---|
comment:4 by , 3 years ago
| Cc: | added |
|---|---|
| Type: | Bug → Cleanup/optimization |
comment:5 by , 3 years ago
| Triage Stage: | Unreviewed → Accepted |
|---|
We could add a new feature flag to add an extra subquery only on MySQL:
-
django/db/backends/base/features.py
diff --git a/django/db/backends/base/features.py b/django/db/backends/base/features.py index c54d30cf73..e8737aadc2 100644
a b class BaseDatabaseFeatures: 12 12 allows_group_by_selected_pks = False 13 13 empty_fetchmany_value = [] 14 14 update_can_self_select = True 15 # Does the backend support self-reference subqueries in the DELETE 16 # statement? 17 delete_can_self_reference_subquery = True 15 18 16 19 # Does the backend distinguish between '' and None? 17 20 interprets_empty_strings_as_nulls = False -
django/db/backends/mysql/features.py
diff --git a/django/db/backends/mysql/features.py b/django/db/backends/mysql/features.py index 3ea3deeae3..848d891a84 100644
a b class DatabaseFeatures(BaseDatabaseFeatures): 25 25 supports_slicing_ordering_in_compound = True 26 26 supports_index_on_text_field = False 27 27 supports_update_conflicts = True 28 delete_can_self_reference_subquery = False 28 29 create_test_procedure_without_params_sql = """ 29 30 CREATE PROCEDURE test_procedure () 30 31 BEGIN -
django/db/models/sql/compiler.py
diff --git a/django/db/models/sql/compiler.py b/django/db/models/sql/compiler.py index 9c7bd8ea1a..e57a43ac47 100644
a b class SQLDeleteCompiler(SQLCompiler): 1721 1721 Create the SQL for this query. Return the SQL string and list of 1722 1722 parameters. 1723 1723 """ 1724 if self.single_alias and not self.contains_self_reference_subquery: 1724 if self.single_alias and ( 1725 self.connection.features.delete_can_self_reference_subquery or 1726 not self.contains_self_reference_subquery 1727 ): 1725 1728 return self._as_sql(self.query) 1726 1729 innerq = self.query.clone() 1727 1730 innerq.__class__ = Query
Another issue is that QuerySet.delete() from the ticket description:
subquery = Comment.objects.filter(created_at__lt=datetime(2022, 6, 1))[:1000]
Comment.objects.filter(id__in=subquery).delete()
...
File "site-packages/MySQLdb/connections.py", line 254, in query
_mysql.connection.query(self, query)
django.db.utils.NotSupportedError: (1235, "This version of MySQL doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery'")
crashes on MySQL even with 4074f38e1dcc93b859bbbfd6abd8441c3bca36b3 (I'm not sure it's worth fixing.)
comment:6 by , 3 years ago
Mariusz I think that it makes sense to only enable the query wrapping when necessary through a feature flag as you've suggested.
As for the MySQL limited subquery issue that's a different can of worms. The unnecessary query wrapping should be testable on backends with the feature disabled by counting the number of SELECT.
Thanks for the ping and sorry for my late answer!
comment:9 by , 3 years ago
| Owner: | set to |
|---|---|
| Status: | new → assigned |
comment:10 by , 3 years ago
| Has patch: | set |
|---|---|
| Needs tests: | set |
Draft PR up https://github.com/django/django/pull/16809
comment:12 by , 3 years ago
| Triage Stage: | Accepted → Ready for checkin |
|---|
This behavior was changed in 4074f38e1dcc93b859bbbfd6abd8441c3bca36b3 to prevent:
on MySQL. I have no idea for an alternative fix.