Opened 2 years ago
Closed 19 months ago
#33759 closed Cleanup/optimization (fixed)
Using subquery to filter a model delete generates a sub-optimal SQL
Reported by: | Christofer Bertonha | Owned by: | Aman Pandey |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 4.0 |
Severity: | Normal | Keywords: | |
Cc: | Simon Charette | Triage Stage: | Ready for checkin |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description (last modified by )
Taking in consideration the app of tutorial for demonstrate the issue.
using postgresql as db.
Executing this following code:
subquery = Comment.objects.filter(created_at__lt=datetime(2022, 6, 1))[:1000] Comment.objects.filter(id__in=subquery).delete()
Generates the following SQL:
DELETE FROM "comment" WHERE "comment"."id" IN (SELECT V0."id" FROM "comment" V0 WHERE V0."id" IN (SELECT U0."id" FROM "comment" U0 WHERE U0."created_at" < '2022-06-01T00:00:00+00:00'::timestamptz LIMIT 1000))
The inner select V0 is completely unnecessary
If a simple select with the same subquery is executed. It generates expected SQL query.
subquery = Comment.objects.filter(created_at__lt=datetime(2022, 6, 1))[:1000] comments = list( Comment.objects..filter(id__in=subquery) )
SELECT * FROM "comment" WHERE "comment"."id" IN (SELECT U0."id" FROM "comment" U0 WHERE U0."created_at" < '2022-06-01T00:00:00+00:00'::timestamptz LIMIT 1000),
Change History (11)
comment:1 by , 2 years ago
Description: | modified (diff) |
---|
comment:2 by , 2 years ago
Description: | modified (diff) |
---|
comment:3 by , 2 years ago
Summary: | Using subquery to filter a model delete creates sub-optimal SQL → Using subquery to filter a model delete generates a sub-optimal SQL |
---|
comment:4 by , 2 years ago
Cc: | added |
---|---|
Type: | Bug → Cleanup/optimization |
comment:5 by , 2 years ago
Triage Stage: | Unreviewed → Accepted |
---|
We could add a new feature flag to add an extra subquery only on MySQL:
-
django/db/backends/base/features.py
diff --git a/django/db/backends/base/features.py b/django/db/backends/base/features.py index c54d30cf73..e8737aadc2 100644
a b class BaseDatabaseFeatures: 12 12 allows_group_by_selected_pks = False 13 13 empty_fetchmany_value = [] 14 14 update_can_self_select = True 15 # Does the backend support self-reference subqueries in the DELETE 16 # statement? 17 delete_can_self_reference_subquery = True 15 18 16 19 # Does the backend distinguish between '' and None? 17 20 interprets_empty_strings_as_nulls = False -
django/db/backends/mysql/features.py
diff --git a/django/db/backends/mysql/features.py b/django/db/backends/mysql/features.py index 3ea3deeae3..848d891a84 100644
a b class DatabaseFeatures(BaseDatabaseFeatures): 25 25 supports_slicing_ordering_in_compound = True 26 26 supports_index_on_text_field = False 27 27 supports_update_conflicts = True 28 delete_can_self_reference_subquery = False 28 29 create_test_procedure_without_params_sql = """ 29 30 CREATE PROCEDURE test_procedure () 30 31 BEGIN -
django/db/models/sql/compiler.py
diff --git a/django/db/models/sql/compiler.py b/django/db/models/sql/compiler.py index 9c7bd8ea1a..e57a43ac47 100644
a b class SQLDeleteCompiler(SQLCompiler): 1721 1721 Create the SQL for this query. Return the SQL string and list of 1722 1722 parameters. 1723 1723 """ 1724 if self.single_alias and not self.contains_self_reference_subquery: 1724 if self.single_alias and ( 1725 self.connection.features.delete_can_self_reference_subquery or 1726 not self.contains_self_reference_subquery 1727 ): 1725 1728 return self._as_sql(self.query) 1726 1729 innerq = self.query.clone() 1727 1730 innerq.__class__ = Query
Another issue is that QuerySet.delete()
from the ticket description:
subquery = Comment.objects.filter(created_at__lt=datetime(2022, 6, 1))[:1000] Comment.objects.filter(id__in=subquery).delete() ... File "site-packages/MySQLdb/connections.py", line 254, in query _mysql.connection.query(self, query) django.db.utils.NotSupportedError: (1235, "This version of MySQL doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery'")
crashes on MySQL even with 4074f38e1dcc93b859bbbfd6abd8441c3bca36b3.
comment:6 by , 2 years ago
Mariusz I think that it makes sense to only enable the query wrapping when necessary through a feature flag as you've suggested.
As for the MySQL limited subquery issue that's a different can of worms. The unnecessary query wrapping should be testable on backends with the feature disabled by counting the number of SELECT
.
Thanks for the ping and sorry for my late answer!
comment:9 by , 19 months ago
Owner: | set to |
---|---|
Status: | new → assigned |
comment:10 by , 19 months ago
Has patch: | set |
---|---|
Needs tests: | set |
Draft PR up https://github.com/django/django/pull/16809
comment:12 by , 19 months ago
Triage Stage: | Accepted → Ready for checkin |
---|
This behavior was changed in 4074f38e1dcc93b859bbbfd6abd8441c3bca36b3 to prevent:
on MySQL. I have no idea for an alternative fix.