#30533 closed Bug (wontfix)
Delete cascade on large tables can cause process termination on PostgreSQL.
Reported by: | Thomas Riccardi | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | dev |
Severity: | Normal | Keywords: | queryset delete cascade batch bulk_batch_size postgresql crash |
Cc: | | Triage Stage: | Unreviewed |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
Scenario
- Model A has a foreign key to model B with on_delete=CASCADE (see the model sketch below)
- many B objects
- execute
B.objects.all().delete()
Expected result:
- deletion works
Actual result:
- when there are many B objects to delete (in our case it was 14M rows), the database process gets killed before completion:
2019-05-28 19:17:42.237 CEST [1]: [2-1] db=,user= LOG: server process (PID 12911) was terminated by signal 9: Killed
2019-05-28 19:17:42.238 CEST [1]: [3-1] db=,user= DETAIL: Failed process was running: DELETE FROM "a" WHERE "a"."b_id" IN (17271, 17272, 17273, 17274, 17275, <truncated enormous list>
- with a smaller database it worked (2M rows)
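For reference, a minimal model layout matching this scenario could look like the following sketch (model names follow the report; the field name is an assumption):

```python
from django.db import models


class B(models.Model):
    name = models.CharField(max_length=100)  # hypothetical field


class A(models.Model):
    # Deleting a B row cascades to every A row that references it.
    b = models.ForeignKey(B, on_delete=models.CASCADE)
```

With ~14M B rows to delete, the cascade to A is what produces the enormous DELETE ... IN (...) query shown in the log above.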
Analysis
It seems the related A objects get deleted in a single query, with an unbounded IN (...) list of B ids.
This pattern already led to an issue with the sqlite backend (#link0: sqlite supports only 999 parameters per query). It was fixed in django 1.8 by batching the query (#link1, #link2), with a batch size specified per backend (#link3, #link4, #link5).
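Roughly, that batching works as follows (a paraphrased sketch of the code behind #link1/#link2 and the per-backend hook in #link3-#link5, not a verbatim copy of Django's source):

```python
# Paraphrased sketch, not verbatim Django code.
def get_del_batches(objs, field, connection):
    # Ask the backend how many objects fit into a single query. SQLite and
    # Oracle return real limits; the base implementation used by PostgreSQL
    # simply returns len(objs), i.e. "no limit", hence the single huge query.
    batch_size = max(connection.ops.bulk_batch_size([field.name], objs), 1)
    return [objs[i:i + batch_size] for i in range(0, len(objs), batch_size)]
```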
Workaround
As a temporary workaround we monkey patched the connection instance's own bulk_batch_size and limited it to 64k:
```python
import types


def monkey_patch_connection_bulk_batch_size(connection):
    def limited_bulk_batch_size(self, fields, objs):
        """
        PostgreSQL can crash with too many parameters in a query
        e.g. 'DELETE FROM x WHERE x.y IN (...large list...)'
        => limit to 64k
        """
        return 2**16

    connection.ops.bulk_batch_size = types.MethodType(
        limited_bulk_batch_size, connection.ops
    )
```
It worked great in our case: we used it in a migration.
Workaround limitations:
- no idea where to monkey patch for global usage
- no idea how to choose the bulk size value
- didn't handle the more complex computation using the fields and objs parameters (a capped variant is sketched after this list)
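A variant of the patch that keeps the backend's own fields/objs computation and only caps its result could look like this (still just a sketch, with a name of our own choosing; the 64k cap remains an arbitrary choice):

```python
import types


def monkey_patch_limited_bulk_batch_size(connection, max_batch_size=2**16):
    # Keep the original per-backend computation and only cap its result,
    # instead of discarding the fields/objs parameters entirely.
    original_bulk_batch_size = connection.ops.bulk_batch_size

    def limited_bulk_batch_size(self, fields, objs):
        return min(original_bulk_batch_size(fields, objs), max_batch_size)

    connection.ops.bulk_batch_size = types.MethodType(
        limited_bulk_batch_size, connection.ops
    )
```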
(Related remark: this concerns the deletion of the related objects; for the main objects' deletion django already uses batch deletion, but with a much smaller batch size: GET_ITERATOR_CHUNK_SIZE = 100; with some latency to the database (around 0.2ms per query) that is a really small value.)
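For comparison, the main-object deletion mentioned in the remark above batches on that fixed constant rather than on the per-backend hook; roughly (again a paraphrased sketch, not verbatim Django code):

```python
# Paraphrased sketch, not verbatim Django code.
GET_ITERATOR_CHUNK_SIZE = 100

def delete_batch(pk_list, run_delete_query):
    # run_delete_query stands in for executing
    # DELETE ... WHERE pk IN (<at most 100 values>).
    deleted = 0
    for offset in range(0, len(pk_list), GET_ITERATOR_CHUNK_SIZE):
        deleted += run_delete_query(pk_list[offset:offset + GET_ITERATOR_CHUNK_SIZE])
    return deleted
```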
Context
- postgresql 9.6 (GCP Cloud SQL)
- django 2.0.13 (but related code has not changed in current master it seems)
---
#link0: https://code.djangoproject.com/ticket/16426
#link1: https://github.com/django/django/blob/f6075fb333bae29ee213b050e91eaadef75496dd/django/db/models/deletion.py#L224-L225
#link2: https://github.com/django/django/blob/f6075fb333bae29ee213b050e91eaadef75496dd/django/db/models/deletion.py#L167
#link3: https://github.com/django/django/blob/f6075fb333bae29ee213b050e91eaadef75496dd/django/db/backends/sqlite3/operations.py#L27-L40
#link4: https://github.com/django/django/blob/f6075fb333bae29ee213b050e91eaadef75496dd/django/db/backends/oracle/operations.py#L592-L596
#link5: https://github.com/django/django/blob/f6075fb333bae29ee213b050e91eaadef75496dd/django/db/backends/base/operations.py#L65-L71
Change History (4)
follow-up: 2 comment:1 by felixxm, 5 years ago
Version: | 2.0 → master |
---|
Thanks for the report. Do you know any source that describes the max number of parameters for PostgreSQL? (e.g. 64k is specified in db docs on Oracle).
comment:2 by , 5 years ago
Replying to felixxm:
> Thanks for the report. Do you know any source that describes the max number of parameters for PostgreSQL? (e.g. 64k is specified in db docs on Oracle).
I think there are no hardcoded limits in PostgreSQL; what I think probably happened is a large memory allocation that resulted in an OOM kill of the connection process on the db side: with more RAM the query could probably work.
The query pattern in this scenario is:
- get all B objects to delete: a SELECT
- delete all related A objects: the unbounded DELETE ... IN (...) that crashes
- delete all B objects: DELETE ... IN (...) in batches of 100
It would make sense to use the same size for the two DELETE ... IN (...) queries, but 100 is really small for DELETEs (the result size is constant, not proportional to the number of deleted objects). Maybe add another, much higher constant for DELETE, and keep GET_ITERATOR_CHUNK_SIZE for fetches (where the result size is proportional to the number of objects); this could be part of the fix for this ticket?
comment:3 by , 5 years ago
Resolution: | → wontfix |
---|---|
Status: | new → closed |
Summary: | Delete cascade can break postgresql database: too large 'DELETE FROM a WHERE a.b_id IN (...)' → Delete cascade can break postgresql database: too large 'DELETE FROM a WHERE a.b_id IN (...)'. |
If we don't have any restrictions in PostgreSQL I wouldn't classify this as a bug in Django. In SQLite and Oracle we limit the number of query parameters, or even the number of parameters in an IN (...) list, because we have to. These are database limitations, which is not the case on PostgreSQL. You have at least two ways to handle this:
- delete() in chunks (1-2M) using the ORM (see the sketch below), or
- TRUNCATE ... CASCADE the table manually.
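A minimal sketch of the first option, deleting in bounded chunks through the ORM (the helper name and chunk size are assumptions, not something Django provides):

```python
def delete_in_chunks(queryset, chunk_size=1_000_000):
    # Repeatedly grab a bounded slice of primary keys and delete only those
    # rows, so each cascade deals with at most chunk_size parent objects.
    while True:
        pks = list(queryset.values_list("pk", flat=True)[:chunk_size])
        if not pks:
            break
        queryset.model.objects.filter(pk__in=pks).delete()


# Usage, following the report's scenario:
# delete_in_chunks(B.objects.all())
```

Each iteration still cascades to the related A objects, but the IN (...) lists stay bounded by the chunk size.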
comment:4 by , 5 years ago
Summary: | Delete cascade can break postgresql database: too large 'DELETE FROM a WHERE a.b_id IN (...)'. → Delete cascade on large tables can cause process termination on PostgreSQL. |
---|