Opened 5 years ago
Last modified 14 months ago
#31202 assigned Cleanup/optimization
Bulk update suffers from poor performance with large numbers of models and columns — at Initial Version
Reported by: | Tom Forbes | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | dev |
Severity: | Normal | Keywords: | |
Cc: | Mikuláš Poul, Florian Demmer, John Speno, Akash Kumar Sen | Triage Stage: | Accepted |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
A user has reported seeing extremely slow update times when using bulk_update. With the django-bulk-update package, which does not use the expressions API and constructs raw SQL directly, an update with 100,000 objects and and 10 fields takes 24 seconds. With the built in bulk_update it takes 2 minutes and 24 seconds - 6x as slow.
The user has provided a reproduction case here: https://github.com/mikicz/bulk-update-tests/blob/master/apps/something/models.py and https://github.com/mikicz/bulk-update-tests/blob/master/apps/something/test_bulk_update.py
From an initial look at the profiling that has been provided (https://github.com/aykut/django-bulk-update/files/4060369/inbuilt_pyinstrument.txt) it seems a lot of overhead comes from building the SQL query rather than executing it - I think if we can improve the performance it could speed up other workloads.
See: https://github.com/aykut/django-bulk-update/issues/75#issuecomment-576886385