Opened 5 years ago
Last modified 14 months ago
#31202 assigned Cleanup/optimization
Bulk update suffers from poor performance with large numbers of models and columns — at Version 1
Reported by: | Tom Forbes | Owned by: | Tom Forbes |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | dev |
Severity: | Normal | Keywords: | |
Cc: | Mikuláš Poul, Florian Demmer, John Speno, Akash Kumar Sen | Triage Stage: | Accepted |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description (last modified by )
A user has reported seeing extremely slow update times when using bulk_update. With the django-bulk-update package, which does not use the expressions API and constructs raw SQL directly (https://github.com/aykut/django-bulk-update/blob/master/django_bulk_update/helper.py#L202), an update with 100,000 objects and and 10 fields takes 24 seconds. With the built in bulk_update it takes 2 minutes and 24 seconds - 6x as slow.
The user has provided a reproduction case here: https://github.com/mikicz/bulk-update-tests/blob/master/apps/something/models.py and https://github.com/mikicz/bulk-update-tests/blob/master/apps/something/test_bulk_update.py
From an initial look at the profiling that has been provided (https://github.com/aykut/django-bulk-update/files/4060369/inbuilt_pyinstrument.txt) it seems a lot of overhead comes from building the SQL query rather than executing it - I think if we can improve the performance it could speed up other workloads.
See: https://github.com/aykut/django-bulk-update/issues/75#issuecomment-576886385
Change History (1)
comment:1 by , 5 years ago
Description: | modified (diff) |
---|---|
Owner: | changed from | to
Status: | new → assigned |