Opened 5 years ago
Last modified 4 weeks ago
#31202 assigned Cleanup/optimization
Bulk update suffers from poor performance with large numbers of models and columns — at Version 1
Description (last modified by ) ¶
A user has reported seeing extremely slow update times when using bulk_update. With the django-bulk-update package, which does not use the expressions API and constructs raw SQL directly (https://github.com/aykut/django-bulk-update/blob/master/django_bulk_update/helper.py#L202), an update with 100,000 objects and and 10 fields takes 24 seconds. With the built in bulk_update it takes 2 minutes and 24 seconds - 6x as slow.
The user has provided a reproduction case here: https://github.com/mikicz/bulk-update-tests/blob/master/apps/something/models.py and https://github.com/mikicz/bulk-update-tests/blob/master/apps/something/test_bulk_update.py
From an initial look at the profiling that has been provided (https://github.com/aykut/django-bulk-update/files/4060369/inbuilt_pyinstrument.txt) it seems a lot of overhead comes from building the SQL query rather than executing it - I think if we can improve the performance it could speed up other workloads.
See: https://github.com/aykut/django-bulk-update/issues/75#issuecomment-576886385
According to the ticket's flags, the next step(s) to move this issue forward are:
- To provide a patch by sending a pull request. Claim the ticket when you start working so that someone else doesn't duplicate effort. Before sending a pull request, review your work against the patch review checklist. Check the "Has patch" flag on the ticket after sending a pull request and include a link to the pull request in the ticket comment when making that update. The usual format is:
[https://github.com/django/django/pull/#### PR]
.
Change History (1)
comment:1 by , 5 years ago
Description: | modified (diff) |
---|---|
Owner: | changed from | to
Status: | new → assigned |