Opened 4 years ago

Last modified 7 months ago

#31202 assigned Cleanup/optimization

Bulk update suffers from poor performance with large numbers of models and columns — at Version 1

Reported by: Tom Forbes Owned by: Tom Forbes
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords:
Cc: Mikuláš Poul, Florian Demmer, John Speno, Akash Kumar Sen Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by Tom Forbes)

A user has reported seeing extremely slow update times when using bulk_update. With the django-bulk-update package, which does not use the expressions API and constructs raw SQL directly (https://github.com/aykut/django-bulk-update/blob/master/django_bulk_update/helper.py#L202), an update with 100,000 objects and and 10 fields takes 24 seconds. With the built in bulk_update it takes 2 minutes and 24 seconds - 6x as slow.

The user has provided a reproduction case here: https://github.com/mikicz/bulk-update-tests/blob/master/apps/something/models.py and https://github.com/mikicz/bulk-update-tests/blob/master/apps/something/test_bulk_update.py

From an initial look at the profiling that has been provided (https://github.com/aykut/django-bulk-update/files/4060369/inbuilt_pyinstrument.txt) it seems a lot of overhead comes from building the SQL query rather than executing it - I think if we can improve the performance it could speed up other workloads.

See: https://github.com/aykut/django-bulk-update/issues/75#issuecomment-576886385

Change History (1)

comment:1 by Tom Forbes, 4 years ago

Description: modified (diff)
Owner: changed from nobody to Tom Forbes
Status: newassigned
Note: See TracTickets for help on using tickets.
Back to Top