O(n) behaviour in default configuration when uploading many duplicate filenames
|Reported by:||dw||Owned by:||Tim Graham|
|Has patch:||yes||Needs documentation:||no|
|Needs tests:||no||Patch needs improvement:||yes|
In the default configuration, get_available_name() in django/core/files/storage.py may degrade to issuing a huge number of stat() system calls when a file with a duplicate name is uploaded. Since each stat() may invoke IO, this can produce a large, data-dependent slowdown that worsens as more duplicates accumulate.
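To illustrate the growth, here is a minimal sketch of the behaviour. It is not Django's code: it assumes the available name is found by probing numbered suffixes in order (name, name_1, name_2, ...), and it replaces the filesystem with an in-memory set so that each existence check stands in for one stat() call. The function name sequential_available_name and the exists callback are hypothetical, chosen for this example only.

```python
import itertools
import os


def sequential_available_name(name, exists):
    """Probe name, name_1, name_2, ... until an unused one is found.

    `exists` is a callable standing in for a storage existence check
    (one stat() per call on a real filesystem). Returns a tuple of
    (available_name, number_of_exists_calls).
    """
    root, ext = os.path.splitext(name)
    probes = 1
    if not exists(name):
        return name, probes
    for i in itertools.count(1):
        candidate = "%s_%d%s" % (root, i, ext)
        probes += 1
        if not exists(candidate):
            return candidate, probes


# Simulate uploading the same filename 100 times.
stored = set()          # stand-in for the files already on disk
total_probes = 0
for _ in range(100):
    name, probes = sequential_available_name("photo.jpg", stored.__contains__)
    stored.add(name)
    total_probes += probes

print(name, probes, total_probes)
```

Under these assumptions the k-th upload of the same name costs k existence checks, so each upload is linear in the number of existing duplicates and the total over n uploads grows quadratically.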
The problem was originally detected by our test suite, but it is likely that other installations, especially older ones that have accumulated many duplicates, are experiencing it too, perhaps without realizing it.
I filed a pull request at https://github.com/django/django/pull/3008 before reading the contributor documentation. It seems to suggest I should file a bug to be triaged before submitting a pull request?
This is my first Django contribution, so apologies for any confusion.
Change History (12)
comment:1 Changed 2 years ago by
|Patch needs improvement:||unset|
comment:2 Changed 2 years ago by
|Triage Stage:||Unreviewed → Accepted|
|Type:||Uncategorized → Cleanup/optimization|
comment:7 Changed 2 years ago by
|Owner:||changed from nobody to Tim Graham|
|Patch needs improvement:||set|
|Status:||new → assigned|
|Triage Stage:||Ready for checkin → Accepted|