O(n) behaviour in default configuration when uploading many duplicate filenames
|Reported by:||dw||Owned by:||timgraham|
|Has patch:||yes||Needs documentation:||no|
|Needs tests:||no||Patch needs improvement:||yes|
In the default configuration, get_available_name() in django/core/files/storage.py may issue a very large number of stat() system calls when a file is uploaded under a name that already exists many times. Since each stat() may invoke IO, the slowdown is data-dependent and gets worse as duplicates accumulate.
The problem was originally detected by our test suite, but it is likely that older Django installations in particular are experiencing it too, perhaps without realizing it.
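To make the cost concrete, here is a minimal sketch of the probing pattern described above. It is a simplified model and not Django's actual code: CountingStore, sequential_available_name() and upload_duplicates() are hypothetical names, the store is an in-memory set rather than a filesystem, and the "_1", "_2", ... suffix scheme is an assumption about how duplicates are numbered. Each exists() check stands in for roughly one stat() call.

```python
import os


class CountingStore:
    """Hypothetical in-memory stand-in for a storage backend."""

    def __init__(self):
        self.names = set()
        self.exists_calls = 0

    def exists(self, name):
        # On a real filesystem each check would cost roughly one stat() call.
        self.exists_calls += 1
        return name in self.names

    def save(self, name):
        self.names.add(name)


def sequential_available_name(store, name):
    """Probe name, name_1, name_2, ... until an unused one is found."""
    root, ext = os.path.splitext(name)
    candidate = name
    counter = 0
    while store.exists(candidate):
        counter += 1
        candidate = "%s_%d%s" % (root, counter, ext)
    return candidate


def upload_duplicates(n, name="report.pdf"):
    """Upload the same filename n times; return the check count per upload."""
    store = CountingStore()
    per_upload = []
    for _ in range(n):
        before = store.exists_calls
        store.save(sequential_available_name(store, name))
        per_upload.append(store.exists_calls - before)
    return per_upload


if __name__ == "__main__":
    counts = upload_duplicates(1000)
    # Checks for the first five uploads, the last upload, and the total.
    print(counts[:5], counts[-1], sum(counts))
```

Under these assumptions the k-th upload of the same name needs k existence checks, so the per-upload cost is linear in the number of existing duplicates and the total over n uploads is quadratic.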
I filed a pull request at https://github.com/django/django/pull/3008 before reading the contributor documentation. It seems to suggest I should file a bug to be triaged before submitting a pull request?
This is my first Django contribution, so apologies for any confusion.
Change History (12)
comment:1 Changed 2 years ago by areski
- Needs documentation unset
- Needs tests unset
- Patch needs improvement unset
comment:2 Changed 2 years ago by timo
- Has patch set
- Triage Stage changed from Unreviewed to Accepted
- Type changed from Uncategorized to Cleanup/optimization
comment:7 Changed 2 years ago by timgraham
- Owner changed from nobody to timgraham
- Patch needs improvement set
- Status changed from new to assigned
- Triage Stage changed from Ready for checkin to Accepted
comment:8 Changed 2 years ago by Tim Graham <timograham@…>
- Resolution set to fixed
- Status changed from assigned to closed