File.chunks contains potentially expensive size operation
|Reported by:||psagers||Owned by:||nobody|
|Cc:||Triage Stage:||Design decision needed|
|Has patch:||yes||Needs documentation:||no|
|Needs tests:||no||Patch needs improvement:||no|
django.core.files.base.File's chunk method uses the file's size to determine how many chunks to return. Retrieving the size of a file is very fast for simple files, but may be very expensive for compressed files and other storage mechanisms. And in this case, it's completely unnecessary. Replacing the size-based loop with a simple check for the end of the stream has fewer dependencies and avoids a potentially expensive operation.
To cite one practical example, if the file is stored as a compressed file on disk, the current implementation will result in either decompressing the entire file twice or caching the entire decompressed file in memory. Removing the size dependency avoids both.
Change History (11)
Changed 7 years ago by psagers
comment:1 Changed 6 years ago by jacob
- Needs documentation unset
- Needs tests unset
- Patch needs improvement unset
- Triage Stage changed from Unreviewed to Design decision needed