Opened 8 years ago

Closed 5 years ago

#9632 closed Cleanup/optimization (duplicate)

File.chunks contains potentially expensive size operation

Reported by: Peter Sagerson Owned by: nobody
Component: File uploads/storage Version: 1.0
Severity: Normal Keywords: file
Cc: Triage Stage: Design decision needed
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no


django.core.files.base.File's chunks method uses the file's size to determine how many chunks to return. Retrieving the size of a file is very fast for simple files, but may be very expensive for compressed files and other storage mechanisms. In this case, it's also completely unnecessary. Replacing the size-based loop with a simple check for the end of the stream has fewer dependencies and avoids a potentially expensive operation.

To cite one practical example, if the file is stored as a compressed file on disk, the current implementation will result in either decompressing the entire file twice or caching the entire decompressed file in memory. Removing the size dependency avoids both.
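As a rough illustration of the idea described above (not the attached chunks.diff, and not Django's actual implementation), a chunk generator can stop when read() returns nothing instead of counting down from a size. The name iter_chunks and the default chunk size are assumptions made for this sketch.

```python
import gzip
import io


def iter_chunks(fileobj, chunk_size=64 * 1024):
    """Yield chunks from fileobj until the stream is exhausted.

    Hypothetical helper: it never asks for the file's size, so a
    compressed stream is decompressed only once, chunk by chunk.
    """
    while True:
        data = fileobj.read(chunk_size)
        if not data:
            # An empty result from read() signals end of stream.
            break
        yield data


# Build a small gzip stream in memory, then read it back in chunks.
payload = b"x" * 1000
buf = io.BytesIO()
with gzip.GzipFile(fileobj=buf, mode="wb") as gz:
    gz.write(payload)
buf.seek(0)

restored = b"".join(iter_chunks(gzip.GzipFile(fileobj=buf), chunk_size=256))
```

With this shape, the cost of finding the decompressed size is never paid; the trade-off is that the caller cannot know the number of chunks in advance.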

Attachments (1)

chunks.diff (727 bytes) - added by Peter Sagerson 8 years ago.


Change History (11)

Changed 8 years ago by Peter Sagerson

Attachment: chunks.diff added

comment:1 Changed 8 years ago by Jacob

Needs documentation: unset
Needs tests: unset
Patch needs improvement: unset
Triage Stage: Unreviewed → Design decision needed

comment:2 Changed 6 years ago by Adam Nelson

We need a design decision on this - the patch looks fine.

comment:3 Changed 6 years ago by Adam Nelson

See also #12157, which reports that the chunk size isn't always honored.

comment:4 Changed 6 years ago by Ian Lewis

+1 for removing the size() check in chunks(). It tends to break things when wrapping non-file-system file-like-objects such as StringIO()

comment:5 Changed 6 years ago by Ian Lewis

That said, this patch might pose a problem for file-like objects that should not be over-read, such as socket connections. The patch should probably honor the file size if it exists.
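One way to picture the compromise suggested here is a generator that reads to end of stream by default but never reads past a known size. This is only a sketch under assumptions: the name iter_chunks_bounded and its size parameter are hypothetical and do not come from the patch or from Django.

```python
import io


def iter_chunks_bounded(fileobj, chunk_size=64 * 1024, size=None):
    """Yield chunks, stopping at end of stream or after `size` bytes.

    Hypothetical helper: when `size` is None it reads until read()
    returns nothing; when `size` is given it never requests more
    than the remaining byte count, so it cannot over-read.
    """
    remaining = size
    while remaining is None or remaining > 0:
        to_read = chunk_size if remaining is None else min(chunk_size, remaining)
        data = fileobj.read(to_read)
        if not data:
            # Stream ended before the declared size was reached.
            break
        if remaining is not None:
            remaining -= len(data)
        yield data


# A stream holding one 6-byte message followed by unrelated data.
stream = io.BytesIO(b"HEADERpayload-next-message")
first = list(iter_chunks_bounded(stream, chunk_size=4, size=6))
rest = stream.read()
```

Here the bytes after the first six stay in the stream for the next reader, which is the behavior a socket-like object would need.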

comment:6 Changed 6 years ago by Ian Kelly

See also #13901.

comment:7 Changed 5 years ago by Luke Plant

Severity: Normal
Type: Cleanup/optimization

comment:8 Changed 5 years ago by Aymeric Augustin

UI/UX: unset

Change UI/UX from NULL to False.

comment:9 Changed 5 years ago by Aymeric Augustin

Easy pickings: unset

Change Easy pickings from NULL to False.

comment:10 Changed 5 years ago by Claude Paroz

Resolution: duplicate
Status: new → closed

When fixing #15644, the chunks implementation was changed to no longer depend on size (r17871). It remains to be seen whether the concern in comment:5 (over-reading a socket connection) holds in practice and triggers new bug reports. Real use-cases welcome...
