Opened 15 years ago

Closed 12 years ago

#9632 closed Cleanup/optimization (duplicate)

File.chunks contains potentially expensive size operation

Reported by: Peter Sagerson Owned by: nobody
Component: File uploads/storage Version: 1.0
Severity: Normal Keywords: file
Cc: Triage Stage: Design decision needed
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

django.core.files.base.File's chunk method uses the file's size to determine how many chunks to return. Retrieving the size of a file is very fast for simple files, but may be very expensive for compressed files and other storage mechanisms. And in this case, it's completely unnecessary. Replacing the size-based loop with a simple check for the end of the stream has fewer dependencies and avoids a potentially expensive operation.

To cite one practical example, if the file is stored as a compressed file on disk, the current implementation will result in either decompressing the entire file twice or caching the entire decompressed file in memory. Removing the size dependency avoids both.

Attachments (1)

chunks.diff (727 bytes ) - added by Peter Sagerson 15 years ago.

Download all attachments as: .zip

Change History (11)

by Peter Sagerson, 15 years ago

Attachment: chunks.diff added

comment:1 by Jacob, 15 years ago

Triage Stage: UnreviewedDesign decision needed

comment:2 by Adam Nelson, 14 years ago

We need a design decision on this - the patch looks fine.

comment:3 by Adam Nelson, 14 years ago

Also this ticket stating that the chunk size isn't always honored at #12157

comment:4 by Ian Lewis, 14 years ago

+1 for removing the size() check in chunks(). It tends to break things when wrapping non-file-system file-like-objects such as StringIO()

comment:5 by Ian Lewis, 14 years ago

However, that said, this patch might pose a problem for file-like-objects that should not be over-read socket connection. The patch probably should honor the file size if it exists.

comment:6 by Erin Kelly, 14 years ago

See also #13901.

comment:7 by Luke Plant, 13 years ago

Severity: Normal
Type: Cleanup/optimization

comment:8 by Aymeric Augustin, 12 years ago

UI/UX: unset

Change UI/UX from NULL to False.

comment:9 by Aymeric Augustin, 12 years ago

Easy pickings: unset

Change Easy pickings from NULL to False.

comment:10 by Claude Paroz, 12 years ago

Resolution: duplicate
Status: newclosed

When fixing #15644, the chunks implementation has been changed to not depend on size any more (r17871). It remains to be seen if comment:5 (over-read socket connection) might be true, triggering new bug reports. Real use-cases welcome...

Note: See TracTickets for help on using tickets.
Back to Top