#27334 closed Cleanup/optimization (fixed)
File uploads could rename temporary files rather than copying them
Reported by: | Adam Chidlow | Owned by: | nobody |
---|---|---|---|
Component: | File uploads/storage | Version: | 1.10 |
Severity: | Normal | Keywords: | FileField, TemporaryUploadedFile |
Cc: | Triage Stage: | Accepted | |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
Using default settings, when a file is uploaded and is greater than 2.5MB it will get saved as a temporary file. However even if that temporary file is on the same file system as MEDIA_ROOT, it will be copied instead of moved.
This appears to be because FileField.pre_save passes the FieldFile as 'content' to Storage.save(), instead of FieldFile.file, which is the TemporaryUploadedFile having a temporary_file_path attribute thus invoking file_move_safe.
This appears to have been the behaviour from the beginning: https://github.com/django/django/blame/master/django/db/models/fields/files.py#L296
For large uploaded files this has some undesirable consequences, it increases the time for the request to complete, and when running Django certain ways (at least under gunicorn using gthread workers), the temporary file won't be closed until that worker has recieved another request, so it's taking up double the storage for that time.
Change History (12)
comment:1 by , 8 years ago
Component: | Uncategorized → File uploads/storage |
---|---|
Resolution: | → duplicate |
Status: | new → closed |
Summary: | File uploads create copies instead of renaming → File uploads could rename temporary files rather than copying them |
Type: | Bug → Cleanup/optimization |
comment:2 by , 8 years ago
Resolution: | duplicate |
---|---|
Status: | closed → new |
I'm pretty sure this is not a duplicate of #21602. From what I understand that ticket is talking about potential race conditions in a few different scenarios. In this case, the consequences of solving this ticket would solve one particular case of that ticket, but solving the linked ticket wouldn't necessarily do so in a way that solves this ticket as well.
Furthermore, I think this ticket should just be a one-line fix.
def pre_save(self, model_instance, add): "Returns field's value just before saving." file = super(FileField, self).pre_save(model_instance, add) if file and not file._committed: # Commit the file to storage prior to saving the model file.save(file.name, file, save=False) return file
Should change the file.save call to:
file.save(file.name, file.file, save=False)
This will work, since the only times _committed
is False
is if the file has been deleted (in which case bool(file)
is False
), or a FileField
has been set to an instance of File
that is not also an instance of FieldFile
, in which case passing the File
object that was set to the field as the contents to Storage.save
is perfectly valid.
comment:3 by , 8 years ago
I didn't look into the specifics but existing tests are passing with that change which is a good sign. Can you write a new test?
comment:4 by , 8 years ago
Triage Stage: | Unreviewed → Accepted |
---|
comment:5 by , 8 years ago
Thanks for taking a second look.
I've written a test and put it and the one line change into a pull request.
This is my first pull request to django, if I've made any mistakes in coding conventions or the pull request process please let me know and I'll correct them.
comment:6 by , 8 years ago
Has patch: | set |
---|---|
Patch needs improvement: | set |
comment:7 by , 8 years ago
Patch needs improvement: | unset |
---|
Looks like a duplicate of #21602.