#2983 closed defect (wontfix)
ImageField not deleing previously attached file when updated
Reported by: | Shimon | Owned by: | Tony Becker |
---|---|---|---|
Component: | File uploads/storage | Version: | dev |
Severity: | major | Keywords: | fs-rf-docs |
Cc: | Triage Stage: | Accepted | |
Has patch: | yes | Needs documentation: | yes |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
Steps to reproduce (in trunk):
1)Create a model with an ImageField.
2)Assign a new image to the ImageField.
3)Edit the object just created and upload another image to the ImageField.
Expected behavior:
Image file created in step 2 should be deleted (new file takes it's place).
In Practice the old image file is orphaned and never deleted.
I think this is serious because a malicious user can use any django image field to fill up the server hard disk (repeatedly change the contents of the image field) and fixing the situation without damaging a site is non-trivial as it requires the creation of a script that checks which image files are valid and which are orphans.
Attachments (3)
Change History (29)
by , 18 years ago
Attachment: | image_field_delete_orig_on_update.patch added |
---|
comment:1 by , 18 years ago
Summary: | ImageField not deleing previously attached file when updated → [patch] ImageField not deleing previously attached file when updated |
---|
The thumbnail library (http://djangoutils.python-hosting.com/wiki/Thumbnails) has code with this functionality we can reuse. Patch attached
comment:2 by , 17 years ago
Triage Stage: | Unreviewed → Accepted |
---|
follow-up: 4 comment:3 by , 17 years ago
Summary: | [patch] ImageField not deleing previously attached file when updated → ImageField not deleing previously attached file when updated |
---|---|
Triage Stage: | Accepted → Design decision needed |
Actually, first the decision needs to be made about whether this is the right course of action.
comment:4 by , 17 years ago
Replying to SmileyChris:
Actually, first the decision needs to be made about whether this is the right course of action.
I think the appropriate action would be to provide this functionality as an option.
follow-up: 6 comment:5 by , 17 years ago
I think it's be great to keep the old images around, at least back one revision. It should be possible to make this an option in the model definition "undo=True". I suggest this because I have experience with Content Management systems. End users typically want an undo function when there are multiple editors. And honestly, most don't remember to save the old image in case they make a mistake. Of course this adds some complexity. But in reality only add one column to the database. Say your field is named 'test_image', you might make the backup be 'test_image_backup' or something similar. Just my 2 cents.
follow-up: 7 comment:6 by , 17 years ago
Regarding the current behavior, IMHO: I think it's confusing, a security concern, and non-transparent.
Regarding keeping this behavior as an option in an attempt to support history/undo for the image field: I'm a bit uncomfortable with this behavior being part of the core framework, as I don't think this is a common usage case.
No other fields in django support recording their history - if you change the contents of a text field there is no way of bringing the old value back. Why should the image field be any different? Why should adding an image to a django model add yet another field/table to the database?
In addition, IMHO there are security concerns with the current behavior - a malicious user could attack an unsuspecting django site by filling up the hard disk. This simply by repeatedly changing an image field.
I also think that the current behavior is not what a user would expect to happen - why should the framework keep old image files around even though the contents of the image field have changed? If there where a method to access the old image files that would be a different issue.
In short: I think we should move forward with the current patch.
If anyone wants to preserve the current behavior, then maybe open another ticket to support the undo=True option specifically - i.e. with relevant documentation and utilities for cleanup.
follow-up: 9 comment:7 by , 17 years ago
In addition, IMHO there are security concerns with the current behavior - a malicious user could attack an unsuspecting django site by filling up the hard disk. This simply by repeatedly changing an image field.
You are absolutely correctly, but it is also possible that a malicious user would upload a new file/image just so django would delete the existing one.
I don't think of this as a way to "undo"... I guess I just have a problem with Django deleting files from the web server.
Think of this case. There are already image files on the server (in use by another non-Django application). I create a Django app that also makes use of these files. So I import the files into the Django models (not making use of forms to upload them). Then I decide to later change which image the Django app is using. If Django was to delete the old file, It would break my non-Django application.
The issue I see is that you are simply looking at this from the perspective that ImageFields are always used with forms.
I'm not saying that this this behavior is a bad idea, I'm just saying that I would like some method of disabling it.
After all, it seems that is how the rest of the Django ORM works, Make the default behaviors reasonable, but have options to override the defaults for special circumstances.
follow-up: 10 comment:9 by , 17 years ago
Replying to steven.potter@gmail.com:
You are absolutely correctly, but it is also possible that a malicious user would upload a new file/image just so django would delete the existing one.
What your referring to is a data access problem, i.e. a user has rights to change data they shouldn't. On this level IMHO there is no conceptual difference between the image file and a text field in the database.
The only difference is that instead of the data being stored inside the database (A Unicode byte string in the case of a database text field), the actual bits are stored externally to the database, and only a pointer to the data (the file's path) is stored in the database.
From this viewpoint, not having django deleting the file when the database value is changed to point to another file is like not having a working garbage collector - users can fill up memory [Hard Disk] with old unreferenced values.
Think of this case. There are already image files on the server (in use by another non-Django application). I create a Django app that also makes use of these files. So I import the files into the Django models (not making use of forms to upload them). Then I decide to later change which image the Django app is using. If Django was to delete the old file, It would break my non-Django application.
Could it be that the feature you guys are after is to have ImageField who's attached images are not "managed by django" i.e. django should only store pointers to them but not delete them?
After all, it seems that is how the rest of the Django ORM works, Make the default behaviors reasonable, but have options to override the defaults for special circumstances.
In other words, the default behavior should be: Django takes responsibility over all image files and collects it's own garbage in this respect (deletes image files no longer referenced from the db).
Override the defaults: provide functionality for not touching the image files on disk and letting another piece of software manage them?
Does this sound satisfactory?
ubernordtrom: The ticket you assigned as duplicate is IMHO a different issue - it refers to the ability to delete an existing image file directly from the admin application. This ticket is about deleting an old image file when a new image file is uploaded.
comment:10 by , 17 years ago
In other words, the default behavior should be: Django takes responsibility over all image files and collects it's own garbage in this respect (deletes image files no longer referenced from the db).
Override the defaults: provide functionality for not touching the image files on disk and letting another piece of software manage them?
Does this sound satisfactory?
This is exactly what I had in mind.
comment:11 by , 17 years ago
Keywords: | fs-r added |
---|---|
Triage Stage: | Design decision needed → Accepted |
This should be fixed when #5361 lands.
comment:12 by , 17 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
comment:13 by , 17 years ago
Nevermind, I just saw Jacob's comment.
For what it's worth, I think undo would be a nice feature, but I'd rather get to a stable 1.0 feature set that does not include undo. Any undo implementation should be part of a larger design change that allows undo for *all* admin actions, imho.
comment:14 by , 17 years ago
Keywords: | fs-rf added; fs-r removed |
---|
comment:15 by , 17 years ago
Keywords: | fs-rf-docs added; fs-rf removed |
---|
comment:16 by , 17 years ago
The patch in #5361 will address this, but it won't include this functionality by default. Instead, it makes it trivial to implement this yourself in just a few lines of code, and that process is covered in its documentation.
comment:17 by , 17 years ago
milestone: | → 1.0 beta |
---|
comment:18 by , 16 years ago
Resolution: | → wontfix |
---|---|
Status: | assigned → closed |
File storage is in as of [8244], so I'm marking this as wontfix -- as Marty says, it's very easy to write a custom file storage backend that works exactly how you like it.
comment:19 by , 16 years ago
Component: | Core framework → File uploads/storage |
---|---|
milestone: | 1.0 beta → 1.1 |
Patch needs improvement: | set |
Resolution: | wontfix |
Status: | closed → reopened |
You still can upload millions of pictures to lawrence.com.
- Register there
- Login
- Get session ID from firebug or opera's site prefs
- Make image. Call it "image.gif"
- Run this script in infinitive loop:
curl -b sessionid=SESSION_ID -F avatar=@image.gif http://www.lawrence.com/accounts/profile/edit/
You can change filename from time to time to upload more images.
lawrence.com will have tonns of images in upload dir. If you have goood connection - you can easy fill server's storage.
I mailed about it to lawrence.com staff at 17 Feb, but seems that you don't want to fix it.
comment:20 by , 16 years ago
Resolution: | → wontfix |
---|---|
Status: | reopened → closed |
Restoring previous status. Alexey, what lawrence.com chooses to do is up to it. There may be some Django maintainers that have work for or have close ties to lawrence.com, but Django's tracker is not the right place for issues with an independent web site. We couldn't check in some changes to Django code and have it automagically affect lawrence.com even if we wanted to.
comment:21 by , 16 years ago
But why docs are not saying that if you update FileField/ImageField, old file remain on filesystem?
If you don't want to fix it, write an example, how to avoid it, please. The exact goal is "let old file be deleted after replacement with new one".
That's obvious behaviour. Your text fields, char fields, etc are replaced on model field update. Why FileField behave differently?
comment:22 by , 16 years ago
It can't be done with FileStorage because FileStorage knows nothing about context of file creation. It can't decide wether owerwrite the old file (if it belongs to the same FileField as the new one), or append "_" to it's name (if old file belongs to other record's FileField).
comment:23 by , 15 years ago
I would also be curious to see how this is done if it is supposedly trivial.
by , 15 years ago
comment:24 by , 15 years ago
milestone: | 1.1 |
---|---|
Needs documentation: | set |
Needs tests: | set |
Patch needs improvement: | unset |
While I'm not going to re-open the ticket just yet, but I'm pretty sure that this isn't quite as easy as using a custom storage backend for the field.
A common case is that you have a ModelForm
(bound to an existing instance) for which you use to upload a new file. The form is saved in your view which overwrites the linked FieldFile
with a new one (which will be based on the data from the UploadedFile
). Now there is no reference to the old file, so even custom storage can't touch it (unless you use an upload_to callable which names it the same and your custom storage is changed to [dangerously] blindly overwrite files with the same name).
Anyway, I've attached a patch which works well for me in limited testing. It'd need tests and docs of course.
by , 15 years ago
Attachment: | 2983.2.diff added |
---|
comment:26 by , 15 years ago
Since this ticket is messy, and at Marty's suggestion, I'm posting this as a new ticket: #11663
against r4085