Opened 8 years ago

Closed 6 years ago

Last modified 6 years ago

#2983 closed defect (wontfix)

ImageField not deleing previously attached file when updated

Reported by: Shimon Owned by: tbecker
Component: File uploads/storage Version: master
Severity: major Keywords: fs-rf-docs
Cc: Triage Stage: Accepted
Has patch: yes Needs documentation: yes
Needs tests: no Patch needs improvement: no
Easy pickings: UI/UX:

Description

Steps to reproduce (in trunk):
1)Create a model with an ImageField.
2)Assign a new image to the ImageField.
3)Edit the object just created and upload another image to the ImageField.

Expected behavior:
Image file created in step 2 should be deleted (new file takes it's place).

In Practice the old image file is orphaned and never deleted.

I think this is serious because a malicious user can use any django image field to fill up the server hard disk (repeatedly change the contents of the image field) and fixing the situation without damaging a site is non-trivial as it requires the creation of a script that checks which image files are valid and which are orphans.

Attachments (3)

image_field_delete_orig_on_update.patch (712 bytes) - added by Tim Goh <timgoh@…> 8 years ago.
against r4085
2983.diff (4.9 KB) - added by SmileyChris 6 years ago.
2983.2.diff (19.6 KB) - added by SmileyChris 6 years ago.

Download all attachments as: .zip

Change History (29)

Changed 8 years ago by Tim Goh <timgoh@…>

against r4085

comment:1 Changed 8 years ago by Tim Goh <timgoh@…>

  • Summary changed from ImageField not deleing previously attached file when updated to [patch] ImageField not deleing previously attached file when updated

The thumbnail library (http://djangoutils.python-hosting.com/wiki/Thumbnails) has code with this functionality we can reuse. Patch attached

comment:2 Changed 8 years ago by SmileyChris

  • Triage Stage changed from Unreviewed to Accepted

comment:3 follow-up: Changed 8 years ago by SmileyChris

  • Summary changed from [patch] ImageField not deleing previously attached file when updated to ImageField not deleing previously attached file when updated
  • Triage Stage changed from Accepted to Design decision needed

Actually, first the decision needs to be made about whether this is the right course of action.

comment:4 in reply to: ↑ 3 Changed 8 years ago by steven.potter@…

Replying to SmileyChris:

Actually, first the decision needs to be made about whether this is the right course of action.

I think the appropriate action would be to provide this functionality as an option.

comment:5 follow-up: Changed 8 years ago by emperorcezar

I think it's be great to keep the old images around, at least back one revision. It should be possible to make this an option in the model definition "undo=True". I suggest this because I have experience with Content Management systems. End users typically want an undo function when there are multiple editors. And honestly, most don't remember to save the old image in case they make a mistake. Of course this adds some complexity. But in reality only add one column to the database. Say your field is named 'test_image', you might make the backup be 'test_image_backup' or something similar. Just my 2 cents.

comment:6 in reply to: ↑ 5 ; follow-up: Changed 8 years ago by Shimon

Regarding the current behavior, IMHO: I think it's confusing, a security concern, and non-transparent.
Regarding keeping this behavior as an option in an attempt to support history/undo for the image field: I'm a bit uncomfortable with this behavior being part of the core framework, as I don't think this is a common usage case.

No other fields in django support recording their history - if you change the contents of a text field there is no way of bringing the old value back. Why should the image field be any different? Why should adding an image to a django model add yet another field/table to the database?

In addition, IMHO there are security concerns with the current behavior - a malicious user could attack an unsuspecting django site by filling up the hard disk. This simply by repeatedly changing an image field.

I also think that the current behavior is not what a user would expect to happen - why should the framework keep old image files around even though the contents of the image field have changed? If there where a method to access the old image files that would be a different issue.

In short: I think we should move forward with the current patch.

If anyone wants to preserve the current behavior, then maybe open another ticket to support the undo=True option specifically - i.e. with relevant documentation and utilities for cleanup.

comment:7 in reply to: ↑ 6 ; follow-up: Changed 8 years ago by steven.potter@…

In addition, IMHO there are security concerns with the current behavior - a malicious user could attack an unsuspecting django site by filling up the hard disk. This simply by repeatedly changing an image field.

You are absolutely correctly, but it is also possible that a malicious user would upload a new file/image just so django would delete the existing one.

I don't think of this as a way to "undo"... I guess I just have a problem with Django deleting files from the web server.

Think of this case. There are already image files on the server (in use by another non-Django application). I create a Django app that also makes use of these files. So I import the files into the Django models (not making use of forms to upload them). Then I decide to later change which image the Django app is using. If Django was to delete the old file, It would break my non-Django application.

The issue I see is that you are simply looking at this from the perspective that ImageFields are always used with forms.

I'm not saying that this this behavior is a bad idea, I'm just saying that I would like some method of disabling it.

After all, it seems that is how the rest of the Django ORM works, Make the default behaviors reasonable, but have options to override the defaults for special circumstances.

comment:8 Changed 8 years ago by ubernostrum

#4979 was a duplicate. See also #4339.

comment:9 in reply to: ↑ 7 ; follow-up: Changed 8 years ago by anonymous

Replying to steven.potter@gmail.com:

You are absolutely correctly, but it is also possible that a malicious user would upload a new file/image just so django would delete the existing one.

What your referring to is a data access problem, i.e. a user has rights to change data they shouldn't. On this level IMHO there is no conceptual difference between the image file and a text field in the database.

The only difference is that instead of the data being stored inside the database (A Unicode byte string in the case of a database text field), the actual bits are stored externally to the database, and only a pointer to the data (the file's path) is stored in the database.

From this viewpoint, not having django deleting the file when the database value is changed to point to another file is like not having a working garbage collector - users can fill up memory [Hard Disk] with old unreferenced values.

Think of this case. There are already image files on the server (in use by another non-Django application). I create a Django app that also makes use of these files. So I import the files into the Django models (not making use of forms to upload them). Then I decide to later change which image the Django app is using. If Django was to delete the old file, It would break my non-Django application.

Could it be that the feature you guys are after is to have ImageField who's attached images are not "managed by django" i.e. django should only store pointers to them but not delete them?

After all, it seems that is how the rest of the Django ORM works, Make the default behaviors reasonable, but have options to override the defaults for special circumstances.

In other words, the default behavior should be: Django takes responsibility over all image files and collects it's own garbage in this respect (deletes image files no longer referenced from the db).

Override the defaults: provide functionality for not touching the image files on disk and letting another piece of software manage them?

Does this sound satisfactory?

ubernordtrom: The ticket you assigned as duplicate is IMHO a different issue - it refers to the ability to delete an existing image file directly from the admin application. This ticket is about deleting an old image file when a new image file is uploaded.

comment:10 in reply to: ↑ 9 Changed 8 years ago by steven.potter@…

In other words, the default behavior should be: Django takes responsibility over all image files and collects it's own garbage in this respect (deletes image files no longer referenced from the db).

Override the defaults: provide functionality for not touching the image files on disk and letting another piece of software manage them?

Does this sound satisfactory?

This is exactly what I had in mind.

comment:11 Changed 7 years ago by jacob

  • Keywords fs-r added
  • Triage Stage changed from Design decision needed to Accepted

This should be fixed when #5361 lands.

comment:12 Changed 7 years ago by tbecker

  • Owner changed from nobody to tbecker
  • Status changed from new to assigned

comment:13 Changed 7 years ago by tbecker

Nevermind, I just saw Jacob's comment.

For what it's worth, I think undo would be a nice feature, but I'd rather get to a stable 1.0 feature set that does not include undo. Any undo implementation should be part of a larger design change that allows undo for *all* admin actions, imho.

comment:14 Changed 7 years ago by Gulopine

  • Keywords fs-rf added; fs-r removed

comment:15 Changed 7 years ago by Gulopine

  • Keywords fs-rf-docs added; fs-rf removed

comment:16 Changed 7 years ago by Gulopine

The patch in #5361 will address this, but it won't include this functionality by default. Instead, it makes it trivial to implement this yourself in just a few lines of code, and that process is covered in its documentation.

comment:17 Changed 7 years ago by Gulopine

  • milestone set to 1.0 beta

comment:18 Changed 7 years ago by jacob

  • Resolution set to wontfix
  • Status changed from assigned to closed

File storage is in as of [8244], so I'm marking this as wontfix -- as Marty says, it's very easy to write a custom file storage backend that works exactly how you like it.

comment:19 Changed 6 years ago by Alexey

  • Component changed from Core framework to File uploads/storage
  • milestone changed from 1.0 beta to 1.1
  • Patch needs improvement set
  • Resolution wontfix deleted
  • Status changed from closed to reopened

You still can upload millions of pictures to lawrence.com.

  1. Register there
  2. Login
  3. Get session ID from firebug or opera's site prefs
  4. Make image. Call it "image.gif"
  5. Run this script in infinitive loop:
curl -b sessionid=SESSION_ID -F avatar=@image.gif http://www.lawrence.com/accounts/profile/edit/

You can change filename from time to time to upload more images.

lawrence.com will have tonns of images in upload dir. If you have goood connection - you can easy fill server's storage.

I mailed about it to lawrence.com staff at 17 Feb, but seems that you don't want to fix it.

comment:20 Changed 6 years ago by kmtracey

  • Resolution set to wontfix
  • Status changed from reopened to closed

Restoring previous status. Alexey, what lawrence.com chooses to do is up to it. There may be some Django maintainers that have work for or have close ties to lawrence.com, but Django's tracker is not the right place for issues with an independent web site. We couldn't check in some changes to Django code and have it automagically affect lawrence.com even if we wanted to.

comment:21 Changed 6 years ago by anonymous

But why docs are not saying that if you update FileField/ImageField, old file remain on filesystem?

If you don't want to fix it, write an example, how to avoid it, please. The exact goal is "let old file be deleted after replacement with new one".

That's obvious behaviour. Your text fields, char fields, etc are replaced on model field update. Why FileField behave differently?

comment:22 Changed 6 years ago by anonymous

It can't be done with FileStorage because FileStorage knows nothing about context of file creation. It can't decide wether owerwrite the old file (if it belongs to the same FileField as the new one), or append "_" to it's name (if old file belongs to other record's FileField).

comment:23 Changed 6 years ago by anonymous

I would also be curious to see how this is done if it is supposedly trivial.

Changed 6 years ago by SmileyChris

comment:24 Changed 6 years ago by SmileyChris

  • milestone 1.1 deleted
  • Needs documentation set
  • Needs tests set
  • Patch needs improvement unset

While I'm not going to re-open the ticket just yet, but I'm pretty sure that this isn't quite as easy as using a custom storage backend for the field.

A common case is that you have a ModelForm (bound to an existing instance) for which you use to upload a new file. The form is saved in your view which overwrites the linked FieldFile with a new one (which will be based on the data from the UploadedFile). Now there is no reference to the old file, so even custom storage can't touch it (unless you use an upload_to callable which names it the same and your custom storage is changed to [dangerously] blindly overwrite files with the same name).

Anyway, I've attached a patch which works well for me in limited testing. It'd need tests and docs of course.

comment:25 Changed 6 years ago by SmileyChris

  • Needs tests unset

Rewritten and now with tests.

Changed 6 years ago by SmileyChris

comment:26 Changed 6 years ago by SmileyChris

Since this ticket is messy, and at Marty's suggestion, I'm posting this as a new ticket: #11663

Note: See TracTickets for help on using tickets.
Back to Top