Version 2 (modified by David Danier <goliath.mailinglist@…>, 8 years ago)
With #5361 ready for review, this page provides a basic rundown of what changed, in case the code doesn't easily expose all the improvements.
- All file handling is now done through storage classes, which handle basic interactions with an underlying storage system. Django will ship with just one, which handles the filesystem the same way everything works now, but users can create whatever other backends they like. Examples could be storing files on S3 (a common request), customizing file naming behavior, encrypting/decrypting files transparently, etc. The docs on the ticket do a good job of explaining how to use them, so I won't bore you with that here. The default storage system is FileSystemStorage, and is specified by the new DEFAULT_FILE_STORAGE setting.
- FileField now provides a FieldFile object instead of just a filename, so that file operations can take place on it directly. This was needed partly to move people away from open(instance.get_avatar_filename()), since that won't work once other backends enter the picture. Instead, instance.avatar can be used as a file-like object directly. It also has path, size and url properties, which replace the get_FOO_* methods.
- Since there's bound to be a lot of code in the wild that uses open(), I tried my hand at an easier path toward backwards-compatibility for those folks, by supplying django.core.files.open. It works just the same as the builtin, but uses FileSystemStorage behind the scenes, so the object it returns is a django.core.files.File, with all of its bells and whistles available. NOTE: Since this is a replacement for the builtin open(), it does not obey the DEFAULT_FILE_STORAGE setting. It just uses FileSystemStorage directly, all the time.
- The open() method of storage objects also accepts a mixin argument, which allows the returned File to have overrides and extra methods for specific file types.
- FileSystemStorage won't allow access to any file that's not beneath the path it was instantiated with. This is primarily useful for security, but also as a safeguard against accidentally putting a leading slash in upload_to.
- Speaking of upload_to, it now accepts a callable as well as a string. If a callable is provided, it's called with the model instance and the filename, so user code has much greater control over how files are named.
- The current default storage system will always be available, both as a class and as an object, from django.core.files.storage. The class is DefaultStorage and the instance of it is default_storage. This way, subclasses can override just some behavior, such as file naming, without worrying about how the files are really being stored, or views can save/retrieve files manually, with the same flexibility.
- FileField also accepts a storage argument, where a custom storage object can be passed in, to be used as an override of DEFAULT_FILE_STORAGE.
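To make a few of the points above concrete, here is a minimal sketch, not the code from the patch. It uses only the standard library and is written for Python 3. ToyFileSystemStorage, Profile and avatar_upload_to are hypothetical names invented for this illustration, and SuspiciousOperation is a local stand-in for Django's exception. It shows a storage object that refuses paths outside its root, plus a callable upload_to that receives the instance and the filename.

```python
import os
import tempfile


class SuspiciousOperation(Exception):
    """Stand-in for Django's exception of the same name."""


class ToyFileSystemStorage:
    """Hypothetical, simplified storage rooted at `location`."""

    def __init__(self, location):
        self.location = os.path.realpath(location)

    def path(self, name):
        # A leading slash or "../" would resolve outside the root, so reject it.
        full = os.path.realpath(os.path.join(self.location, name))
        if full != self.location and not full.startswith(self.location + os.sep):
            raise SuspiciousOperation("%r is outside %r" % (name, self.location))
        return full

    def save(self, name, content):
        full = self.path(name)
        os.makedirs(os.path.dirname(full), exist_ok=True)
        with open(full, "wb") as handle:
            handle.write(content)
        return name

    def open(self, name, mode="rb"):
        return open(self.path(name), mode)


class Profile:
    """Stand-in for a model instance; only `username` matters here."""

    def __init__(self, username):
        self.username = username


def avatar_upload_to(instance, filename):
    # Callable upload_to: receives the model instance and the original filename.
    return "avatars/%s/%s" % (instance.username, filename)


with tempfile.TemporaryDirectory() as root:
    storage = ToyFileSystemStorage(root)
    name = storage.save(avatar_upload_to(Profile("alice"), "me.png"), b"fake image")
    with storage.open(name) as handle:
        print(handle.read())
```

Because the check happens in path(), every operation that goes through the storage object gets the same protection, which is the reason a stray leading slash in upload_to fails early rather than writing somewhere unexpected.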
Differences from previous patches
- django.core.filestorage from older patches is gone, and everything has been moved into django.core.files, in keeping with the trend set by . The basic File object is at django.core.files.base, while all storage-related classes and functions are at django.core.files.storage. This means that, by default, all new storage systems would start as third-party apps, since there's no longer a "dedicated" place for them.
- There's now a single base django.core.files.base.File class for all file types, regardless of whether they come from a storage system, an upload, or whatever. This means, for instance, that all storage-related operations are now capable of chunking, while all uploaded files also have __nonzero__ based on file.name. Both UploadedFile and FieldFile have customizations on top of it though, so it's not like File is built to be all things to all people.
- Other image-related functionality has also been moved to django.core.files.images, in the form of ImageFile, a mixin that provides width and height properties.
- The API for getting meta-information about a file (such as its size, filesystem path, URL, width and height) has changed from methods to read-only properties. Those that would have to access the content (size, width and height) cache the results so they don't have to do so more than necessary.
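The switch from methods to cached, read-only properties can be sketched roughly as follows. This is an illustration in Python 3 using only the standard library, not the patch's implementation. ToyFile and the size_lookups counter are hypothetical and exist only to make the caching visible.

```python
import io


class ToyFile:
    """Hypothetical stand-in for a File with a cached, read-only size."""

    def __init__(self, stream, name=None):
        self._stream = stream
        self.name = name
        self.size_lookups = 0  # instrumentation for this example only

    @property
    def size(self):
        # Measure the stream once, then serve the cached value afterwards.
        if not hasattr(self, "_size"):
            self.size_lookups += 1
            position = self._stream.tell()
            self._stream.seek(0, io.SEEK_END)
            self._size = self._stream.tell()
            self._stream.seek(position)  # leave the read position untouched
        return self._size


f = ToyFile(io.BytesIO(b"hello world"), name="greeting.txt")
print(f.size, f.size, f.size_lookups)
```

Since the property has no setter, assigning to f.size raises AttributeError, which is what "read-only" means in this sketch.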
I've tried very hard to maintain backwards-compatibility wherever reasonable, but there are still a few places where API improvements merit some changes. In addition, there's one (hopefully) rare case where backwards-compatibility is impossible to retain.
Deprecated get_FOO_* methods
Most of the Model._get_FIELD_* methods have been deprecated, pointing to the appropriate attributes on the new FieldFile instance. Given a model like this:
    class FileContent(models.Model):
        content = models.FileField(upload_to='content')
Here's how the changes map out:
|Old way||New way|
django.utils.images has moved
The new location is django.core.files.images, where it's far more appropriate. An import and a DeprecationWarning have been left at the old location.
Use File to save raw content
Passing raw file content as strings to save() has been deprecated, in favor of using a File subclass. In addition to a DeprecationWarning and an automatic conversion, there's now a django.core.files.base.ContentFile, which is a simpler class than SimpleUploadedFile, as it doesn't deal with content-type, charset or even a filename. It's basically just a light wrapper around StringIO that adds chunking behavior, since most of the internals expect to be able to use that.
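As a rough picture of what "a light wrapper that adds chunking behavior" could look like, here is a sketch in Python 3 built on io.BytesIO instead of StringIO. ToyContentFile, its chunk size and its method names are assumptions made for this illustration; this is not the ContentFile class from the patch.

```python
import io


class ToyContentFile:
    """Hypothetical analogue of ContentFile: raw bytes plus chunked reading."""

    DEFAULT_CHUNK_SIZE = 64 * 1024  # assumed value, chosen for the example

    def __init__(self, content):
        self._buffer = io.BytesIO(content)
        self.size = len(content)

    def read(self, num_bytes=-1):
        return self._buffer.read(num_bytes)

    def chunks(self, chunk_size=None):
        # Yield the content piece by piece, always starting from the beginning.
        chunk_size = chunk_size or self.DEFAULT_CHUNK_SIZE
        self._buffer.seek(0)
        while True:
            data = self._buffer.read(chunk_size)
            if not data:
                break
            yield data


raw = ToyContentFile(b"abcdefg")
print(list(raw.chunks(3)))
```

Code that saves a file only needs to iterate over chunks(), so it does not have to care whether the data came from an upload, a storage backend, or a string in memory.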
Empty FileField values are no longer None
FileField will always provide an object to model instances, regardless of whether there's actually a file associated with it, which is necessary for the instance.content.save() behavior. Previously, if there was no file attached, instance.content would be None, which is no longer true, so the following will no longer work:
    if instance.content is not None:
        # Process the file's content here.
Instead, File objects evaluate to True or False on their own, so the following is functionally identical:
    if instance.content:
        # Process the file's content here.
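The difference between the two checks can be seen with a small stand-in class. ToyFieldFile is hypothetical and only models the truthiness rule described above (true when a name is present). It is written for Python 3, where the hook is called __bool__; the Python 2 spelling, __nonzero__, is aliased for reference.

```python
class ToyFieldFile:
    """Hypothetical stand-in: always an object, truthy only when it has a name."""

    def __init__(self, name=None):
        self.name = name

    def __bool__(self):
        return bool(self.name)

    __nonzero__ = __bool__  # Python 2 name for the same hook


empty = ToyFieldFile()
attached = ToyFieldFile("content/report.txt")

# The old check passes even though nothing is attached.
print(empty is not None)
# The new check distinguishes the two cases.
print(bool(empty), bool(attached))
```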
FileField can't be unique, primary_key or core
The exact behavior of these with a FileField was undefined, and was causing problems in many cases, so they now raise a TypeError if supplied.
There are a number of tickets marked fs-rf, indicating that they're impacted by this patch. First, the issues that are truly fixed, and can be marked as fixed in the commit:
- #5361 - The main file storage ticket, where the patch itself resides.
- #3621 - If upload_to starts with a slash, FileSystemStorage's increased security will now raise a SuspiciousOperation when saving a file, long before it hits the database.
- #5655 - The supplied patch was adapted and included, resolving the issue. Tests have been included to verify this.
- #7415 - Saved files are now always stored in the database using forward slashes, and retrieved using os.path.normpath().
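The slash handling described for #7415 can be sketched as a pair of helpers. The function names are hypothetical, and the path module is passed in explicitly (ntpath or posixpath) purely so the example behaves the same on every platform; real code would presumably use os.path.

```python
import ntpath
import posixpath


def name_for_database(name):
    # Store a single canonical form: forward slashes only.
    return name.replace("\\", "/")


def name_for_filesystem(stored_name, path_module):
    # Convert the stored form to whatever the given platform expects.
    return path_module.normpath(stored_name)


stored = name_for_database("content\\2008\\report.txt")
print(stored)
print(name_for_filesystem(stored, ntpath))
print(name_for_filesystem(stored, posixpath))
```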
And there are also a few that will be made possible, but not provided in core directly (probably mark as wontfix):
- #2983 - Since saving and deleting behavior has been moved into FileField instead of Model, a subclass can provide this behavior. Alternatively, a custom storage object can do the rest when passed into the FileField as its storage argument.
- #4339 - By providing a custom storage class, it's easy to change this type of file naming behavior. The patch's tests include a Trac-style example, using numbers instead of underscores.
- #4948 - If this is even still an issue (see this comment), more fine-grained locking can be provided by a custom storage class.
- #5485 - Like #4339 above, custom file naming across the board is easy with a custom storage class.
- #5966 - Custom storage can create or delete directories in whatever way is necessary for a given environment.
- #6390 - Custom backends will be quite possible, but are best suited as third-party apps.
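As an illustration of the custom naming mentioned for #4339 and #5485, here is a sketch of a numbering scheme that a custom storage class might use when a name is already taken. The function name, the exists callback and the exact stem.N.ext format are assumptions made for this example; the Trac-style example in the patch's tests may differ in detail.

```python
import os


def numbered_name(name, exists):
    """Return `name`, or `stem.N.ext` for the first N >= 2 that is free."""
    if not exists(name):
        return name
    stem, ext = os.path.splitext(name)
    counter = 2
    while exists("%s.%d%s" % (stem, counter, ext)):
        counter += 1
    return "%s.%d%s" % (stem, counter, ext)


taken = {"report.txt", "report.2.txt"}
print(numbered_name("report.txt", taken.__contains__))
print(numbered_name("notes.txt", taken.__contains__))
```

Taking exists as a callback keeps the naming rule independent of where the files actually live, so the same rule could sit on top of a filesystem, S3, or anything else.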
And a couple where the problem was resolved by removing the feature that was causing problems (I'm not sure if these should be fixed or wontfix):