Opened 5 years ago

Closed 4 years ago

Last modified 4 years ago

#26058 closed Cleanup/optimization (fixed)

Custom storage backend's not entirely decoupled from FileField

Reported by: Korijn van Golen Owned by: nobody
Component: File uploads/storage Version: 1.9
Severity: Normal Keywords: custom storage filefield
Cc: Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by Korijn van Golen)

Let me start by saying that I've implemented custom FileFields that can handle Numpy objects, to provide some convenience methods to access the typed objects.

Later, when implementing a custom storage backend for Azure Storage, I encountered some issues when handling filenames. Cloud storage doesn't behave like local file system storage does. Most of Django's code is agnostic of this problem, except for a small section of FileField:

https://github.com/django/django/blob/77b8d8cb6d6d6345f479c68c4892291c1492ba7e/django/db/models/fields/files.py#L306-L320

In Azure, blobs are stored in containers, and are optionally stored in subfolders (known as prefixes in Azure-speak). To be able to use the upload_to property, I need to be able to pass the full path, e.g.:

container/blob
container/sub/blob
container/sub/sub/blob

Unfortunately, FileField strips the directory part and only passes the blob name to my custom storage backend's get_valid_name implementation, where I really need to validate the whole string. I ended up subclassing FileField:

class CloudStorageFileField(FileField):
    """
    Provides some overrides to enable cloud storage backends
    """

    def get_directory_name(self):
        return force_text(datetime.datetime.now().strftime(force_str(self.upload_to)))

    def get_filename(self, filename):
        return filename

    def generate_filename(self, instance, filename):
        if callable(self.upload_to):
            filename = self.upload_to(instance, filename)
            filename = self.storage.get_valid_name(filename)
            return filename

        return self.get_directory_name() + self.storage.get_valid_name(filename)

As you can see, all I did was remove the local file system related logic. Unfortunately, my custom file fields need to inherit from this class to work with Azure storage, and when I switch to local storage in a different environment, they have to inherit from the regular FileField! This is obviously an undesirable situation, imposed by the tight coupling to local file system logic in the three methods of FileField.

In order to truly decouple this, FileField would need to be a little bit more agnostic about the storage backend. This could be done by moving the generate_filename method to the backend entirely, for example.

I classified this as a bug, as this case shows that you're not entirely able to do anything you want when implementing custom storage backends.

You can see the implementation of both the custom storage backend and the NumpyFileField here: https://gist.github.com/Korijn/e0bcbdcedb494509973a

Your thoughts?

Change History (9)

comment:1 Changed 5 years ago by Korijn van Golen

Description: modified (diff)

comment:2 Changed 5 years ago by Tim Graham

Triage Stage: UnreviewedAccepted
Type: BugCleanup/optimization

The main question is whether or not the changes can be made in a backwards compatible fashion.

comment:3 Changed 5 years ago by Korijn van Golen

I took a little bit more time to do this neatly, here's the backwards-compatible version:

class StorageBackendAgnosticFileField(FileField):
    """
    Subclass of Django's `FileField` that allows for storage backends to optionally implement a function
    `generate_filename` through which they can generate filenames in a manner suitable for their storage mechanisms,
    rather than through Django's implementation which is hardwired to specific file systems.
    """

    def parse_upload_to(self):
        """
        Same as `FileField.get_directory_name`, but without the file system specifics. Allows for date formatting.

        Returns
        -------
        str
            The parsed `upload_to` string.
        """
        return force_text(datetime.datetime.now().strftime(force_str(self.upload_to)))

    def generate_filename(self, instance, filename):
        """
        Overrides the regular implementation by first checking of the storage backend doesn't already implement this
        logic. Otherwise, falls back to the traditional implementation.

        Parameters
        ----------
        instance : object
            The model instance the current file is attached to. It's passed to the `upload_to` call when `upload_to` is
            callable.
        filename : str
            The filename, if any.

        Returns
        -------
        str
            The newly generated and/or validated filename.
        """
        if hasattr(self.storage, 'generate_filename') and callable(self.storage.generate_filename):
            if callable(self.upload_to):
                filename = self.upload_to(instance, filename)
                return self.storage.generate_filename(filename)

            directory_name = self.parse_upload_to()
            return self.storage.generate_filename(directory_name + filename)

        return super(StorageBackendAgnosticFileField, self).generate_filename(instance, filename)
Last edited 5 years ago by Korijn van Golen (previous) (diff)

comment:4 Changed 5 years ago by Korijn van Golen

Not sure if my edit generated a notification, so, bump. Please see comment:3.

Last edited 5 years ago by Korijn van Golen (previous) (diff)

comment:5 Changed 4 years ago by Tim Graham

Has patch: set

#26382 is a duplicate with a patch proposed: PR

comment:6 Changed 4 years ago by Tim Graham

Patch needs improvement: set

comment:7 Changed 4 years ago by Tim Graham

Patch needs improvement: unset

comment:8 Changed 4 years ago by Tim Graham <timograham@…>

Resolution: fixed
Status: newclosed

In 914c72b:

Fixed #26058 -- Delegated os.path bits of FileField's filename generation to the Storage.

comment:9 Changed 4 years ago by Tim Graham <timograham@…>

In 0dfc5479:

Refs #26058 -- Removed deprecated FileField.get_directory_name()/get_filename().

Note: See TracTickets for help on using tickets.
Back to Top