Opened 3 years ago

Closed 3 years ago

#32719 closed New feature (duplicate)

Re-allow path components in `name` for Storage backends which allow it.

Reported by: Carlton Gibson Owned by: nobody
Component: File uploads/storage Version: 3.2
Severity: Normal Keywords:
Cc: Florian Apolloner, Cristiano Coelho Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no


The fix for CVE-2021-31542 (0b79eb36915d178aef5c6a7bbce71b1e76d376d3) disallowed path components in file names.

This caused a failure on Windows for the test added in 914c72be2abb1c6dd860cb9279beaa66409ae1b2, specifically checking that such components (\\) are allowed, which is appropriate for some storage backends, the core example of which being S3.
(A skip for this test was added in a708f39ce67af174df90c5b5e50ad1976cec7cb8.)

The fix is to refactor the storage API to allow different filename validation. Comment from Florian on GitHub:

Did you consider making the security fix in the FileSystemStorage class rather than in Storage so that the restriction isn't placed on storages where directory traversal isn't applicable?

Yes, but it would have resulted in an override of a few functions and might have even required more supporting functions to keep a certain level of DRY which I did not feel comfortable doing in a security release. I guess now we could start working towards that goal in public. There are a few things to consider though: Will it be worth the extra code for not much gain or is it cheaper to have storages that want to have the previous behavior override get_available_name etc and we make it "easier" for them.

Sadly the methods on the storage classes do not make it any easier for us. For instance we have generate_filename which takes a filename as argument and splits that into dirname & name. So it seems to suggest that we want a relative path to the storage root here. So far so good, but now this function calls into get_valid_name which takes a name argument which is solely the basename -- but it's docs say:

Return a filename, based on the provided filename, that's suitable for use in the target storage system.

So how should one know which function can get called with what and especially if it is important to call them in some order… Even worse, depending how people are storing stuff in the storage backend, one can simply skip a few functions. As an example get_valid_name is not even called unless you use generate_filename to create a name for the storage backend. But one could simply call save() with any name (though the name here is not a basename but a path relative to the storage root). One would only notice the missing get_valid_name when your storage all of a sudden throws errors because you are trying to store something it doesn't support.

All in all I do not have a good answer. I think a slight rewrite of the storage backends with the issues outlined above might get us a long way… In the meantime I am somewhat hesitant to add new functions that might require immediate deprecation.

Change History (2)

comment:1 by Mariusz Felisiak, 3 years ago

Cc: Florian Apolloner Cristiano Coelho added
Triage Stage: UnreviewedAccepted

comment:2 by Mariusz Felisiak, 3 years ago

Resolution: duplicate
Status: newclosed

Closing as a duplicate of #32718 which allows for relative paths and Windows separators in the file names.

Note: See TracTickets for help on using tickets.
Back to Top