Opened 4 years ago
Closed 3 years ago
#32719 closed New feature (duplicate)
Re-allow path components in `name` for Storage backends which allow it.
Reported by: | Carlton Gibson | Owned by: | nobody |
---|---|---|---|
Component: | File uploads/storage | Version: | 3.2 |
Severity: | Normal | Keywords: | |
Cc: | Florian Apolloner, Cristiano Coelho | Triage Stage: | Accepted |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
The fix for CVE-2021-31542 (0b79eb36915d178aef5c6a7bbce71b1e76d376d3) disallowed path components in file names.
This caused a failure on Windows for the test added in 914c72be2abb1c6dd860cb9279beaa66409ae1b2, specifically checking that such components (\\
) are allowed, which is appropriate for some storage backends, the core example of which being S3.
(A skip for this test was added in a708f39ce67af174df90c5b5e50ad1976cec7cb8.)
The fix is to refactor the storage API to allow different filename validation. Comment from Florian on GitHub:
Did you consider making the security fix in the
FileSystemStorage
class rather than inStorage
so that the restriction isn't placed on storages where directory traversal isn't applicable?
Yes, but it would have resulted in an override of a few functions and might have even required more supporting functions to keep a certain level of DRY which I did not feel comfortable doing in a security release. I guess now we could start working towards that goal in public. There are a few things to consider though: Will it be worth the extra code for not much gain or is it cheaper to have storages that want to have the previous behavior override
get_available_name
etc and we make it "easier" for them.
Sadly the methods on the storage classes do not make it any easier for us. For instance we have
generate_filename
which takes afilename
as argument and splits that into dirname & name. So it seems to suggest that we want a relative path to the storage root here. So far so good, but now this function calls intoget_valid_name
which takes aname
argument which is solely the basename -- but it's docs say:
Return a filename, based on the provided filename, that's suitable for use in the target storage system.
So how should one know which function can get called with what and especially if it is important to call them in some order… Even worse, depending how people are storing stuff in the storage backend, one can simply skip a few functions. As an example
get_valid_name
is not even called unless you usegenerate_filename
to create a name for the storage backend. But one could simply callsave()
with any name (though the name here is not a basename but a path relative to the storage root). One would only notice the missingget_valid_name
when your storage all of a sudden throws errors because you are trying to store something it doesn't support.
All in all I do not have a good answer. I think a slight rewrite of the storage backends with the issues outlined above might get us a long way… In the meantime I am somewhat hesitant to add new functions that might require immediate deprecation.
Change History (2)
comment:1 by , 3 years ago
Cc: | added |
---|---|
Triage Stage: | Unreviewed → Accepted |
comment:2 by , 3 years ago
Resolution: | → duplicate |
---|---|
Status: | new → closed |
Closing as a duplicate of #32718 which allows for relative paths and Windows separators in the file names.