#32718 closed Bug (fixed)
Saving a FileField raises SuspiciousFileOperation in some scenarios.
Reported by: | Jakub Kleň | Owned by: | Mariusz Felisiak |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 2.2 |
Severity: | Release blocker | Keywords: | 3.2.1 file model filefield fieldfile |
Cc: | Carlton Gibson, Florian Apolloner, Markus Holtermann | Triage Stage: | Accepted |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
I came across this issue today when I was updating Django from 3.2.0 -> 3.2.1.
It's directly caused by: https://docs.djangoproject.com/en/3.2/releases/3.2.1/#cve-2021-31542-potential-directory-traversal-via-uploaded-files
Starting from 3.2.1, Django requires that only the basename is passed to FieldFile.save
method, because otherwise it raises a new exception:
SuspiciousFileOperation: File name ... includes path elements
The issue is that in FileField.pre_save
, a full path is passed to FieldFile.save
, causing the exception to be raised.
Correct me if I'm wrong, but file-like objects always contain the full path to the file in the name
attribute (the built-in Django File
class even uses it to reopen the file if it was closed), and so it seems to be a bug in Django itself.
Steps to reproduce:
model_instance.file_attribute = File(open(path, 'rb')) model_instance.save()
I also created a PR with the fix: https://github.com/django/django/pull/14354
Change History (45)
comment:1 by , 4 years ago
Severity: | Normal → Release blocker |
---|---|
Triage Stage: | Unreviewed → Accepted |
comment:2 by , 4 years ago
comment:3 by , 4 years ago
Oh, good catch Brian! I forgot to mention the bug is also present in 3.1.9 and 2.2.21 as they contain the CVE-2021-31542
fix, too.
follow-up: 13 comment:4 by , 4 years ago
Hi, I came across this as well. I had to makes like this one all over my project to get things to work after the CV in 2.2.21.
merged_path = '/tmp/merged.png' with open(img_path, mode='rb') as f: - field.merged_map = ImageFile(f) + field.merged_map = ImageFile(f, name=os.path.basename(merged_path)) field.save()
This shouldn't have had to be the case, as the CV was about "empty file names and paths with dot segments", and that doesn't apply in any of my cases.
To me, it looks like Jakub's patch is sound and would solve this.
comment:5 by , 4 years ago
Correct me if I'm wrong, but this fix and also this commit breaks the Django file storage completely...
For instance -
- Having a file field with an upload_to="users"
- Create a new file with name = "steve/file.txt"
- Create another file with name = "john/file.txt"
- Both files will be written (possibly overwritten!!) to: "users/file.txt" as you are only using the file basename??
Further the commit above (now released in 3.2.1) added: if name != os.path.basename(name):
now invalidates the use of paths in the filename, which previously worked...
comment:6 by , 4 years ago
Cc: | added |
---|
follow-up: 8 comment:7 by , 4 years ago
Correct me if I'm wrong, but this fix and also this commit breaks the Django file storage completely...
It's documented that the FieldField.save()'s
name
argument "is the name of the file" not a path. You can define upload_to
as a function to use subdirectories, see an example in FileField.upload_to
docs.
follow-ups: 10 11 comment:8 by , 4 years ago
Replying to Mariusz Felisiak:
Correct me if I'm wrong, but this fix and also this commit breaks the Django file storage completely...
It's documented that the
FieldField.save()'s
name
argument "is the name of the file" not a path. You can defineupload_to
as a function to use subdirectories, see an example inFileField.upload_to
docs.
Outch, sadly it is not that simple. When you look at your documentation link about the .save()
method it says:
Note that the content argument should be an instance of django.core.files.File, not Python’s built-in file object. You can construct a File from an existing Python file object like this:
This one links to https://docs.djangoproject.com/en/3.2/ref/files/file/#django.core.files.File which says name
is
The name of the file including the relative path from MEDIA_ROOT.
And now all fixes are getting ugly… So now the question becomes: If you do File(…, name='path/to/something.txt')
and then call .save(…)
, what should be the final path? Should it add the full relative path to upload_to
, should it take the basename and set that in conjunction to upload_to
or should it just take the path as is?
comment:9 by , 4 years ago
Needs documentation: | set |
---|---|
Needs tests: | set |
Patch needs improvement: | set |
Version: | 3.2 → 2.2 |
follow-up: 18 comment:10 by , 4 years ago
Replying to Florian Apolloner:
Replying to Mariusz Felisiak:
Thanks Mariusz Felisiak, you're correct - these changes will break Django's file system and cause files to possibly be overwritten. It must not go ahead.
The core issue was the addition of an unneccessary check added here:
https://github.com/django/django/blob/main/django/core/files/utils.py#L7
def validate_file_name(name): if name != os.path.basename(name): raise SuspiciousFileOperation("File name '%s' includes path elements" % name)
It currently appends (joins) the file name to upload_to.
comment:11 by , 4 years ago
Replying to Florian Apolloner:
Florian - it was your commit that has caused this break - surely this can be simply removed/refactored?
https://github.com/django/django/commit/0b79eb36915d178aef5c6a7bbce71b1e76d376d3
This will affect thousands of applications and systems... many not frequently maintained. For the sake of 2 lines of code.
comment:12 by , 4 years ago
In my opinion, this ticket should get special care and urgency. I also had to update applications. I'm not convinced such a change is appropriate in stable releases.
follow-up: 23 comment:13 by , 4 years ago
Replying to Phillip Marshall:
Hi, I came across this as well. I had to make changes like this one all over my project to get things to work after the CV in 2.2.21.
merged_path = '/tmp/merged.png' with open(img_path, mode='rb') as f: - field.merged_map = ImageFile(f) + field.merged_map = ImageFile(f, name=os.path.basename(merged_path)) field.save() # raises SuspiciousFileOperation("File name '/tmp/merged.png' includes path elements") unless change is made on previous line
Hi Phillip, I have tried your original code (without the basename) on 3.2 & 2.2.20 and it yielded this for me:
SuspiciousFileOperation: The joined path (/tmp/merged.png) is located outside of the base path component (/home/florian/sources/django.git/testing)
This was my code (functionally equal to yours):
In [1]: from django.core.files import File In [2]: from testabc.models import * In [3]: f = open('/tmp/merged.png') In [4]: f2 = File(f) In [5]: x = Test() In [6]: x.file = f2 In [7]: x.save()
So your code already fails for me before any changes. Can you please provide a reproducer and full traceback?
follow-ups: 15 20 22 comment:14 by , 4 years ago
Replying to Jakub Kleň:
Correct me if I'm wrong, but file-like objects always contain the full path to the file in the
name
attribute (the built-in DjangoFile
class even uses it to reopen the file if it was closed), and so it seems to be a bug in Django itself.
This is true, but according to my tests on earlier versions a full (absolute) path did fail already because it would usually be outside the MEDIA_ROOT. Can you provide some more details?
follow-up: 17 comment:15 by , 4 years ago
Replying to Florian Apolloner:
Replying to Jakub Kleň:
Correct me if I'm wrong, but file-like objects always contain the full path to the file in the
name
attribute (the built-in DjangoFile
class even uses it to reopen the file if it was closed), and so it seems to be a bug in Django itself.
This is true, but according to my tests on earlier versions a full (absolute) path did fail already because it would usually be outside the MEDIA_ROOT. Can you provide some more details?
I am setting the images as I described in the initial post, with absolute paths which are outside of MEDIA_ROOT
, and am not having any issues with it. I'm currently actually using v3.2.1, because I had to update. I initially updated to 3.2.0, but that caused some crashes because of a bug in legacy cookie decoding, so I really had no other option than to update to 3.2.1. So I temporarily monkey patched django.core.files.utils.validate_file_name
to just return name
, and everything works perfectly.
Regarding the FileField.save
method and the parameter it takes, to me it kind of makes sense to only pass in the basename of the file. I'm not completely sure if we should be passing in a path in the case of pre_save
. It doesn't make sense to me to derive the final path of the file from that. The full path should be generated in a custom upload_to
, and that parameter should only be used to e.g. reuse the same extension of the original file. But not to append that full path to upload_to
. Note I'm talking only about this special case with pre_save
, where files set to the model fields are handled. It would still be possible to manually call field.save('some/path/file.png
. Although I'm not sure why someone would do that, because the path should be provided in upload_to
I think. But I know we also have to think about backwards compatibility, so I'm not quite sure what is a viable solution here.
Imagine the following scenario (assuming the FileField has upload_to='media'
):
model.file_field = File(open('/folder/file.png', 'rb')) model.save()
Should the file be written to media/folder/file.png
? I don't think the final path in MEDIA_ROOT
should depend on the original path of the file.
And we can't even provide File(open(), name='file.png')
manually, because it would break File.open()
where the name
is being used.
If you want to provide a custom path during save
, you can still do that by manually calling file_field.save(path
, but in this special case of pre_save
, where the name
is the full path, I think it would make sense to only use the basename.
I'm not completely sure what Django does by default when upload_to
is a path (string), as I'm using a callable where the only thing I take from the filename is the extension, which I reuse. So both ways will work for me.
comment:16 by , 4 years ago
Florian,
You can also see some other use case here: https://gitlab.gnome.org/Infrastructure/damned-lies/-/blob/085328022/vertimus/models.py#L751, where we copy the file attached to a model to a new model. I had to wrap action.file.name
with basename
to make it work with 3.2.1. The tests failed otherwise.
comment:17 by , 4 years ago
Replying to Jakub Kleň:
Replying to Florian Apolloner:
Replying to Jakub Kleň:
Correct me if I'm wrong, but file-like objects always contain the full path to the file in the
name
attribute (the built-in DjangoFile
class even uses it to reopen the file if it was closed), and so it seems to be a bug in Django itself.
This is true, but according to my tests on earlier versions a full (absolute) path did fail already because it would usually be outside the MEDIA_ROOT. Can you provide some more details?
I am setting the images as I described in the initial post, with absolute paths which are outside of
MEDIA_ROOT
, and am not having any issues with it. I'm currently actually using v3.2.1, because I had to update. I initially updated to 3.2.0, but that caused some crashes because of a bug in legacy cookie decoding, so I really had no other option than to update to 3.2.1. So I temporarily monkey patcheddjango.core.files.utils.validate_file_name
to justreturn name
, and everything works perfectly.
Regarding the
FileField.save
method and the parameter it takes, to me it kind of makes sense to only pass in the basename of the file. I'm not completely sure if we should be passing in a path in the case ofpre_save
. It doesn't make sense to me to derive the final path of the file from that. The full path should be generated in a customupload_to
, and that parameter should only be used to e.g. reuse the same extension of the original file. But not to append that full path toupload_to
. Note I'm talking only about this special case withpre_save
, where files set to the model fields are handled. It would still be possible to manually callfield.save('some/path/file.png
. Although I'm not sure why someone would do that, because the path should be provided inupload_to
I think. But I know we also have to think about backwards compatibility, so I'm not quite sure what is a viable solution here.
Imagine the following scenario (assuming the FileField has
upload_to='media'
):
model.file_field = File(open('/folder/file.png', 'rb')) model.save()Should the file be written to
media/folder/file.png
? I don't think the final path inMEDIA_ROOT
should depend on the original path of the file.
And we can't even provideFile(open(), name='file.png')
manually, because it would breakFile.open()
where thename
is being used.
If you want to provide a custom path duringsave
, you can still do that by manually callingfile_field.save(path
, but in this special case ofpre_save
, where thename
is the full path, I think it would make sense to only use the basename.
I'm not completely sure what Django does by default when
upload_to
is a path (string), as I'm using a callable where the only thing I take from the filename is the extension, which I reuse. So both ways will work for me.
Sorry, I was mistaken. It wouldn't be possible to even do it manually using file_field.save(path
, because the check if made inside the FileField.save
method. But it still kind of makes sense to me for the FileField.save
method to only take the basename. But I'm not sure everyone would agree, and it would be a backwards incompatible change.
And even if we allow FileField.save
to take the full path, shouldn't we still be calling it with just the basename from the pre_save
method?
follow-up: 19 comment:18 by , 4 years ago
Replying to carderm:
Replying to Florian Apolloner:
Replying to Mariusz Felisiak:
The core issue was the addition of an unneccessary check added here:
https://github.com/django/django/blob/main/django/core/files/utils.py#L7
def validate_file_name(name): if name != os.path.basename(name): raise SuspiciousFileOperation("File name '%s' includes path elements" % name)
These two lines broke our application which uses the 2.2 LTS. Our application is shipped, so that's thousands of installations out there all with various Y versions of our software over the past few years. Even if we can port our application to use the new name=...
argument, we would have to backport and release many older versions, or force our userbase to upgrade. I agree with @cardem; I see the impact of these two lines as huge.
Can someone identify what is the motivation for having these two lines? I don't see how they are required for the CVE, but even if they are, since the CVE is graded as low, is this appropriate for the LTS?
One of my other concerns is that what I see happening is folks are pinning to 2.2.20, which is going to prevent them from receiving the moderate CVE fix for Python 3.9 environments with 2.2.22.
Thank you for all the effort you all put into Django. We couldn't do what we do without you.
comment:19 by , 4 years ago
First off, can we please stop with arguments like "this will break thousand installations". It is simply not true because I do not assume that those installations will automatically update. And bumping the Django dependency should result in your tests discovering the issue. If not, well we are in the same boat, Django didn't catch it either even though our test coverage is not that bad…
Replying to Brian Bouterse:
These two lines broke our application which uses the 2.2 LTS. Our application is shipped, so that's thousands of installations out there all with various Y versions of our software over the past few years. Even if we can port our application to use the new
name=...
argument, we would have to backport and release many older versions, or force our userbase to upgrade. I agree with @cardem; I see the impact of these two lines as huge.
I fully understand that this change might have impacts for you. That said, assume this was a different security fix, all your users would have to upgrade anyways; how is it much different whether the upgrade just Django or your app + Django? Please be aware that literally every point release has the possibility to break your app because you might have used some undocumented feature. I am trying to understand how your normal upgrade process looks like.
Can someone identify what is the motivation for having these two lines? I don't see how they are required for the CVE, but even if they are, since the CVE is graded as low, is this appropriate for the LTS?
It is an in-depth defense against path traversal issues inside MEDIA_ROOT. The CVE could have been worded differently, sorry about that. The fallout seems to be larger than expected. That said, please be aware that we do develop security fixes in private (in a rather small group) to not allow early exploits. As such issues & mistakes like this are more likely to occur with security fixes than with regular fixes (where there is at least the chance of a larger review). As for whether or not this is appropriate for a LTS. The point of a LTS is exactly to get those security fixes (and a few normal high profile issues). Depending on how exactly your application works, this low issue could just as well be a high for you (although many things have to go wrong for that).
One of my other concerns is that what I see happening is folks are pinning to 2.2.20, which is going to prevent them from receiving the moderate CVE fix for Python 3.9 environments with 2.2.22.
That is fine, as with all things, we do recommend pinning. If one of the versions causes problems there is always the possibility to maintain your own forks as needed or monkeypatch around issue. And yes I really think people should have this as a last resort option because we simply cannot be perfect and will fail from time to time. Rushing out a fix for this to break yet another valid usecase will make things just worse.
Thank you for all the effort you all put into Django. We couldn't do what we do without you.
Thank you for the kind words. The are highly appreciated especially in heated tickets like this one. As said above already I am having a difficulty in reproducing all the mentioned problems. Some of the issues mentioned (especially absolute path) occur for me even on unpatched Django versions and I am yet trying to understand what the difference between our systems is. The more we know the faster we can fix this.
comment:20 by , 4 years ago
Replying to Florian Apolloner:
Replying to Jakub Kleň:
Correct me if I'm wrong, but file-like objects always contain the full path to the file in the
name
attribute (the built-in DjangoFile
class even uses it to reopen the file if it was closed), and so it seems to be a bug in Django itself.
This is true, but according to my tests on earlier versions a full (absolute) path did fail already because it would usually be outside the MEDIA_ROOT. Can you provide some more details?
Where exactly is it crashing for you Florian with absolute path when you comment out the checks in validate_file_name
(3.2.1)? Maybe I could look into the code to search for some difference I might have in my setup. A stack trace would be awesome.
follow-up: 24 comment:21 by , 4 years ago
I think I see why some installations broke with this but other's haven't. Look at this reproducer:
from django.core.files import File from django.db import models class MyModelWithUploadTo(models.Model): def upload_to_func(self, name): return 'qqq' file = models.FileField(upload_to=upload_to_func) class MyModel(models.Model): file = models.FileField() with open('/tmp/somefile', 'w') as f: f.write('asdfasdfasdf') f.flush() with open('/tmp/anotherfile', 'w') as f: f.write('fsfoisufsiofdsoiufd') f.flush() model_instance_with_upload_to = MyModelWithUploadTo() model_instance_with_upload_to.file = File(open('/tmp/somefile', 'rb')) model_instance_with_upload_to.save() print('I saved the one with upload_to()\n\n') model_instance = MyModel() model_instance.file = File(open('/tmp/anotherfile', 'rb')) model_instance.save()
On 2.2.20 when I makemigrations for those two models, apply them, and the run the reproducer I get:
I saved the one with upload_to() Traceback (most recent call last): File "/home/vagrant/devel/django_file_name_2_2_21_reproducer.py", line 30, in <module> model_instance.save() File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/base.py", line 743, in save self.save_base(using=using, force_insert=force_insert, File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/base.py", line 780, in save_base updated = self._save_table( File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/base.py", line 873, in _save_table result = self._do_insert(cls._base_manager, using, fields, update_pk, raw) File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/base.py", line 910, in _do_insert return manager._insert([self], fields=fields, return_id=update_pk, File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/manager.py", line 82, in manager_method return getattr(self.get_queryset(), name)(*args, **kwargs) File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/query.py", line 1186, in _insert return query.get_compiler(using=using).execute_sql(return_id) File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/sql/compiler.py", line 1376, in execute_sql for sql, params in self.as_sql(): File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/sql/compiler.py", line 1318, in as_sql value_rows = [ File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/sql/compiler.py", line 1319, in <listcomp> [self.prepare_value(field, self.pre_save_val(field, obj)) for field in fields] File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/sql/compiler.py", line 1319, in <listcomp> [self.prepare_value(field, self.pre_save_val(field, obj)) for field in fields] File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/sql/compiler.py", line 1270, in pre_save_val return field.pre_save(obj, add=True) File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/fields/files.py", line 288, in pre_save file.save(file.name, file.file, save=False) File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/fields/files.py", line 87, in save self.name = self.storage.save(name, content, max_length=self.field.max_length) File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/core/files/storage.py", line 52, in save return self._save(name, content) File "/home/vagrant/devel/pulpcore/pulpcore/app/models/storage.py", line 46, in _save full_path = self.path(name) File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/core/files/storage.py", line 323, in path return safe_join(self.location, name) File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/utils/_os.py", line 44, in safe_join raise SuspiciousFileOperation( django.core.exceptions.SuspiciousFileOperation: The joined path (/tmp/anotherfile) is located outside of the base path component (/var/lib/pulp/media)
Notice how the output includes the I saved the one with upload_to()
. That shows this was working with 2.2.20.
Now run that same reproducer, migrations, etc against 2.2.21. I see:
Traceback (most recent call last): File "/home/vagrant/devel/django_file_name_2_2_21_reproducer.py", line 25, in <module> model_instance_with_upload_to.save() File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/base.py", line 743, in save self.save_base(using=using, force_insert=force_insert, File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/base.py", line 780, in save_base updated = self._save_table( File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/base.py", line 873, in _save_table result = self._do_insert(cls._base_manager, using, fields, update_pk, raw) File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/base.py", line 910, in _do_insert return manager._insert([self], fields=fields, return_id=update_pk, File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/manager.py", line 82, in manager_method return getattr(self.get_queryset(), name)(*args, **kwargs) File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/query.py", line 1186, in _insert return query.get_compiler(using=using).execute_sql(return_id) File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/sql/compiler.py", line 1376, in execute_sql for sql, params in self.as_sql(): File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/sql/compiler.py", line 1318, in as_sql value_rows = [ File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/sql/compiler.py", line 1319, in <listcomp> [self.prepare_value(field, self.pre_save_val(field, obj)) for field in fields] File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/sql/compiler.py", line 1319, in <listcomp> [self.prepare_value(field, self.pre_save_val(field, obj)) for field in fields] File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/sql/compiler.py", line 1270, in pre_save_val return field.pre_save(obj, add=True) File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/fields/files.py", line 289, in pre_save file.save(file.name, file.file, save=False) File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/fields/files.py", line 87, in save name = self.field.generate_filename(self.instance, name) File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/fields/files.py", line 303, in generate_filename filename = validate_file_name(filename) File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/core/files/utils.py", line 8, in validate_file_name raise SuspiciousFileOperation("File name '%s' includes path elements" % name) django.core.exceptions.SuspiciousFileOperation: File name '/tmp/somefile' includes path elements
Here the SuspiciousFileOperation
is also raised on the saving of MyModelWithUploadTo
. Do you all know why this difference is significant?
Regarding non reproducer discussion...
Yes let's take how many folks are broken out of the equation; we have to do what's secure, even if its inconvenient. I agree with that, and thank you for saying that. I think what is contributing to the issue is confusion on why these 2 lines are necessary given how the CVE reads. I'm not here to second guess the good work of the security response team or it's review process, but I am confused.
As a practical matter, I'm still trying to figure out if another release removing these two lines will occur, or if the CVE description needs revision. Do you have some advice for me on which you think will happen?
Also I'd like to exchange perspectives on version pinning for shipped django apps depending on the Django LTS, but not until we resolve the matter at hand first.
Thank you for everything. Our project and users really appreciate it. Please let me know how we can help.
follow-ups: 25 27 comment:22 by , 4 years ago
Replying to Florian Apolloner:
Replying to Jakub Kleň:
Correct me if I'm wrong, but file-like objects always contain the full path to the file in the
name
attribute (the built-in DjangoFile
class even uses it to reopen the file if it was closed), and so it seems to be a bug in Django itself.
This is true, but according to my tests on earlier versions a full (absolute) path did fail already because it would usually be outside the MEDIA_ROOT. Can you provide some more details?
I believe an absolute path outside of Media Root should fail, that is correct (in my opinion) - as an absolute path might be able to write shell scripts around the file system...
It should however, work for a relative file path.
Hack example here
# Some model class FileModel(models.Model): file_field = models.FileField(_('UsersFile'), storage='users', max_length=1000) model_inst = FileModel() # Add some file new_file = ContentFile(some_file.read()) new_file.name = f"{user.name}/file.txt" # E.g. "steve/file.txt" <<< This file should only be for 'steve' model_inst.file = new_file # Save the Model model_inst.save() #<<< in Django 3.2.0 this creates file "MEDIA_ROOT/users/steve/file.txt ! which is correct model_inst.save() #<<< in Django 3.2.1 this FAILS with Path issue ! Incorrect model_inst.save() #<<< in Fix above [https://github.com/django/django/pull/14354/commits/1c78e83791163b034a7f1689673bff02f9969368 commit] will create "MEDIA_ROOT/users/file.txt ! which is incorrect - missing path! # What we need to stop: bad_file = ContentFile(malicious_script.read()) bad_file.name = "/etc/init.d/nameofscript.sh" # or bad_file.name = "../../../../../usr/local/lib/bad.file" model_inst.file = bad_file # On Save we need to protect here - imho model_inst.save() # <<< This *should Fail* with Security Error
follow-up: 26 comment:23 by , 4 years ago
Replying to Florian Apolloner:
Can you please provide a reproducer and full traceback?
I hope this isn't too overboard, but I went ahead and copied out a chunk of my production code into a new project.
please look at the readme to see my full comments and traceback: https://github.com/wizpig64/django_32718_repro
disclaimer: i don't use upload_to strings, just upload_to functions, so i dont know how to account for those users.
thanks for your time and your hard work.
follow-up: 32 comment:24 by , 4 years ago
Replying to Brian Bouterse:
Here the
SuspiciousFileOperation
is also raised on the saving ofMyModelWithUploadTo
. Do you all know why this difference is significant?
This is perfect! The difference can be significant due to this: https://github.com/django/django/blob/c4ee3b208a2c95a5102b5e4fa789b10f8ee29b84/django/db/models/fields/files.py#L309-L322 -- This means when upload_to is set it is supposed to return the final filename (including a path relative path). Since you are just returning 'qqq' there it will be valid in 2.2.20. 2.2.21 validates the name beforehand and will break that. This really helps. Phillip had similar code (ie upload_to
in a function as well).
As a practical matter, I'm still trying to figure out if another release removing these two lines will occur, or if the CVE description needs revision. Do you have some advice for me on which you think will happen?
There will be another release fixing this and the CVE will probably get adjusted to drop the sentence "Specifically, empty file names and paths with dot segments will be
rejected.". Does the wording make more sense for you then?
Thank you for everything. Our project and users really appreciate it. Please let me know how we can help.
Will do, testing will certainly help once we have a suitable PR :)
comment:25 by , 4 years ago
Replying to carderm:
model_inst.save() #<<< in Django Latest Git commit will create "MEDIA_ROOT/users/file.txt ! which is incorrect - missing path!
I cannot reproduce this, this fails also with SuspiciousFileOperation: File name 'steve/file.txt' includes path elements
for me. Please provide an actual reproducer.
comment:26 by , 4 years ago
Replying to Phillip Marshall:
please look at the readme to see my full comments and traceback: https://github.com/wizpig64/django_32718_repro
disclaimer: i don't use upload_to strings, just upload_to functions, so i dont know how to account for those users.
Perfect, this explains a lot! Your upload_to
basically ignores the passed filename (aside from the extension). Now we have a check before you even get the filename which ensures that there is no full path.
comment:27 by , 4 years ago
Replying to carderm:
I believe an absolute path outside of Media Root should fail, that is correct (in my opinion) - as an absolute path might be able to write shell scripts around the file system...
It should however, work for a relative file path.
Absolute paths wouldn't be able to write shell scripts around the file system, because the path always starts with MEDIA_ROOT
, and ../
and such are disallowed.
Also, I'm not sure if we should be forced to override the name
attribute of File
, because overriding it breaks the File.open
function from working properly.
comment:28 by , 4 years ago
I'm still not completely sure if we should disallow absolute paths in the File
. When I'm for example setting a file from /tmp
, which I'm doing in my project, it would force me to override the File.name
like this:
file = File(open('/tmp/image.png', 'rb'), name='image.png') # which would then break: file.close() file.open()
I know that the open
method is not called when using the File
to update a model, but it doesn't seem to be the right thing to do.
Should we be using the path from File.name
and append it to upload_to
? In my project, I use a callable upload_to
, which takes care of the path and filename, and only takes the extension of the original filename. Isn't that a better solution if we want a custom path that depends on the model instance?
def custom_upload_to(instance, filename): extension = os.path.splitext(filename)[1][1:].lower() actual_filename = 'image.' + extension return os.path.join('users', instance.username, actual_filename) model.file = File(open('/tmp/sth.png', 'rb'))
My point here is, shouldn't we be using upload_to
for the purpose of constructing a custom file path?
follow-up: 30 comment:29 by , 4 years ago
Cc: | added |
---|---|
Has patch: | unset |
Needs documentation: | unset |
Needs tests: | unset |
Owner: | changed from | to
Patch needs improvement: | unset |
Status: | new → assigned |
Summary: | [3.2.1] Issue with assigning file to FileField → Saving a FileField raises SuspiciousFileOperation in some scenarios. |
Thanks y'all for various test projects and detailed reports of encountered issue with saving FileFields
.
After some discussions, we decided to prepare a patch for FileField.generate_filename() that should solve most of these issues. Please see the following points.
- If
filename
passed to theFileField.generate_filename()
is an absolute path, it will be converted to theos.path.basename(filename)
. - Validate
filename
returned byFileField.upload_to()
not afilename
passed to theFileField.generate_filename()
(upload_to()
may completely ignored passedfilename
). - Allow relative paths (without dot segments) in the generated filename.
We're going to prepare a patch in the next few days.
comment:30 by , 4 years ago
Replying to Mariusz Felisiak:
Thanks y'all for various test projects and detailed reports of encountered issue with saving
FileFields
.
After some discussions, we decided to prepare a patch for FileField.generate_filename() that should solve most of these issues. Please see the following points.
- If
filename
passed to theFileField.generate_filename()
is an absolute path, it will be converted to theos.path.basename(filename)
.- Validate
filename
returned byFileField.upload_to()
not afilename
passed to theFileField.generate_filename()
(upload_to()
may completely ignored passedfilename
).- Allow relative paths (without dot segments) in the generated filename.
We're going to prepare a patch in the next few days.
Thanks a lot Mariusz!
I was thinking about this for a while now, and it seems like a great solution which will both be backwards compatible and will also fit the use cases for all of us! It also takes my concerns into account, which I'm super happy about!
Thanks a lot for the great work all of you guys are doing for Django. Love the framework!
comment:31 by , 4 years ago
We're going to prepare a patch in the next few days.
The sooner the better! :) An Open edX release is hinging on this. We've always loved Django's stability and the trust we've had in the security patches.
comment:32 by , 4 years ago
Replying to Florian Apolloner:
There will be another release fixing this and the CVE will probably get adjusted to drop the sentence "Specifically, empty file names and paths with dot segments will be
rejected.". Does the wording make more sense for you then?
It does! Thank you so much for making this clearer.
Will do, testing will certainly help once we have a suitable PR :)
I can do that. The description of the plan in Comment 29 sounds great!
follow-up: 40 comment:33 by , 4 years ago
I added PR with the proposed solution. Please give it a spin.
- If
filename
passed to theFileField.generate_filename()
is an absolute path, it will be converted to theos.path.basename(filename)
.
IMO this is not necessary.
comment:34 by , 4 years ago
Has patch: | set |
---|
comment:35 by , 4 years ago
I tested the PR against our code, and it fully resolved our issues. Thank you all so much! https://github.com/django/django/pull/14372#issuecomment-836798374
comment:40 by , 4 years ago
Replying to Mariusz Felisiak:
I added PR with the proposed solution. Please give it a spin.
This patch resolves the issue for my project's use case as well. Thanks a lot!
I am also experiencing this issue on 2.2.21. It's just as you described it. I'm going to apply your PR onto my 2.2.21 checkout and see if it resolves it for me also.