Opened 3 years ago

Closed 3 years ago

Last modified 3 years ago

#32718 closed Bug (fixed)

Saving a FileField raises SuspiciousFileOperation in some scenarios.

Reported by: Jakub Kleň Owned by: Mariusz Felisiak
Component: Database layer (models, ORM) Version: 2.2
Severity: Release blocker Keywords: 3.2.1 file model filefield fieldfile
Cc: Carlton Gibson, Florian Apolloner, Markus Holtermann Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

I came across this issue today when I was updating Django from 3.2.0 -> 3.2.1.
It's directly caused by: https://docs.djangoproject.com/en/3.2/releases/3.2.1/#cve-2021-31542-potential-directory-traversal-via-uploaded-files

Starting from 3.2.1, Django requires that only the basename is passed to FieldFile.save method, because otherwise it raises a new exception:

SuspiciousFileOperation: File name ... includes path elements

The issue is that in FileField.pre_save, a full path is passed to FieldFile.save, causing the exception to be raised.

Correct me if I'm wrong, but file-like objects always contain the full path to the file in the name attribute (the built-in Django File class even uses it to reopen the file if it was closed), and so it seems to be a bug in Django itself.

Steps to reproduce:

model_instance.file_attribute = File(open(path, 'rb'))
model_instance.save()

I also created a PR with the fix: https://github.com/django/django/pull/14354

Change History (45)

comment:1 by Markus Holtermann, 3 years ago

Severity: NormalRelease blocker
Triage Stage: UnreviewedAccepted

comment:2 by Brian Bouterse, 3 years ago

I am also experiencing this issue on 2.2.21. It's just as you described it. I'm going to apply your PR onto my 2.2.21 checkout and see if it resolves it for me also.

comment:3 by Jakub Kleň, 3 years ago

Oh, good catch Brian! I forgot to mention the bug is also present in 3.1.9 and 2.2.21 as they contain the CVE-2021-31542 fix, too.

comment:4 by Phillip Marshall, 3 years ago

Hi, I came across this as well. I had to make changes like this one all over my project to get things to work after the CV in 2.2.21.

 merged_path = '/tmp/merged.png'
 with open(img_path, mode='rb') as f:
-     field.merged_map = ImageFile(f)
+     field.merged_map = ImageFile(f, name=os.path.basename(merged_path))
      field.save()  # raises SuspiciousFileOperation("File name '/tmp/merged.png' includes path elements") unless change is made on previous line

(this hotfix still works with my upload_to functions, so the file still goes to the right place)

This shouldn't have had to be the case, as the CV was about "empty file names and paths with dot segments", and that doesn't apply in any of my cases.

To me, it looks like Jakub's patch is sound and would solve this.

Last edited 3 years ago by Phillip Marshall (previous) (diff)

comment:5 by carderm, 3 years ago

Correct me if I'm wrong, but this fix and also this commit breaks the Django file storage completely...

For instance -

  • Having a file field with an upload_to="users"
  • Create a new file with name = "steve/file.txt"
  • Create another file with name = "john/file.txt"
  • Both files will be written (possibly overwritten!!) to: "users/file.txt" as you are only using the file basename??

Further the commit above (now released in 3.2.1) added: if name != os.path.basename(name): now invalidates the use of paths in the filename, which previously worked...

comment:6 by Mariusz Felisiak, 3 years ago

Cc: Carlton Gibson Florian Apolloner added

comment:7 by Mariusz Felisiak, 3 years ago

Correct me if I'm wrong, but this fix and also this ​commit breaks the Django file storage completely...

It's documented that the FieldField.save()'s name argument "is the name of the file" not a path. You can define upload_to as a function to use subdirectories, see an example in FileField.upload_to docs.

in reply to:  7 ; comment:8 by Florian Apolloner, 3 years ago

Replying to Mariusz Felisiak:

Correct me if I'm wrong, but this fix and also this ​commit breaks the Django file storage completely...

It's documented that the FieldField.save()'s name argument "is the name of the file" not a path. You can define upload_to as a function to use subdirectories, see an example in FileField.upload_to docs.

Outch, sadly it is not that simple. When you look at your documentation link about the .save() method it says:

Note that the content argument should be an instance of django.core.files.File, not Python’s built-in file object. You can construct a File from an existing Python file object like this:

This one links to https://docs.djangoproject.com/en/3.2/ref/files/file/#django.core.files.File which says name is

The name of the file including the relative path from MEDIA_ROOT.

And now all fixes are getting ugly… So now the question becomes: If you do File(…, name='path/to/something.txt') and then call .save(…), what should be the final path? Should it add the full relative path to upload_to, should it take the basename and set that in conjunction to upload_to or should it just take the path as is?

comment:9 by Mariusz Felisiak, 3 years ago

Needs documentation: set
Needs tests: set
Patch needs improvement: set
Version: 3.22.2

in reply to:  8 ; comment:10 by carderm, 3 years ago

Replying to Florian Apolloner:
Replying to Mariusz Felisiak:

Thanks Mariusz Felisiak, you're correct - these changes will break Django's file system and cause files to possibly be overwritten. It must not go ahead.

The core issue was the addition of an unneccessary check added here:
https://github.com/django/django/blob/main/django/core/files/utils.py#L7

def validate_file_name(name):
    if name != os.path.basename(name):
        raise SuspiciousFileOperation("File name '%s' includes path elements" % name)


It currently appends (joins) the file name to upload_to.

in reply to:  8 comment:11 by carderm, 3 years ago

Replying to Florian Apolloner:

Florian - it was your commit that has caused this break - surely this can be simply removed/refactored?

https://github.com/django/django/commit/0b79eb36915d178aef5c6a7bbce71b1e76d376d3

This will affect thousands of applications and systems... many not frequently maintained. For the sake of 2 lines of code.

Last edited 3 years ago by carderm (previous) (diff)

comment:12 by Claude Paroz, 3 years ago

In my opinion, this ticket should get special care and urgency. I also had to update applications. I'm not convinced such a change is appropriate in stable releases.

in reply to:  4 ; comment:13 by Florian Apolloner, 3 years ago

Replying to Phillip Marshall:

Hi, I came across this as well. I had to make changes like this one all over my project to get things to work after the CV in 2.2.21.

 merged_path = '/tmp/merged.png'
 with open(img_path, mode='rb') as f:
-     field.merged_map = ImageFile(f)
+     field.merged_map = ImageFile(f, name=os.path.basename(merged_path))
      field.save()  # raises SuspiciousFileOperation("File name '/tmp/merged.png' includes path elements") unless change is made on previous line

Hi Phillip, I have tried your original code (without the basename) on 3.2 & 2.2.20 and it yielded this for me:

SuspiciousFileOperation: The joined path (/tmp/merged.png) is located outside of the base path component (/home/florian/sources/django.git/testing)

This was my code (functionally equal to yours):

In [1]: from django.core.files import File
In [2]: from testabc.models import *
In [3]: f = open('/tmp/merged.png')
In [4]: f2 = File(f)
In [5]: x = Test()
In [6]: x.file = f2
In [7]: x.save()

So your code already fails for me before any changes. Can you please provide a reproducer and full traceback?

in reply to:  description ; comment:14 by Florian Apolloner, 3 years ago

Replying to Jakub Kleň:

Correct me if I'm wrong, but file-like objects always contain the full path to the file in the name attribute (the built-in Django File class even uses it to reopen the file if it was closed), and so it seems to be a bug in Django itself.

This is true, but according to my tests on earlier versions a full (absolute) path did fail already because it would usually be outside the MEDIA_ROOT. Can you provide some more details?

in reply to:  14 ; comment:15 by Jakub Kleň, 3 years ago

Replying to Florian Apolloner:

Replying to Jakub Kleň:

Correct me if I'm wrong, but file-like objects always contain the full path to the file in the name attribute (the built-in Django File class even uses it to reopen the file if it was closed), and so it seems to be a bug in Django itself.

This is true, but according to my tests on earlier versions a full (absolute) path did fail already because it would usually be outside the MEDIA_ROOT. Can you provide some more details?

I am setting the images as I described in the initial post, with absolute paths which are outside of MEDIA_ROOT, and am not having any issues with it. I'm currently actually using v3.2.1, because I had to update. I initially updated to 3.2.0, but that caused some crashes because of a bug in legacy cookie decoding, so I really had no other option than to update to 3.2.1. So I temporarily monkey patched django.core.files.utils.validate_file_name to just return name, and everything works perfectly.

Regarding the FileField.save method and the parameter it takes, to me it kind of makes sense to only pass in the basename of the file. I'm not completely sure if we should be passing in a path in the case of pre_save. It doesn't make sense to me to derive the final path of the file from that. The full path should be generated in a custom upload_to, and that parameter should only be used to e.g. reuse the same extension of the original file. But not to append that full path to upload_to. Note I'm talking only about this special case with pre_save, where files set to the model fields are handled. It would still be possible to manually call field.save('some/path/file.png. Although I'm not sure why someone would do that, because the path should be provided in upload_to I think. But I know we also have to think about backwards compatibility, so I'm not quite sure what is a viable solution here.

Imagine the following scenario (assuming the FileField has upload_to='media'):

model.file_field = File(open('/folder/file.png', 'rb'))
model.save()

Should the file be written to media/folder/file.png? I don't think the final path in MEDIA_ROOT should depend on the original path of the file.
And we can't even provide File(open(), name='file.png') manually, because it would break File.open() where the name is being used.
If you want to provide a custom path during save, you can still do that by manually calling file_field.save(path, but in this special case of pre_save, where the name is the full path, I think it would make sense to only use the basename.

I'm not completely sure what Django does by default when upload_to is a path (string), as I'm using a callable where the only thing I take from the filename is the extension, which I reuse. So both ways will work for me.

comment:16 by Claude Paroz, 3 years ago

Florian,

You can also see some other use case here: https://gitlab.gnome.org/Infrastructure/damned-lies/-/blob/085328022/vertimus/models.py#L751, where we copy the file attached to a model to a new model. I had to wrap action.file.name with basename to make it work with 3.2.1. The tests failed otherwise.

in reply to:  15 comment:17 by Jakub Kleň, 3 years ago

Replying to Jakub Kleň:

Replying to Florian Apolloner:

Replying to Jakub Kleň:

Correct me if I'm wrong, but file-like objects always contain the full path to the file in the name attribute (the built-in Django File class even uses it to reopen the file if it was closed), and so it seems to be a bug in Django itself.

This is true, but according to my tests on earlier versions a full (absolute) path did fail already because it would usually be outside the MEDIA_ROOT. Can you provide some more details?

I am setting the images as I described in the initial post, with absolute paths which are outside of MEDIA_ROOT, and am not having any issues with it. I'm currently actually using v3.2.1, because I had to update. I initially updated to 3.2.0, but that caused some crashes because of a bug in legacy cookie decoding, so I really had no other option than to update to 3.2.1. So I temporarily monkey patched django.core.files.utils.validate_file_name to just return name, and everything works perfectly.

Regarding the FileField.save method and the parameter it takes, to me it kind of makes sense to only pass in the basename of the file. I'm not completely sure if we should be passing in a path in the case of pre_save. It doesn't make sense to me to derive the final path of the file from that. The full path should be generated in a custom upload_to, and that parameter should only be used to e.g. reuse the same extension of the original file. But not to append that full path to upload_to. Note I'm talking only about this special case with pre_save, where files set to the model fields are handled. It would still be possible to manually call field.save('some/path/file.png. Although I'm not sure why someone would do that, because the path should be provided in upload_to I think. But I know we also have to think about backwards compatibility, so I'm not quite sure what is a viable solution here.

Imagine the following scenario (assuming the FileField has upload_to='media'):

model.file_field = File(open('/folder/file.png', 'rb'))
model.save()

Should the file be written to media/folder/file.png? I don't think the final path in MEDIA_ROOT should depend on the original path of the file.
And we can't even provide File(open(), name='file.png') manually, because it would break File.open() where the name is being used.
If you want to provide a custom path during save, you can still do that by manually calling file_field.save(path, but in this special case of pre_save, where the name is the full path, I think it would make sense to only use the basename.

I'm not completely sure what Django does by default when upload_to is a path (string), as I'm using a callable where the only thing I take from the filename is the extension, which I reuse. So both ways will work for me.

Sorry, I was mistaken. It wouldn't be possible to even do it manually using file_field.save(path, because the check if made inside the FileField.save method. But it still kind of makes sense to me for the FileField.save method to only take the basename. But I'm not sure everyone would agree, and it would be a backwards incompatible change.

And even if we allow FileField.save to take the full path, shouldn't we still be calling it with just the basename from the pre_save method?

Last edited 3 years ago by Jakub Kleň (previous) (diff)

in reply to:  10 ; comment:18 by Brian Bouterse, 3 years ago

Replying to carderm:

Replying to Florian Apolloner:
Replying to Mariusz Felisiak:

The core issue was the addition of an unneccessary check added here:
https://github.com/django/django/blob/main/django/core/files/utils.py#L7

def validate_file_name(name):
    if name != os.path.basename(name):
        raise SuspiciousFileOperation("File name '%s' includes path elements" % name)

These two lines broke our application which uses the 2.2 LTS. Our application is shipped, so that's thousands of installations out there all with various Y versions of our software over the past few years. Even if we can port our application to use the new name=... argument, we would have to backport and release many older versions, or force our userbase to upgrade. I agree with @cardem; I see the impact of these two lines as huge.

Can someone identify what is the motivation for having these two lines? I don't see how they are required for the CVE, but even if they are, since the CVE is graded as low, is this appropriate for the LTS?

One of my other concerns is that what I see happening is folks are pinning to 2.2.20, which is going to prevent them from receiving the moderate CVE fix for Python 3.9 environments with 2.2.22.

Thank you for all the effort you all put into Django. We couldn't do what we do without you.

in reply to:  18 comment:19 by Florian Apolloner, 3 years ago

First off, can we please stop with arguments like "this will break thousand installations". It is simply not true because I do not assume that those installations will automatically update. And bumping the Django dependency should result in your tests discovering the issue. If not, well we are in the same boat, Django didn't catch it either even though our test coverage is not that bad…

Replying to Brian Bouterse:

These two lines broke our application which uses the 2.2 LTS. Our application is shipped, so that's thousands of installations out there all with various Y versions of our software over the past few years. Even if we can port our application to use the new name=... argument, we would have to backport and release many older versions, or force our userbase to upgrade. I agree with @cardem; I see the impact of these two lines as huge.

I fully understand that this change might have impacts for you. That said, assume this was a different security fix, all your users would have to upgrade anyways; how is it much different whether the upgrade just Django or your app + Django? Please be aware that literally every point release has the possibility to break your app because you might have used some undocumented feature. I am trying to understand how your normal upgrade process looks like.

Can someone identify what is the motivation for having these two lines? I don't see how they are required for the CVE, but even if they are, since the CVE is graded as low, is this appropriate for the LTS?

It is an in-depth defense against path traversal issues inside MEDIA_ROOT. The CVE could have been worded differently, sorry about that. The fallout seems to be larger than expected. That said, please be aware that we do develop security fixes in private (in a rather small group) to not allow early exploits. As such issues & mistakes like this are more likely to occur with security fixes than with regular fixes (where there is at least the chance of a larger review). As for whether or not this is appropriate for a LTS. The point of a LTS is exactly to get those security fixes (and a few normal high profile issues). Depending on how exactly your application works, this low issue could just as well be a high for you (although many things have to go wrong for that).

One of my other concerns is that what I see happening is folks are pinning to 2.2.20, which is going to prevent them from receiving the moderate CVE fix for Python 3.9 environments with 2.2.22.

That is fine, as with all things, we do recommend pinning. If one of the versions causes problems there is always the possibility to maintain your own forks as needed or monkeypatch around issue. And yes I really think people should have this as a last resort option because we simply cannot be perfect and will fail from time to time. Rushing out a fix for this to break yet another valid usecase will make things just worse.

Thank you for all the effort you all put into Django. We couldn't do what we do without you.

Thank you for the kind words. The are highly appreciated especially in heated tickets like this one. As said above already I am having a difficulty in reproducing all the mentioned problems. Some of the issues mentioned (especially absolute path) occur for me even on unpatched Django versions and I am yet trying to understand what the difference between our systems is. The more we know the faster we can fix this.

in reply to:  14 comment:20 by Jakub Kleň, 3 years ago

Replying to Florian Apolloner:

Replying to Jakub Kleň:

Correct me if I'm wrong, but file-like objects always contain the full path to the file in the name attribute (the built-in Django File class even uses it to reopen the file if it was closed), and so it seems to be a bug in Django itself.

This is true, but according to my tests on earlier versions a full (absolute) path did fail already because it would usually be outside the MEDIA_ROOT. Can you provide some more details?

Where exactly is it crashing for you Florian with absolute path when you comment out the checks in validate_file_name (3.2.1)? Maybe I could look into the code to search for some difference I might have in my setup. A stack trace would be awesome.

Last edited 3 years ago by Jakub Kleň (previous) (diff)

comment:21 by Brian Bouterse, 3 years ago

I think I see why some installations broke with this but other's haven't. Look at this reproducer:

from django.core.files import File
from django.db import models


class MyModelWithUploadTo(models.Model):
    def upload_to_func(self, name):
        return 'qqq'

    file = models.FileField(upload_to=upload_to_func)

class MyModel(models.Model):
    file = models.FileField()


with open('/tmp/somefile', 'w') as f:
    f.write('asdfasdfasdf')
    f.flush()

with open('/tmp/anotherfile', 'w') as f:
    f.write('fsfoisufsiofdsoiufd')
    f.flush()

model_instance_with_upload_to = MyModelWithUploadTo()
model_instance_with_upload_to.file = File(open('/tmp/somefile', 'rb'))
model_instance_with_upload_to.save()
print('I saved the one with upload_to()\n\n')

model_instance = MyModel()
model_instance.file = File(open('/tmp/anotherfile', 'rb'))
model_instance.save()

On 2.2.20 when I makemigrations for those two models, apply them, and the run the reproducer I get:

I saved the one with upload_to()


Traceback (most recent call last):
  File "/home/vagrant/devel/django_file_name_2_2_21_reproducer.py", line 30, in <module>
    model_instance.save()
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/base.py", line 743, in save
    self.save_base(using=using, force_insert=force_insert,
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/base.py", line 780, in save_base
    updated = self._save_table(
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/base.py", line 873, in _save_table
    result = self._do_insert(cls._base_manager, using, fields, update_pk, raw)
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/base.py", line 910, in _do_insert
    return manager._insert([self], fields=fields, return_id=update_pk,
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/manager.py", line 82, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/query.py", line 1186, in _insert
    return query.get_compiler(using=using).execute_sql(return_id)
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/sql/compiler.py", line 1376, in execute_sql
    for sql, params in self.as_sql():
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/sql/compiler.py", line 1318, in as_sql
    value_rows = [
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/sql/compiler.py", line 1319, in <listcomp>
    [self.prepare_value(field, self.pre_save_val(field, obj)) for field in fields]
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/sql/compiler.py", line 1319, in <listcomp>
    [self.prepare_value(field, self.pre_save_val(field, obj)) for field in fields]
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/sql/compiler.py", line 1270, in pre_save_val
    return field.pre_save(obj, add=True)
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/fields/files.py", line 288, in pre_save
    file.save(file.name, file.file, save=False)
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/fields/files.py", line 87, in save
    self.name = self.storage.save(name, content, max_length=self.field.max_length)
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/core/files/storage.py", line 52, in save
    return self._save(name, content)
  File "/home/vagrant/devel/pulpcore/pulpcore/app/models/storage.py", line 46, in _save
    full_path = self.path(name)
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/core/files/storage.py", line 323, in path
    return safe_join(self.location, name)
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/utils/_os.py", line 44, in safe_join
    raise SuspiciousFileOperation(
django.core.exceptions.SuspiciousFileOperation: The joined path (/tmp/anotherfile) is located outside of the base path component (/var/lib/pulp/media)

Notice how the output includes the I saved the one with upload_to(). That shows this was working with 2.2.20.

Now run that same reproducer, migrations, etc against 2.2.21. I see:

Traceback (most recent call last):
  File "/home/vagrant/devel/django_file_name_2_2_21_reproducer.py", line 25, in <module>
    model_instance_with_upload_to.save()
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/base.py", line 743, in save
    self.save_base(using=using, force_insert=force_insert,
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/base.py", line 780, in save_base
    updated = self._save_table(
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/base.py", line 873, in _save_table
    result = self._do_insert(cls._base_manager, using, fields, update_pk, raw)
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/base.py", line 910, in _do_insert
    return manager._insert([self], fields=fields, return_id=update_pk,
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/manager.py", line 82, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/query.py", line 1186, in _insert
    return query.get_compiler(using=using).execute_sql(return_id)
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/sql/compiler.py", line 1376, in execute_sql
    for sql, params in self.as_sql():
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/sql/compiler.py", line 1318, in as_sql
    value_rows = [
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/sql/compiler.py", line 1319, in <listcomp>
    [self.prepare_value(field, self.pre_save_val(field, obj)) for field in fields]
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/sql/compiler.py", line 1319, in <listcomp>
    [self.prepare_value(field, self.pre_save_val(field, obj)) for field in fields]
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/sql/compiler.py", line 1270, in pre_save_val
    return field.pre_save(obj, add=True)
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/fields/files.py", line 289, in pre_save
    file.save(file.name, file.file, save=False)
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/fields/files.py", line 87, in save
    name = self.field.generate_filename(self.instance, name)
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/fields/files.py", line 303, in generate_filename
    filename = validate_file_name(filename)
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/core/files/utils.py", line 8, in validate_file_name
    raise SuspiciousFileOperation("File name '%s' includes path elements" % name)
django.core.exceptions.SuspiciousFileOperation: File name '/tmp/somefile' includes path elements

Here the SuspiciousFileOperation is also raised on the saving of MyModelWithUploadTo. Do you all know why this difference is significant?

Regarding non reproducer discussion...

Yes let's take how many folks are broken out of the equation; we have to do what's secure, even if its inconvenient. I agree with that, and thank you for saying that. I think what is contributing to the issue is confusion on why these 2 lines are necessary given how the CVE reads. I'm not here to second guess the good work of the security response team or it's review process, but I am confused.

As a practical matter, I'm still trying to figure out if another release removing these two lines will occur, or if the CVE description needs revision. Do you have some advice for me on which you think will happen?

Also I'd like to exchange perspectives on version pinning for shipped django apps depending on the Django LTS, but not until we resolve the matter at hand first.

Thank you for everything. Our project and users really appreciate it. Please let me know how we can help.

in reply to:  14 ; comment:22 by carderm, 3 years ago

Replying to Florian Apolloner:

Replying to Jakub Kleň:

Correct me if I'm wrong, but file-like objects always contain the full path to the file in the name attribute (the built-in Django File class even uses it to reopen the file if it was closed), and so it seems to be a bug in Django itself.

This is true, but according to my tests on earlier versions a full (absolute) path did fail already because it would usually be outside the MEDIA_ROOT. Can you provide some more details?

I believe an absolute path outside of Media Root should fail, that is correct (in my opinion) - as an absolute path might be able to write shell scripts around the file system...

It should however, work for a relative file path.

Hack example here

# Some model 
class FileModel(models.Model):
    file_field = models.FileField(_('UsersFile'), storage='users', max_length=1000)

model_inst = FileModel()

# Add some file
new_file = ContentFile(some_file.read())
new_file.name = f"{user.name}/file.txt"    # E.g. "steve/file.txt"   <<< This file should only be for 'steve' 
model_inst.file = new_file

# Save the Model
model_inst.save()    #<<< in Django 3.2.0 this creates file "MEDIA_ROOT/users/steve/file.txt   ! which is correct
model_inst.save()    #<<< in Django 3.2.1 this FAILS with Path issue ! Incorrect 
model_inst.save()    #<<< in Fix above [https://github.com/django/django/pull/14354/commits/1c78e83791163b034a7f1689673bff02f9969368 commit] will create "MEDIA_ROOT/users/file.txt ! which is incorrect - missing path!

# What we need to stop:
bad_file = ContentFile(malicious_script.read())
bad_file.name = "/etc/init.d/nameofscript.sh" 
# or 
bad_file.name = "../../../../../usr/local/lib/bad.file"
model_inst.file = bad_file

# On Save we need to protect here - imho
model_inst.save()     # <<< This *should Fail* with Security Error

Last edited 3 years ago by carderm (previous) (diff)

in reply to:  13 ; comment:23 by Phillip Marshall, 3 years ago

Replying to Florian Apolloner:

Can you please provide a reproducer and full traceback?

I hope this isn't too overboard, but I went ahead and copied out a chunk of my production code into a new project.

please look at the readme to see my full comments and traceback: https://github.com/wizpig64/django_32718_repro

disclaimer: i don't use upload_to strings, just upload_to functions, so i dont know how to account for those users.

thanks for your time and your hard work.

in reply to:  21 ; comment:24 by Florian Apolloner, 3 years ago

Replying to Brian Bouterse:

Here the SuspiciousFileOperation is also raised on the saving of MyModelWithUploadTo. Do you all know why this difference is significant?

This is perfect! The difference can be significant due to this: https://github.com/django/django/blob/c4ee3b208a2c95a5102b5e4fa789b10f8ee29b84/django/db/models/fields/files.py#L309-L322 -- This means when upload_to is set it is supposed to return the final filename (including a path relative path). Since you are just returning 'qqq' there it will be valid in 2.2.20. 2.2.21 validates the name beforehand and will break that. This really helps. Phillip had similar code (ie upload_to in a function as well).

As a practical matter, I'm still trying to figure out if another release removing these two lines will occur, or if the CVE description needs revision. Do you have some advice for me on which you think will happen?

There will be another release fixing this and the CVE will probably get adjusted to drop the sentence "Specifically, empty file names and paths with dot segments will be
rejected.". Does the wording make more sense for you then?

Thank you for everything. Our project and users really appreciate it. Please let me know how we can help.

Will do, testing will certainly help once we have a suitable PR :)

in reply to:  22 comment:25 by Florian Apolloner, 3 years ago

Replying to carderm:

model_inst.save()    #<<< in Django Latest Git commit will create "MEDIA_ROOT/users/file.txt ! which is incorrect - missing path!

I cannot reproduce this, this fails also with SuspiciousFileOperation: File name 'steve/file.txt' includes path elements for me. Please provide an actual reproducer.

in reply to:  23 comment:26 by Florian Apolloner, 3 years ago

Replying to Phillip Marshall:

please look at the readme to see my full comments and traceback: https://github.com/wizpig64/django_32718_repro

disclaimer: i don't use upload_to strings, just upload_to functions, so i dont know how to account for those users.

Perfect, this explains a lot! Your upload_to basically ignores the passed filename (aside from the extension). Now we have a check before you even get the filename which ensures that there is no full path.

in reply to:  22 comment:27 by Jakub Kleň, 3 years ago

Replying to carderm:

I believe an absolute path outside of Media Root should fail, that is correct (in my opinion) - as an absolute path might be able to write shell scripts around the file system...

It should however, work for a relative file path.

Absolute paths wouldn't be able to write shell scripts around the file system, because the path always starts with MEDIA_ROOT, and ../ and such are disallowed.

Also, I'm not sure if we should be forced to override the name attribute of File, because overriding it breaks the File.open function from working properly.

Last edited 3 years ago by Jakub Kleň (previous) (diff)

comment:28 by Jakub Kleň, 3 years ago

I'm still not completely sure if we should disallow absolute paths in the File. When I'm for example setting a file from /tmp, which I'm doing in my project, it would force me to override the File.name like this:

file = File(open('/tmp/image.png', 'rb'), name='image.png')
# which would then break:
file.close()
file.open()

I know that the open method is not called when using the File to update a model, but it doesn't seem to be the right thing to do.

Should we be using the path from File.name and append it to upload_to? In my project, I use a callable upload_to, which takes care of the path and filename, and only takes the extension of the original filename. Isn't that a better solution if we want a custom path that depends on the model instance?

def custom_upload_to(instance, filename):
    extension = os.path.splitext(filename)[1][1:].lower()
    actual_filename = 'image.' + extension
    return os.path.join('users', instance.username, actual_filename)

model.file = File(open('/tmp/sth.png', 'rb'))

My point here is, shouldn't we be using upload_to for the purpose of constructing a custom file path?

comment:29 by Mariusz Felisiak, 3 years ago

Cc: Markus Holtermann added
Has patch: unset
Needs documentation: unset
Needs tests: unset
Owner: changed from nobody to Mariusz Felisiak
Patch needs improvement: unset
Status: newassigned
Summary: [3.2.1] Issue with assigning file to FileFieldSaving a FileField raises SuspiciousFileOperation in some scenarios.

Thanks y'all for various test projects and detailed reports of encountered issue with saving FileFields.

After some discussions, we decided to prepare a patch for FileField.generate_filename() that should solve most of these issues. Please see the following points.

  1. If filename passed to the FileField.generate_filename() is an absolute path, it will be converted to the os.path.basename(filename).
  2. Validate filename returned by FileField.upload_to() not a filename passed to the FileField.generate_filename() (upload_to() may completely ignored passed filename).
  3. Allow relative paths (without dot segments) in the generated filename.

We're going to prepare a patch in the next few days.

in reply to:  29 comment:30 by Jakub Kleň, 3 years ago

Replying to Mariusz Felisiak:

Thanks y'all for various test projects and detailed reports of encountered issue with saving FileFields.

After some discussions, we decided to prepare a patch for FileField.generate_filename() that should solve most of these issues. Please see the following points.

  1. If filename passed to the FileField.generate_filename() is an absolute path, it will be converted to the os.path.basename(filename).
  2. Validate filename returned by FileField.upload_to() not a filename passed to the FileField.generate_filename() (upload_to() may completely ignored passed filename).
  3. Allow relative paths (without dot segments) in the generated filename.

We're going to prepare a patch in the next few days.

Thanks a lot Mariusz!

I was thinking about this for a while now, and it seems like a great solution which will both be backwards compatible and will also fit the use cases for all of us! It also takes my concerns into account, which I'm super happy about!

Thanks a lot for the great work all of you guys are doing for Django. Love the framework!

comment:31 by Ned Batchelder, 3 years ago

We're going to prepare a patch in the next few days.

The sooner the better! :) An Open edX release is hinging on this. We've always loved Django's stability and the trust we've had in the security patches.

in reply to:  24 comment:32 by Brian Bouterse, 3 years ago

Replying to Florian Apolloner:

There will be another release fixing this and the CVE will probably get adjusted to drop the sentence "Specifically, empty file names and paths with dot segments will be
rejected.". Does the wording make more sense for you then?

It does! Thank you so much for making this clearer.

Will do, testing will certainly help once we have a suitable PR :)

I can do that. The description of the plan in Comment 29 sounds great!

comment:33 by Mariusz Felisiak, 3 years ago

I added PR with the proposed solution. Please give it a spin.

  1. If filename passed to the FileField.generate_filename() is an absolute path, it will be converted to the os.path.basename(filename).

IMO this is not necessary.

comment:34 by Mariusz Felisiak, 3 years ago

Has patch: set

comment:35 by Brian Bouterse, 3 years ago

I tested the PR against our code, and it fully resolved our issues. Thank you all so much! https://github.com/django/django/pull/14372#issuecomment-836798374

comment:36 by GitHub <noreply@…>, 3 years ago

In d1f1417:

Refs #32718 -- Corrected CVE-2021-31542 release notes.

comment:37 by Mariusz Felisiak <felisiak.mariusz@…>, 3 years ago

In dc7b495d:

[3.2.x] Refs #32718 -- Corrected CVE-2021-31542 release notes.

Backport of d1f1417caed648db2f81a1ec28c47bf958c01958 from main

comment:38 by Mariusz Felisiak <felisiak.mariusz@…>, 3 years ago

In 9fb9944d:

[3.1.x] Refs #32718 -- Corrected CVE-2021-31542 release notes.

Backport of d1f1417caed648db2f81a1ec28c47bf958c01958 from main.

comment:39 by Mariusz Felisiak <felisiak.mariusz@…>, 3 years ago

In 3ba089a:

[2.2.x] Refs #32718 -- Corrected CVE-2021-31542 release notes.

Backport of d1f1417caed648db2f81a1ec28c47bf958c01958 from main.

in reply to:  33 comment:40 by Phillip Marshall, 3 years ago

Replying to Mariusz Felisiak:

I added PR with the proposed solution. Please give it a spin.

This patch resolves the issue for my project's use case as well. Thanks a lot!

comment:41 by GitHub <noreply@…>, 3 years ago

Resolution: fixed
Status: assignedclosed

In b5569996:

Fixed #32718 -- Relaxed file name validation in FileField.

  • Validate filename returned by FileField.upload_to() not a filename passed to the FileField.generate_filename() (upload_to() may completely ignored passed filename).
  • Allow relative paths (without dot segments) in the generated filename.

Thanks to Jakub Kleň for the report and review.
Thanks to all folks for checking this patch on existing projects.
Thanks Florian Apolloner and Markus Holtermann for the discussion and
implementation idea.

Regression in 0b79eb36915d178aef5c6a7bbce71b1e76d376d3.

comment:42 by Mariusz Felisiak <felisiak.mariusz@…>, 3 years ago

In b7d4a6fa:

[3.1.x] Fixed #32718 -- Relaxed file name validation in FileField.

  • Validate filename returned by FileField.upload_to() not a filename passed to the FileField.generate_filename() (upload_to() may completely ignored passed filename).
  • Allow relative paths (without dot segments) in the generated filename.

Thanks to Jakub Kleň for the report and review.
Thanks to all folks for checking this patch on existing projects.
Thanks Florian Apolloner and Markus Holtermann for the discussion and
implementation idea.

Regression in 0b79eb36915d178aef5c6a7bbce71b1e76d376d3.

Backport of b55699968fc9ee985384c64e37f6cc74a0a23683 from main.

comment:43 by Mariusz Felisiak <felisiak.mariusz@…>, 3 years ago

In b8ecb064:

[2.2.x] Fixed #32718 -- Relaxed file name validation in FileField.

  • Validate filename returned by FileField.upload_to() not a filename passed to the FileField.generate_filename() (upload_to() may completely ignored passed filename).
  • Allow relative paths (without dot segments) in the generated filename.

Thanks to Jakub Kleň for the report and review.
Thanks to all folks for checking this patch on existing projects.
Thanks Florian Apolloner and Markus Holtermann for the discussion and
implementation idea.

Regression in 0b79eb36915d178aef5c6a7bbce71b1e76d376d3.

Backport of b55699968fc9ee985384c64e37f6cc74a0a23683 from main.

comment:44 by Mariusz Felisiak <felisiak.mariusz@…>, 3 years ago

In 224b8e5a:

[3.2.x] Fixed #32718 -- Relaxed file name validation in FileField.

  • Validate filename returned by FileField.upload_to() not a filename passed to the FileField.generate_filename() (upload_to() may completely ignored passed filename).
  • Allow relative paths (without dot segments) in the generated filename.

Thanks to Jakub Kleň for the report and review.
Thanks to all folks for checking this patch on existing projects.
Thanks Florian Apolloner and Markus Holtermann for the discussion and
implementation idea.

Regression in 0b79eb36915d178aef5c6a7bbce71b1e76d376d3.
Backport of b55699968fc9ee985384c64e37f6cc74a0a23683 from main

comment:45 by Mariusz Felisiak <felisiak.mariusz@…>, 3 years ago

In 63f0d7a0:

[2.2.x] Refs #32718 -- Fixed file_storage.test_generate_filename and model_fields.test_filefield tests on Python 3.5.

Note: See TracTickets for help on using tickets.
Back to Top