Opened 10 years ago
Last modified 3 days ago
#27150 assigned Cleanup/optimization
Wrong File object behavior if file object has no name attribute
| Reported by: | Sergei Zh | Owned by: | VIZZARD-X |
|---|---|---|---|
| Component: | File uploads/storage | Version: | 1.10 |
| Severity: | Normal | Keywords: | |
| Cc: | Triage Stage: | Accepted | |
| Has patch: | yes | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | yes |
| Easy pickings: | no | UI/UX: | no |
Description
django.core.files.base.File supports file-like objects (synonym to file object in python glossary) and in fact can't rely on additional name attribute, but in current implementation of __bool__ method it does:
def __bool__(self):
return bool(self.name)
For example, until tempfile.SpooledTemporaryFile isn't large enough to have been written to disk, his name attribute is None and if statement will return False.
import tempfile
from django.core.files.base import File
python_file_like_object = tempfile.SpooledTemporaryFile()
file_obj = File(python_file_like_object) # name keyword is None by default
if file_object: # Falsy, because python_file_like_object.name is None
print("42")
The same behavior with streams (BytesIO, StringIO).
Maybe __bool__ method should rely on self.file attribute to avoid this behavior?
Change History (18)
comment:1 by , 10 years ago
comment:2 by , 10 years ago
Hi.
Sure, it makes sense. I'll try to investigate failed tests ASAP. Think it's the best place to begin.
comment:3 by , 10 years ago
| Component: | Core (Other) → Documentation |
|---|---|
| Summary: | Wrong File object behavior if file object has no name attribute → Document that a name should be provided when wrapping file-like objects that don't have one with File |
| Triage Stage: | Unreviewed → Accepted |
| Type: | Bug → Cleanup/optimization |
See #26495 for a case where this came up. Unless investigation yields otherwise, I think documenting that you should pass a name for file-like objects that don't have one is the best course of action. I'm not sure if raising an error for that case might be acceptable (there might be use cases where providing name isn't needed) but if so, they would likely be even more helpful.
comment:4 by , 10 years ago
As I understood from #26495 python file-like objects works fine with requests library, because in CPython file objects use object.__bool__ and always returns True if no __bool__ and __len__.
I think documenting that you should pass a name for file-like objects that don't have one is the best course of action.
It's little bit strange for me to allow pass name=None to Storage.save, while disallow use empty name for file-like objects. Maybe we can restore default __bool__ behavior in Storage.save method?
content.__bool__ = lambda _: True
comment:5 by , 10 years ago
I've realized it's wrong, we should delete __bool__ method, cause default implementation depends on __len__ method.
comment:6 by , 10 years ago
I haven't looked into the issue so I can't propose an alternative, however, I'm not sure that patching methods like that is a good practice. At least, I'm not aware of anywhere else in Django that uses that pattern.
comment:8 by , 4 months ago
PR submitted: https://github.com/django/django/pull/20327
Updated the documentation to clarify that wrapping file-like objects
without a name attribute requires explicitly providing one. The docs
now build cleanly, including lint and spelling checks.
comment:9 by , 4 months ago
| Has patch: | set |
|---|
comment:11 by , 2 weeks ago
| Triage Stage: | Accepted → Unreviewed |
|---|
A 10 year old ticket that never really reached consensus may not be a good ticket to pick up and introduce new wording to the docs.
Reading between the lines it feels like this is one where someone needs to do a bit of work investigating deeper. Defining a base File class and then giving it a __bool__ method that relies on a name seems like a break in the abstraction.
I amended the __bool__ call to log its usage and ran the test suite, and then did some searching to see what other cases I could see. I think this is pretty comprehensive. I've given the intent of the calling code, why does it want to know if the file is true, and what type of file is it working with currently.
| Location | Calls | Intent | Instance(s) |
fields/files.py:336 pre_save | 847 | Skip committing the file if the field is empty (name is the DB value; empty name means nothing to save) | FieldFile |
fields/files.py:42 _require_file | 85 | Guard: raise ValueError if no file is associated with this field before attempting file operations | FieldFile |
http/multipartparser.py:374 handle_file_complete | 59 | Check whether an upload handler returned a completed file object (vs None meaning not yet done) * | InMemoryUploadedFile, TemporaryUploadedFile |
forms/fields.py:696 FileField.clean | 37 | Fall back to the initial value if no new file was uploaded | FieldFile, SimpleUploadedFile |
forms/widgets.py:552 ClearableFileInput.is_initial | 24 | Determine whether the current value is an existing stored file (to decide whether to render the "currently" link) | FieldFile, SimpleUploadedFile |
fields/files.py:368 save_form_data | 11 | Use empty string if no file was provided (data or "") to avoid storing False in the DB column | FieldFile, SimpleUploadedFile |
fields/files.py:109 delete | 9 | Early return if no file is associated with the field — nothing to delete | FieldFile |
forms/widgets.py:521 FileInput.use_required_attribute | 5 | Suppress the HTML required attribute if a file is already set as the initial value | FieldFile |
forms/widgets.py:585 ClearableFileInput.value_from_datadict | 2 | Detect contradiction: user uploaded a new file AND checked the clear checkbox simultaneously | SimpleUploadedFile |
contrib/admin/utils.py:463 display_for_field | 2 | Only render a hyperlink to the file if a file is actually set on the field | FieldFile |
fields/files.py:500 update_dimension_fields | 0 | Skip reading image dimensions if no file is associated with the ImageField | FieldFile |
fields/files.py:518 update_dimension_fields | 0 | Read width/height from the image if a file is set; clear dimension fields to None otherwise | FieldFile |
core/files/base.py:23 File.__repr__ | 0 | Presentational: display the file's name in repr output, or "None" if no name is set
|
In reality the cases are checking either a FieldFile or a subclass of UploadedFile, in those cases it makes sense to check the name because if you have no name, then the row in the db is empty or the uploadedfile didn't work. We want to keep that behaviour.
But the default File looks like a thin wrapper around a python File, and the docs support this https://docs.djangoproject.com/en/6.0/ref/files/file/#django.core.files.File and have examples of constructing a file without a name, which makes sense for a thin wrapper, but results in a False file https://docs.djangoproject.com/en/6.0/topics/files/#the-file-object and you get bugs like #26495
So what I think we should do is have __bool__ on File return True, which would make sense for a wrapper class, files are kind of truthy by default in python, it's what ContentFile does already. But then add __bool__ methods for FieldFile and UploadedFile that rely on the name. This preserves the behaviour that most of the code is working with and makes sense for those classes, no name = not a file = no need to check the underlying file object.
That leaves that __repr__ method, which should probably be updated to return "<%s: %s>" % (self.__class__.__name__, self.name or "None")
The docs don't really then need updating at all, you could make a case for documenting the __bool__ behaviour of FieldFile and UploadedFile but I think you'd be fine without it.
I think we should close and pause the current PR and instead I've moved this ticket back to unreviewed, so a member of the triage team can have a think about it.
* The intent in handle_file_complete is genuinely is not None, the upload handler API documents that handlers return None to signal "not done yet". This case works incidentally, and would continue to work with the __bool__ changes. Maybe that line could be if file_obj is not None: as a separate/additional cleanup.
comment:12 by , 2 weeks ago
| Component: | Documentation → Core (Other) |
|---|---|
| Summary: | Document that a name should be provided when wrapping file-like objects that don't have one with File → Wrong File object behavior if file object has no name attribute |
comment:13 by , 11 days ago
| Component: | Core (Other) → File uploads/storage |
|---|---|
| Triage Stage: | Unreviewed → Accepted |
Marvelous work, James. I find your proposal in comment:11 convincing. I'm not certain whether we should document the change as a minor backward-incompatible change, on account of an application subclassing the thin File wrapper. We would be changing the semantic.
comment:14 by , 11 days ago
| Owner: | changed from to |
|---|---|
| Status: | new → assigned |
comment:15 by , 9 days ago
| Patch needs improvement: | set |
|---|
comment:16 by , 4 days ago
| Patch needs improvement: | unset |
|---|
comment:17 by , 4 days ago
| Triage Stage: | Accepted → Ready for checkin |
|---|
comment:18 by , 3 days ago
| Patch needs improvement: | set |
|---|---|
| Triage Stage: | Ready for checkin → Accepted |
Hi,
As far as I can tell,
File.__bool__is not documented but it does seem to be used internally and is somewhat tested (though maybe not directly): changing it toreturn Falseyields quite a few errors and failures in the test suite.Because of this lack of documentation, it's hard to tell if the behavior you're describing is a bug.
I think it'd be worth it to:
1) Document
File.__bool__(both in the reference docs and in the docstring)2) Figure out if we can improve it by handling underlying files with no names as you suggest.
What do you think?