Opened 16 years ago
Closed 16 years ago
#7593 closed (fixed)
Temporary uploaded file API (TemporaryFile) is inadequate
Reported by: | vomjom | Owned by: | Jacob |
---|---|---|---|
Component: | Uncategorized | Version: | dev |
Severity: | Keywords: | ||
Cc: | vomjom@…, mike@… | Triage Stage: | Accepted |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
The temporary file is not properly deleted much of the time for me. There's already a known issue where if you kill/restart apache, the temporary files will stay around forever (at least until system reboot), but if you also fork a process and pass the files to it, they stick around. I don't understand why there's a TemporaryFile class inside of uploadhandler.py when it could just use tempfile.TemporaryFile (which unlinks the file immediately after creation, rather than at garbage collection).
The other issue with the uploadhandler.TemporaryFile class is that it's not iterable (it defines iter via its getattr, but that doesn't work), so it doesn't act like a real file object. This is an issue if you try to use some python libraries with the uploaded file, e.g. csv. This again would be fixed if tempfile.TemporaryFile was used.
Also, it's a bit strange that the UploadedFile API doesn't offer direct access to the file object. It seems to me that a very common use-case for upload handling is to process the file using python's standard library. Two examples I currently use in my program are zipfile and csv. Both require access to the underlying file object. It seems especially strange because the UploadedFile has a TemporaryFile object that it could return (although as I mentioned above, it is inadequate as a replacement for a file object). When handling and processing huge uploads, it doesn't seem right to force the user to chunk the UploadedFile to a new file (which much of the time will be temporary itself) to be able to open it in a standard python libs.
Attachments (2)
Change History (15)
comment:1 by , 16 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
Triage Stage: | Unreviewed → Accepted |
comment:2 by , 16 years ago
Summary: | Patch 2070 issues → Temporary uploaded file API (TemporaryFile) is inadequate |
---|
comment:3 by , 16 years ago
Has patch: | set |
---|
comment:4 by , 16 years ago
Cc: | added |
---|
by , 16 years ago
Attachment: | 7814_cleanups.2.diff added |
---|
Overall cleanups for r7814...addressing all API changes that I know of.
comment:5 by , 16 years ago
Thanks!
It looks like the patch solved my issues, except I wasn't able to completely test it.
There are a few issues in the patch that I had to resolve:
In uploadedfile.py, you have:
if settings.FILE_UPLOAD_TEMP_DIR:
It will fail with 'module' object has no attribute 'FILE_UPLOAD_TEMP_DIR'. You should probably do getattr(settings, 'FILE_UPLOAD_TEMP_DIR', None) instead.
I think the same applies to settings.FILE_UPLOAD_MAX_MEMORY_SIZE
I also get an error for temporary files
217. obj.__class__ = type('TemporaryUploadedFile', (obj.__class__, UploadedFile), obj.__dict__) Exception Type: TypeError at /db/job/2/run/ Exception Value: __class__ must be set to a class
I couldn't figure out how to resolve this.
In the code
return InMemoryUploadedFile( file = self.file, field_name = self.field_name, file_name = self.file_name, content_type = self.content_type, file_size = file_size, charset = self.charset )
file_name should be name and file_size should be size, otherwise it errors out.
comment:6 by , 16 years ago
Both FILE_UPLOAD_*
settings constants exist in django/conf/global_settings.py
. So you shouldn't ever see "module has no attribute" errors unless your Django installation isn't correct or you're manually configuring settings and didn't pass in a default (in which case the error is in your code, not Django's).
follow-up: 8 comment:7 by , 16 years ago
Well, then the specific problem is that django.core.files.uploadedfile.py does:
import settings
Rather than
from django.conf import settings
follow-up: 9 comment:8 by , 16 years ago
Replying to Jonathan Hseu <vomjom@gmail.com>:
Well, then the specific problem is that django.core.files.uploadedfile.py does:
import settingsRather than
from django.conf import settings
I noticed this too a couple days ago, but unfortunately haven't been in internet contact.
As for the type() usage, I noticed that the methods dictionary should be {
}, but that shouldn't cause the problem you're seeing... what version of Python are you using?
comment:9 by , 16 years ago
Replying to axiak:
As for the type() usage, I noticed that the methods dictionary should be
{
}, but that shouldn't cause the problem you're seeing... what version of Python are you using?
2.5.2. I tried reproducing it, and this is what I've found:
>>> class A(object): ... pass ... >>> a = A() >>> a.__class__ = type('foo', (object,), {}) >>> a <__main__.foo object at 0x7f5d3e62e4d0> >>> class B: ... pass ... >>> b = B() >>> b.__class__ = type('bar', (object,), {}) Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: __class__ must be set to a class
So I imagine the problem is:
>>> import tempfile >>> f = tempfile.NamedTemporaryFile(mode='w', suffix='.upload') >>> f.__class__ <class tempfile._TemporaryFileWrapper at 0x7f414f12b1d0>
And in my tempfile.py, _TemporaryFileWrapper has no parent class.
comment:10 by , 16 years ago
Resolution: | → duplicate |
---|---|
Status: | assigned → closed |
This is being fixed with the same patch as #7614, so I'm marking this as a duplicate. It's not, exactly, but it'll get fixed at the same time.
comment:11 by , 16 years ago
(In [7859]) Fixed #7614: the quickening has come, and there now is only one UploadedFile. On top of that, UploadedFile's interface has been improved:
- The API now more closely matches a proper file API. This unfortunately means a few backwards-incompatible renamings; see BackwardsIncompatibleChanges. This refs #7593.
- While we were at it, renamed chunk() to chunks() to clarify that it's an iterator.
- Temporary uploaded files now property use the tempfile library behind the scenes which should ensure better cleanup of tempfiles (refs #7593 again).
Thanks to Mike Axiak for the bulk of this patch.
comment:12 by , 16 years ago
Resolution: | duplicate |
---|---|
Status: | closed → reopened |
Thanks Mike for the new SVN patch. It works great except for a few minor things:
- Please put a tell() function in TemporaryUploadedFile
- Please also map the file functions to the StringIOs in InMemoryUploadedFile and SimpleUploadedFile.
I currently have an awkward situation where my project works well if the file is large enough (and so goes to a temporary file), but fails on small files (and becomes an InMemoryUploadedFile). It also fails in my unit tests (SimpleUploadedFile).
comment:13 by , 16 years ago
Resolution: | → fixed |
---|---|
Status: | reopened → closed |
#7769 is tracking these improvements. Jonathan, in general, please don't reopen fixed tickets to add more patches; new tickets are far preferred.
Jonathan,
Since the
UploadedFile
objects aren't necessarily files on the file system, we cannot just provide all of the methods that are available to the same object asopen('myfile', 'r')
.However, that doesn't mean we can't try to provide as much as possible. It's just that the more we provide, the more limiting we are on what
UploadedFile
can do -- it's a balancing act.In this attached patch, the following changes are made, which will hopefully help what you are trying to accomplish:
TemporaryUploadFile
inherits from bothUploadedFile
and the class of the object thattempfile.NamedTemporaryFile()
returns.UploadedFile
objects now support__iter__()
,readlines()
, andxreadlines()
.In addition to this, the patch also updates some of the API and attempts to fix #7614.
Please try this patch and let me know if it solves any issue you are having.
Cheers,
Mike