Opened 2 years ago

Last modified 10 months ago

#29510 assigned Bug

QueryDict.copy() returns closed files when the type of file is TemporaryUploadedFile

Reported by: Liquid Scorpio Owned by: Paulo
Component: File uploads/storage Version: 1.11
Severity: Normal Keywords: QueryDict, upload, file
Cc: Jeff, Herbert Fortes Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by Liquid Scorpio)

When uploaded file size is greater than FILE_UPLOAD_MAX_MEMORY_SIZE, Django uses TemporaryUploadedFile to represent the file object. However, when executing .copy() on a QueryDict containing such a file, the returned object has the file but it is in closed state (seekable() is False).

Expected: File should be present in open state (seekable() should be True)

Below is a reproducible example and also contains version details:

Python 2.7.12 (default, Dec  4 2017, 14:50:18) 
In [18]: import django

In [19]: django.VERSION
Out[19]: (1, 11, 11, u'final', 0)

In [20]: from django.http.request import QueryDict

In [21]: from django.core.files.uploadedfile import TemporaryUploadedFile

In [22]: d = QueryDict(mutable=True)

In [23]: f = TemporaryUploadedFile('test.jpg', 'image/jpeg', 100, 'utf-8')

In [24]: f.seekable()
Out[24]: True

In [25]: d.appendlist('image', f)

In [26]: d['image'].seekable()
Out[25]: True

In [27]: c = d.copy()

In [28]: c['image'].seekable()
Out[28]: False

In [30]: 

Change History (10)

comment:1 Changed 2 years ago by Tim Graham

Component: UncategorizedFile uploads/storage
Triage Stage: UnreviewedAccepted

I can reproduce with Python 2, even if I'm unsure that the behavior is incorrect:

>>> from django.core.files.uploadedfile import TemporaryUploadedFile
>>> from copy import deepcopy
>>> f = TemporaryUploadedFile('test.jpg', 'image/jpeg', 100, 'utf-8')
f.seekable()
True
a = deepcopy(f)
>>> a.seekable()
False

However, the deepcopy() crashes on Python 3 with:
TypeError: cannot serialize '_io.BufferedRandom' object

Accepting for further investigation.

comment:2 Changed 2 years ago by Liquid Scorpio

Description: modified (diff)

comment:3 Changed 2 years ago by Jeff

Cc: Jeff added
Owner: changed from nobody to Jeff
Status: newassigned

comment:4 Changed 2 years ago by Jeff

This is not the reported issue, however:

In python 3, the copy.deepcopy function is not able to copy TemporaryUploadedFiles because python's tempfile.TemporaryFile is not serializable. I'm not certain how serious the issue is, or if we want to fix it, but I believe this makes QueryDicts unable to be copied if it contains a TemporaryUploadedFile obj.

comment:5 Changed 2 years ago by Jeff

Owner: Jeff deleted
Status: assignednew

comment:6 Changed 2 years ago by Herbert Fortes

Cc: Herbert Fortes added

comment:7 Changed 10 months ago by Paulo

Owner: set to Paulo
Status: newassigned

comment:8 Changed 10 months ago by Paulo

The underlying issue here is a change in Python attempting to pickle the file.
On python 2.7, I can reproduce the reported issue:

In [19]: file = tempfile.NamedTemporaryFile(suffix='.upload')

In [20]: copy.deepcopy(file)
Out[20]: <closed file '<uninitialized file>', mode '<uninitialized file>' at 0x7f45f2d0b420>

On Python 3.6 I can't reproduce the original issue, instead I get Tim's issue:

In [17]: file = tempfile.NamedTemporaryFile(suffix='.upload')

In [18]: copy.deepcopy(file)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-18-36f4ecfb14c1> in <module>
----> 1 copy.deepcopy(file)

/usr/local/lib/python3.6/copy.py in deepcopy(x, memo, _nil)
    167                     reductor = getattr(x, "__reduce_ex__", None)
    168                     if reductor:
--> 169                         rv = reductor(4)
    170                     else:
    171                         reductor = getattr(x, "__reduce__", None)

/usr/local/lib/python3.6/tempfile.py in func_wrapper(*args, **kwargs)
    483             @_functools.wraps(func)
    484             def func_wrapper(*args, **kwargs):
--> 485                 return func(*args, **kwargs)
    486             # Avoid closing the file as long as the wrapper is alive,
    487             # see issue #18879.

TypeError: cannot serialize '_io.BufferedRandom' object

Personally I think the 2.7 issue is not as bad as the 3.x since it at least allows for the copy to proceed (while silently closing the file).
My suggestion would be to preemptively check that the objects being copied are serializable or to catch the TypeError and raise a different error indicating the user that the query dict contains a non-seralizable value (like file, etc..)

comment:9 Changed 10 months ago by Claude Paroz

Did you explore possible handling of such file objects in QueryDict.__deepcopy__?

comment:10 in reply to:  9 Changed 10 months ago by Paulo

Replying to Claude Paroz:

Did you explore possible handling of such file objects in QueryDict.__deepcopy__?

To keep the file on copy?
If so then yes, we can check value for UploadedFile and create a new instance that points to the same underlying file (although I guess that's more shallow than deep copy).
To have fully deep copy we'd have to create new tmp if is on disk or somehow copy the file stream.
That said, I think "soft" copy should be enough and is more expected behavior for the file itself.
My concern with this approach was if we have any other non file values that are not serializable, would this open the door to having to handle those too?
Guess not since there's only so many types that are handled by QueryDict.

Note: See TracTickets for help on using tickets.
Back to Top