Opened 8 years ago

Closed 6 years ago

Last modified 5 years ago

#11158 closed (fixed)

get_image_dimensions very slow after 1 call

Reported by: kua Owned by: SAn
Component: Core (Other) Version: master
Severity: Keywords: get_image_dimensions, field, save, slow
Cc: Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: UI/UX:

Description

get_image_dimensions in django.core.files.images is very slow after 1 call

This has the effect of slowing down calls to save on model image fields, the whole thing shudders to a halt while we spin around in PIL.Image.feed

The issue is that the read pointer is not being reset after the call to get_image_dimensions. Subsequent calls to file.read

To reproduce:

from django.core.files.images import get_image_dimensions
from django.core.files.uploadedfile import File

f = File(open('/tmp/some_multi_megabyte_file.jpg')
get_image_dimensions(f)
get_image_dimensions(f)

I suggest this function should save the read pointer pos, reset it to 0, then do all the PIL fancyness, then reset the pointer when finished. I've tried to attach a patch that reflects this

Imagine this scenario: Someone mucks with the read pointer on a hundred meg image, say reads 2 kilobytes. Then this function is called and starts reading data in 1K chunks looking for a header (which has already been skipped).

So 100 Megabytes - 2 kilobytes / 1024 bytes per read means this function will be called over 100,000 times

Attachments (6)

patch (1.7 KB) - added by kua 8 years ago.
patch.diff (938 bytes) - added by kua 7 years ago.
updated patch for this bug
svn_patch.diff (1.2 KB) - added by kua 7 years ago.
a patch for the trunk
11158.diff (1.3 KB) - added by Chris Beaven 7 years ago.
11158_test.diff (1.1 KB) - added by SAn 7 years ago.
test case
11158-combined.diff (2.4 KB) - added by jkocherhans 7 years ago.
Merged SmilyChris's patch and SAn's tests into a single patch.

Download all attachments as: .zip

Change History (28)

comment:1 Changed 8 years ago by kua

Component: UncategorizedCore framework

Changed 8 years ago by kua

Attachment: patch added

comment:2 Changed 8 years ago by kua

In [110]: f.seek(0)

In [111]: get_image_dimensions(f)
#very quick/snappy
Out[111]: (4000, 3000)

In [112]: f.seek(1)

In [113]: get_image_dimensions(f)
#very slow ~30 seconds, and returns none
In [114]:

comment:3 in reply to:  2 Changed 8 years ago by kua

Also, a work around for the 1.0x, current, code base: fd.seek(0) anytime before you save that model which has an ImageField

comment:4 Changed 7 years ago by Chris Beaven

Patch needs improvement: set
Triage Stage: UnreviewedAccepted

Patch is messy, but the ticket seems valid.

Changed 7 years ago by kua

Attachment: patch.diff added

updated patch for this bug

comment:5 Changed 7 years ago by kua

I added a new patch for this ticket.

I'm not sure what's going on in the svn trunk r10737, it appears that some underlying code in File etc has changed. But this issue depends purely on PIL so we would need a patch for the svn trunk as well..

Changed 7 years ago by kua

Attachment: svn_patch.diff added

a patch for the trunk

comment:6 Changed 7 years ago by kua

Version: 1.0SVN

comment:7 Changed 7 years ago by kua

I changed the Version to SVN because I just tested and this bug is present in the trunk.

I've attached a patch against trunk using the same solution in the previous patch

comment:8 Changed 7 years ago by rvanlaar

milestone: 1.1

I wrote a testcase for it, but I don't know how to generate a image with PIL.
Putting a 40 MB file in the svn wouldn't be a good a idea.

from time import time
from django.core.files.images import get_image_dimensions
from django.core.files.uploadedfile import File

f = File(open('/tmp/some_multi_megabyte_file.jpg')
t1 = time()
get_image_dimensions(f)
t2 = time()
run1_time = t2 - t1
t3 = time()
get_image_dimensions(f)
t4 = time()
run2_time = t4 - t3

# The second run time should be in the same order of the first run.
# So we check if the second runtime is not more then one order of 
# magnitude off.

assert(run2_time < 10 * run1_time)

A quick solution is using:

get_image_dimensions(f.name)

A better way to fix this is having an exposed classfunction
to call get_image_dimensions, since django.core.files.image.ImageFile
already has a self._dimensions_cache.

comment:9 Changed 7 years ago by rvanlaar

milestone: 1.1

Should be fixed for 1.1.1

comment:10 Changed 7 years ago by rvanlaar

milestone: 1.2

Can wait for 1.2.

Use the ImageFile.width and ImageFile.height.

from django.core.files.images import ImageFile
im = ImageFile(open('/tmp/some_multi_megabyte_file.jpg')
im.width
im.height

Changed 7 years ago by Chris Beaven

Attachment: 11158.diff added

comment:11 Changed 7 years ago by Chris Beaven

Simpler patch, same effect.

comment:12 Changed 7 years ago by SAn

Needs tests: set
Owner: changed from nobody to SAn

Changed 7 years ago by SAn

Attachment: 11158_test.diff added

test case

Changed 7 years ago by jkocherhans

Attachment: 11158-combined.diff added

Merged SmilyChris's patch and SAn's tests into a single patch.

comment:13 Changed 7 years ago by jkocherhans

Needs tests: unset

comment:14 Changed 7 years ago by SAn

I think that the patch is ok and marker patch_need_improvement = 1 is outdated.

comment:15 Changed 7 years ago by Russell Keith-Magee

Patch needs improvement: unset

comment:16 Changed 7 years ago by Russell Keith-Magee

Triage Stage: AcceptedReady for checkin

comment:17 Changed 7 years ago by Russell Keith-Magee

milestone: 1.21.3

Ready for checkin, but not critical for 1.2.

comment:18 in reply to:  17 Changed 7 years ago by SAn

Replying to russellm:

Ready for checkin, but not critical for 1.2.

rusellm, the problem here is not only that it is slow to get the image dimensions, the problem is that it returns None after one call. I am sure it's not critical but if you expect (x, y) and you get None you will be bitten.

comment:19 Changed 7 years ago by Russell Keith-Magee

@SAn: I'm not denying that it is a problem. I'm just denying that it is a problem that is a blocker for 1.2. This isn't a regression of behaviour in 1.2, and it doesn't cause catastrophic data loss; as such, it doesn't meet the criteria for a release critical bug.

comment:20 Changed 6 years ago by Luke Plant

Fixed in [13715]. Don't know why it wasn't automatically closed.

comment:21 Changed 6 years ago by Luke Plant

Resolution: fixed
Status: newclosed

(In [13716]) [1.2.X] Fixed #11158 - get_image_dimensions very slow/incorrect after 1 call

Thanks to kua for the report, and to kua, SmileyChris and SAn for the patch

Backport of [13715] from trunk

comment:22 Changed 5 years ago by Jacob

milestone: 1.3

Milestone 1.3 deleted

Note: See TracTickets for help on using tickets.
Back to Top