#11158 closed (fixed)
get_image_dimensions very slow after 1 call
Reported by: | kua | Owned by: | SAn |
---|---|---|---|
Component: | Core (Other) | Version: | dev |
Severity: | Keywords: | get_image_dimensions, field, save, slow | |
Cc: | Triage Stage: | Ready for checkin | |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
get_image_dimensions in django.core.files.images is very slow after 1 call
This has the effect of slowing down calls to save on model image fields, the whole thing shudders to a halt while we spin around in PIL.Image.feed
The issue is that the read pointer is not being reset after the call to get_image_dimensions. Subsequent calls to file.read
To reproduce:
from django.core.files.images import get_image_dimensions
from django.core.files.uploadedfile import File
f = File(open('/tmp/some_multi_megabyte_file.jpg')
get_image_dimensions(f)
get_image_dimensions(f)
I suggest this function should save the read pointer pos, reset it to 0, then do all the PIL fancyness, then reset the pointer when finished. I've tried to attach a patch that reflects this
Imagine this scenario: Someone mucks with the read pointer on a hundred meg image, say reads 2 kilobytes. Then this function is called and starts reading data in 1K chunks looking for a header (which has already been skipped).
So 100 Megabytes - 2 kilobytes / 1024 bytes per read means this function will be called over 100,000 times
Attachments (6)
Change History (28)
comment:1 by , 15 years ago
Component: | Uncategorized → Core framework |
---|
by , 15 years ago
follow-up: 3 comment:2 by , 15 years ago
comment:3 by , 15 years ago
Also, a work around for the 1.0x, current, code base: fd.seek(0) anytime before you save that model which has an ImageField
comment:4 by , 15 years ago
Patch needs improvement: | set |
---|---|
Triage Stage: | Unreviewed → Accepted |
Patch is messy, but the ticket seems valid.
comment:5 by , 15 years ago
I added a new patch for this ticket.
I'm not sure what's going on in the svn trunk r10737, it appears that some underlying code in File etc has changed. But this issue depends purely on PIL so we would need a patch for the svn trunk as well..
comment:6 by , 15 years ago
Version: | 1.0 → SVN |
---|
comment:7 by , 15 years ago
I changed the Version to SVN because I just tested and this bug is present in the trunk.
I've attached a patch against trunk using the same solution in the previous patch
comment:8 by , 15 years ago
milestone: | → 1.1 |
---|
I wrote a testcase for it, but I don't know how to generate a image with PIL.
Putting a 40 MB file in the svn wouldn't be a good a idea.
from time import time from django.core.files.images import get_image_dimensions from django.core.files.uploadedfile import File f = File(open('/tmp/some_multi_megabyte_file.jpg') t1 = time() get_image_dimensions(f) t2 = time() run1_time = t2 - t1 t3 = time() get_image_dimensions(f) t4 = time() run2_time = t4 - t3 # The second run time should be in the same order of the first run. # So we check if the second runtime is not more then one order of # magnitude off. assert(run2_time < 10 * run1_time)
A quick solution is using:
get_image_dimensions(f.name)
A better way to fix this is having an exposed classfunction
to call get_image_dimensions, since django.core.files.image.ImageFile
already has a self._dimensions_cache.
comment:10 by , 15 years ago
milestone: | → 1.2 |
---|
Can wait for 1.2.
Use the ImageFile.width and ImageFile.height.
from django.core.files.images import ImageFile im = ImageFile(open('/tmp/some_multi_megabyte_file.jpg') im.width im.height
by , 15 years ago
Attachment: | 11158.diff added |
---|
comment:12 by , 15 years ago
Needs tests: | set |
---|---|
Owner: | changed from | to
by , 15 years ago
Attachment: | 11158-combined.diff added |
---|
Merged SmilyChris's patch and SAn's tests into a single patch.
comment:13 by , 15 years ago
Needs tests: | unset |
---|
comment:14 by , 15 years ago
I think that the patch is ok and marker patch_need_improvement = 1 is outdated.
comment:15 by , 15 years ago
Patch needs improvement: | unset |
---|
comment:16 by , 15 years ago
Triage Stage: | Accepted → Ready for checkin |
---|
follow-up: 18 comment:17 by , 15 years ago
milestone: | 1.2 → 1.3 |
---|
Ready for checkin, but not critical for 1.2.
comment:18 by , 15 years ago
Replying to russellm:
Ready for checkin, but not critical for 1.2.
rusellm, the problem here is not only that it is slow to get the image dimensions, the problem is that it returns None after one call. I am sure it's not critical but if you expect (x, y) and you get None you will be bitten.
comment:19 by , 15 years ago
@SAn: I'm not denying that it is a problem. I'm just denying that it is a problem that is a blocker for 1.2. This isn't a regression of behaviour in 1.2, and it doesn't cause catastrophic data loss; as such, it doesn't meet the criteria for a release critical bug.
comment:21 by , 14 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
In [110]: f.seek(0)
In [111]: get_image_dimensions(f)
#very quick/snappy
Out[111]: (4000, 3000)
In [112]: f.seek(1)
In [113]: get_image_dimensions(f)
#very slow ~30 seconds, and returns none
In [114]: