Opened 11 years ago

Closed 9 years ago

#20620 closed Bug (worksforme)

CachedFileMixin.post_process breaks when cache size is exceeded

Reported by: julians37@… Owned by: jcatalan
Component: contrib.staticfiles Version: 1.5
Severity: Normal Keywords:
Cc: julians37@… Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Using staticfiles with STATICFILES_STORAGE = 'django.contrib.staticfiles.storage.CachedStaticFilesStorage', running collectstatic without the --no-post-process flag can break if the number of static files exceeds settings.CACHES['staticfiles']['OPTIONS']['MAX_ENTRIES'] (i.e. 300 files in the default configuration), with an unhelpful error message:

ValueError: The file 'foo' could not be found with <django.contrib.staticfiles.storage.CachedStaticFilesStorage object at 0x94f654c>.

(Full stack trace below.)

The reason is that with more than MAX_ENTRIES files, some files might be evicted from the cache at some point before they are referenced by the post-processing code.

The workaround is to increase MAX_ENTRIES to a value larger than the number of static files.
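As a rough sketch of that workaround, the settings fragment below raises the limit on a dedicated 'staticfiles' cache. The backend choice, the LOCATION path and the value 10000 are assumptions for illustration only; pick a value larger than your own number of static files.

```python
# settings.py fragment (illustrative; LOCATION and MAX_ENTRIES are assumed values)
STATICFILES_STORAGE = 'django.contrib.staticfiles.storage.CachedStaticFilesStorage'

CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.locmem.LocMemCache',
    },
    # CachedStaticFilesStorage uses the 'staticfiles' alias when it is defined.
    'staticfiles': {
        'BACKEND': 'django.core.cache.backends.filebased.FileBasedCache',
        'LOCATION': '/tmp/django_staticfiles_cache',  # hypothetical path
        'OPTIONS': {
            # Default is 300; keep this above the number of static files.
            'MAX_ENTRIES': 10000,
        },
    },
}
```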

I believe this can be reproduced fairly easily by setting MAX_ENTRIES to 1 and having a bunch of static files that reference each other. (I think this bug only kicks in when url_converter has enough work to do, because that's where additional cache entries are created, so just dumping a bunch of empty files into the static directory won't do.)
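A hypothetical fixture for such a reproduction attempt could look like the sketch below: it writes a chain of CSS files in which each file references the next one through url(), so that post-processing has references to rewrite. The helper name, the file names and the count are made up for illustration, and this is not confirmed to trigger the error.

```python
import os
import tempfile


def make_static_fixture(root, count=5):
    """Write `count` CSS files; each one (except the last) references the next."""
    os.makedirs(root, exist_ok=True)
    for i in range(count):
        path = os.path.join(root, "file%d.css" % i)
        with open(path, "w") as fh:
            if i < count - 1:
                # A url() reference gives the post-processing converter work to do.
                fh.write('body { background: url("file%d.css"); }\n' % (i + 1))
            else:
                fh.write("body { color: black; }\n")
    return sorted(os.listdir(root))


fixture_dir = tempfile.mkdtemp()
created = make_static_fixture(fixture_dir, count=5)
```

Pointing STATICFILES_DIRS at the generated directory and running collectstatic with MAX_ENTRIES set to 1 would then exercise the scenario described above.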

It would be nice if the post-processing code (ideally) used a different cache that never evicts items, or (less ideally) provided a more helpful error message when the limit is reached. Perhaps the easiest fix would be to guard the invocation of self.cache.set in CachedFileMixin.url to ensure that the cache still has capacity, but I'm not sure this is the correct or best fix for the issue.

stderr: Traceback (most recent call last):
  File "manage.py", line 10, in <module>
    execute_from_command_line(sys.argv)
  File "/usr/local/lib/python2.7/dist-packages/django/core/management/__init__.py", line 443, in execute_from_command_line
    utility.execute()
  File "/usr/local/lib/python2.7/dist-packages/django/core/management/__init__.py", line 382, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/usr/local/lib/python2.7/dist-packages/django/core/management/base.py", line 196, in run_from_argv
    self.execute(*args, **options.__dict__)
  File "/usr/local/lib/python2.7/dist-packages/django/core/management/base.py", line 232, in execute
    output = self.handle(*args, **options)
  File "/usr/local/lib/python2.7/dist-packages/django/core/management/base.py", line 371, in handle
    return self.handle_noargs(**options)
  File "/usr/local/lib/python2.7/dist-packages/django/contrib/staticfiles/management/commands/collectstatic.py", line 163, in handle_noargs
    collected = self.collect()
  File "/usr/local/lib/python2.7/dist-packages/django/contrib/staticfiles/management/commands/collectstatic.py", line 120, in collect
    for original_path, processed_path, processed in processor:
  File "/usr/local/lib/python2.7/dist-packages/django/contrib/staticfiles/storage.py", line 226, in post_process
    content = pattern.sub(converter, content)
  File "/usr/local/lib/python2.7/dist-packages/django/contrib/staticfiles/storage.py", line 167, in converter
    hashed_url = self.url(unquote(joined_result), force=True)
  File "/usr/local/lib/python2.7/dist-packages/django/contrib/staticfiles/storage.py", line 114, in url
    hashed_name = self.hashed_name(clean_name).replace('\\', '/')
  File "/usr/local/lib/python2.7/dist-packages/django/contrib/staticfiles/storage.py", line 74, in hashed_name
    (clean_name, self))
ValueError: The file 'foo' could not be found with <django.contrib.staticfiles.storage.CachedStaticFilesStorage object at 0x94f654c>.

Change History (7)

comment:1 by jcatalan, 11 years ago

Owner: changed from nobody to jcatalan
Status: new → assigned

comment:2 by jcatalan, 11 years ago

Hi,

I've been trying to reproduce this but haven't been able to. Could you please provide an example set of static files that would trigger this behavior?

Thanks,

Juan

comment:3 by Tim Graham, 11 years ago

Triage Stage: Unreviewed → Accepted

The report seems credible to me. It doesn't seem like a specific set of static files would be needed to reproduce it. Just modify MAX_ENTRIES as described in the description.

comment:4 by julians37@…, 11 years ago

Hi,

Original submitter here. I'm sorry I haven't replied to Juan's comment yet; I was planning to try to reproduce it again here but haven't been able to set time aside for it so far.

I'll try to find some time in the next couple of weeks. In the meantime, if you could try again with a low setting for MAX_ENTRIES, as suggested by Timo (and by me, in the original submission), perhaps you can manage to reproduce it yourself after all.

If you have any questions, please feel free to contact me.

Julian

comment:5 by julians37@…, 11 years ago

Hi again,

I've just tried various ways to come up with a test case that shows this issue, without success.

I'll try reproducing it again with the data we use in production... I'm positive there is a bug somewhere, but it doesn't seem to be as easy to trigger as I thought.

My suspicion is that the cache "reanimation" code at https://github.com/django/django/blob/1.5.4/django/contrib/staticfiles/storage.py#L139 fails in some corner case.

comment:6 by David Sanders, 9 years ago

IMO this bug could be closed; I don't believe it's a real issue. I've been in the relevant code a lot recently, and I think I can fairly confidently say there's no real way for this to happen unless other factors are in play, hence the difficulty reproducing. Even if there's a cache miss during post-processing, the worst that happens is that the file is hashed again to get the result. You could run the post-processing with MAX_ENTRIES at 1 and it should still work.

The hint as to what went wrong here is in the error "ValueError: The file 'foo' could not be found with <django.contrib.staticfiles.storage.CachedStaticFilesStorage object at 0x94f654c>". If the file can't be found, then it must have been moved or deleted during the post-processing. A cache miss during post-processing behaves the same as a cache miss when live, so if a cache miss during post-processing led to a "file could not be found" error, the same would happen in the live code if the cache got cleared.

I think the root cause in this case was that something modified the files during post-processing (perhaps a misguided effort to delete the original file after pre-processing to save disk space), so when the cache miss occurred the original file couldn't be found to hash again.
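The argument above can be illustrated with a toy model. This is not Django's code: TinyCache and ToyStorage are made-up stand-ins, assuming only that a cache miss falls back to hashing the file again, as described in this comment. With a cache of size 1, every lookup still succeeds; the ValueError appears only once the underlying file is gone.

```python
import hashlib
from collections import OrderedDict


class TinyCache:
    """Bounded cache that evicts the oldest entry when full (illustrative only)."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self._data = OrderedDict()

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        while len(self._data) > self.max_entries:
            self._data.popitem(last=False)  # drop the oldest entry


class ToyStorage:
    """Stand-in for a cached storage: name -> hashed name, re-hashing on a miss."""

    def __init__(self, files, max_entries):
        self.files = files  # maps file name to bytes content
        self.cache = TinyCache(max_entries)
        self.rehashes = 0

    def hashed_name(self, name):
        if name not in self.files:
            raise ValueError("The file %r could not be found." % name)
        self.rehashes += 1
        digest = hashlib.md5(self.files[name]).hexdigest()[:12]
        root, dot, ext = name.rpartition(".")
        return "%s.%s.%s" % (root, digest, ext) if dot else "%s.%s" % (name, digest)

    def url(self, name):
        cached = self.cache.get(name)
        if cached is None:  # cache miss: hash the file again
            cached = self.hashed_name(name)
            self.cache.set(name, cached)
        return cached


files = {"a.css": b"body{}", "b.css": b"p{}", "c.png": b"\x89PNG"}
storage = ToyStorage(files, max_entries=1)
first = [storage.url(name) for name in sorted(files)]
second = [storage.url(name) for name in sorted(files)]  # all misses, same results

del files["b.css"]  # simulate the file vanishing during post-processing
try:
    storage.url("b.css")
    missing_raises = False
except ValueError:
    missing_raises = True
```

In this model eviction only costs extra hashing work; the error requires the file itself to be missing at the time of the miss.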

comment:7 by Claude Paroz, 9 years ago

Resolution: worksforme
Status: assigned → closed

Of course, if anyone can reproduce, feel free to reopen.
