Ticket #11260 (closed: wontfix)

Opened 1 year ago

Last modified 5 months ago

File based cache not very efficient with large amounts of cached files

Reported by: anteater_sa
Assigned to: josh
Milestone: 1.2
Component: Cache system
Version: 1.1
Keywords: filebased file cache
Cc: john@nuatech.net
Triage Stage: Design decision needed
Has patch: 1
Needs documentation: 1
Needs tests: 1
Patch needs improvement: 0

Description

When using the file-based cache, having a large number of cached pages (in my case over 100,000) makes the system inefficient, because django.core.cache.backends.filebased has a function called _get_num_entries that walks the entire cache directory structure counting files.

Maybe setting max_entries to 0 in the settings file could mean unlimited cached files. The _get_num_entries function could then be as follows:

    def _get_num_entries(self):
        count = 0
        if max_entries == 0: return count # Don't count files if max_entries is set to 0
        return count
        for _,_,files in os.walk(self._dir):
            count += len(files)
        return count
    _num_entries = property(_get_num_entries)

Attachments

patch.diff (475 bytes) - added by josh on 01/04/10 12:50:32.
Patch that fixes issue
patch.2.diff (478 bytes) - added by josh on 01/04/10 12:58:33.
Oops, made a mistake with the first patch.

Change History

06/04/09 08:18:42 changed by anteater_sa

  • needs_better_patch changed.
  • needs_tests changed.
  • needs_docs changed.

sorry, should be as follows:

    def _get_num_entries(self):
        # max_entries == 0 means "unlimited": skip the expensive directory walk.
        if self._max_entries == 0:
            return 0
        count = 0
        for _, _, files in os.walk(self._dir):
            count += len(files)
        return count
    _num_entries = property(_get_num_entries)

06/05/09 07:53:02 changed by dc

  • needs_docs set to 1.
  • needs_tests set to 1.
  • milestone changed from 1.1 to 1.2.

1.1 is at a feature freeze right now so moving to 1.2.

08/06/09 17:54:40 changed by Alex

  • stage changed from Unreviewed to Design decision needed.

10/23/09 11:01:07 changed by JohnMoylan

  • cc set to john@nuatech.net.
  • has_patch deleted.

I've been bitten by this also. My fix is to return count while it is still 0 and manage the cache myself using a daily cron job.

It would be nice if Django allowed me to disable the filebased cache management feature using settings.py.
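For anyone taking the same route, the daily cron job could be a small script along these lines. This is only a sketch under assumptions: prune_cache_dir is a hypothetical helper, not part of Django, and it prunes purely by file modification time rather than by the expiry stored inside each cache file.

```python
import os
import time


def prune_cache_dir(cache_dir, max_age_seconds, now=None):
    """Delete files under cache_dir last modified more than max_age_seconds ago.

    Hypothetical helper for a cron job; returns the number of files removed.
    """
    now = time.time() if now is None else now
    removed = 0
    for root, _dirs, files in os.walk(cache_dir):
        for name in files:
            path = os.path.join(root, name)
            try:
                if now - os.path.getmtime(path) > max_age_seconds:
                    os.remove(path)
                    removed += 1
            except OSError:
                # The file may have been removed by another process; skip it.
                continue
    return removed
```

Run once a day with max_age_seconds=86400, this keeps the directory bounded without Django ever having to count the files itself.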

01/04/10 12:50:32 changed by josh

  • attachment patch.diff added.

Patch that fixes issue

01/04/10 12:51:16 changed by josh

  • owner changed from nobody to josh.
  • status changed from new to assigned.
  • has_patch set to 1.

01/04/10 12:58:33 changed by josh

  • attachment patch.2.diff added.

Oops, made a mistake with the first patch.

01/04/10 15:27:44 changed by josh

Just to explain the patch...

If you pass 'max_entries=0', no culling of the cache will ever occur. Otherwise, culling is performed once the number of entries reaches 'max_entries'.
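The patch itself is not reproduced in this ticket, so the following is only a rough sketch of the decision it describes. The names count_entries and should_cull are hypothetical, and the sketch assumes culling is triggered once the entry count reaches max_entries.

```python
import os


def count_entries(cache_dir):
    """Count cached files by walking the directory tree (the expensive part)."""
    return sum(len(files) for _root, _dirs, files in os.walk(cache_dir))


def should_cull(cache_dir, max_entries, counter=count_entries):
    """Return True when the cache should be culled.

    max_entries == 0 is treated as "unlimited", so the directory is never walked.
    """
    if max_entries == 0:
        return False
    return counter(cache_dir) >= max_entries
```

The point of checking max_entries first is that the directory walk is skipped entirely when the limit is disabled.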

02/11/10 06:47:51 changed by russellm

  • status changed from assigned to closed.
  • resolution set to wontfix.

I'm going to wontfix, on the grounds that the filesystem cache is intended as an easy way to test caching, not as a serious caching strategy. The default cache size and the cull strategy implemented by the file cache should make that obvious.

If you need a cache capable of holding 100000 items, I strongly recommend you look at memcache. If you insist on using the filesystem as a cache, it isn't hard to subclass and extend the existing cache.
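As a rough illustration of the subclassing route, the sketch below overrides only the entry count. FileCacheStandIn is a hypothetical stand-in that mimics the relevant parts of the file-based backend so the example runs on its own; in a real project the subclass would derive from the backend class in django.core.cache.backends.filebased instead.

```python
import os


class FileCacheStandIn(object):
    """Hypothetical stand-in for the file-based backend; not Django's class."""

    def __init__(self, directory, max_entries=300):
        self._dir = directory
        self._max_entries = int(max_entries)

    def _get_num_entries(self):
        # Walks the whole cache directory, as described in this ticket.
        count = 0
        for _, _, files in os.walk(self._dir):
            count += len(files)
        return count

    _num_entries = property(_get_num_entries)


class UncappedFileCache(FileCacheStandIn):
    """Treats max_entries == 0 as unlimited and skips the directory walk."""

    def _get_num_entries(self):
        if self._max_entries == 0:
            return 0
        return super(UncappedFileCache, self)._get_num_entries()

    # The property must be rebound, because the base-class property
    # still points at the base-class function.
    _num_entries = property(_get_num_entries)
```

With max_entries set to 0 the subclass always reports zero entries, so a cull check that compares the count against the limit never has to walk the directory.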

02/11/10 06:55:47 changed by JohnMoylan

Are you saying that file based cache is not suitable for production? File based cache is more suitable for some scenarios than memcache.

I use file caching to cache processed JPGs. Memcache is not as capable for such a scenario.

03/12/10 06:13:10 changed by anteater_sa

  • version changed from 1.0 to 1.1.

I have had Django sites with tens of thousands of pages running for over 2 years using the above patches, so your statements about filesystem caching not being a serious strategy are irrelevant. Also, filesystem caching is not comparable to memcaching; they solve two completely different problems.

