Opened 4 years ago

Closed 18 months ago

#15825 closed Bug (fixed)

File-based caching doesn't cull truly randomly

Reported by: gkuenning Owned by: nobody
Component: Core (Cache system) Version: master
Severity: Normal Keywords:
Cc: jaap3 Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

The culling algorithm in filebased.py uses an essentially random culling method, which is well known to lead to an approximation to LRU. However, the randomness is improperly applied. For example, consider a frequently-accessed page that, by luck, happens to sort first in the file list (due to an alphatbetically low hash value). The culling code will always delete this file, while a truly random algorithm would only do so 1/n of the time.

The fix is easy: at approximately line 131 in filebased.py, "== 0" should be replaced with "== ranno" where ranno is chosen in the immediately preceding line with:

ranno = random.randint(0, self._cull_frequency)

Attachments (2)

patch_for_15825.diff (2.7 KB) - added by desh 4 years ago.
Initial patch against trunk (r16347)
merged_patch.diff (4.1 KB) - added by desh 4 years ago.
Patch for both #15825 and #15806 (as they address same functions)

Download all attachments as: .zip

Change History (11)

comment:1 Changed 4 years ago by carljm

  • Easy pickings unset
  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Triage Stage changed from Unreviewed to Accepted
  • Version changed from 1.2 to SVN

The referenced line of code in trunk is 123, or doomed = [os.path.join(self._dir, k) for (i, k) in enumerate(filelist) if i % self._cull_frequency == 0]

comment:2 Changed 4 years ago by lukeplant

If this is fixed, please note that the tests will need some subtle fixing, particularly FileBasedCacheTests.test_cull. The tests check for a certain number deleted, but because of the way that files are grouped into folders in the file based backend, if this is randomized the number deleted can change. The 'sorted' call just a few lines above this line was added because of this issue, to fix a difference that appeared on different platforms, #14250.

comment:3 Changed 4 years ago by jacob

See also #15806. Not exactly related, but right around in that same part of code and probably worth fixing at the same time.

Changed 4 years ago by desh

Initial patch against trunk (r16347)

Changed 4 years ago by desh

Patch for both #15825 and #15806 (as they address same functions)

comment:4 Changed 4 years ago by desh

  • Has patch set
  • UI/UX unset

I've added patch against trunk AND patch that also fixes #15806 (it's actually just a result of merge) to avoid merging conflicts.

I've removed sorted() cause it's make no sense with random culling. I've also replaced nested for that traverse directories to delete paths by shutils.rmtree, as it does the same.

Tests changed in two ways: "cull%d" became "cull-%d" and initial inserts count for cull test was reduced from 49 to 34, to have 34 unique 2-char prefixes after hashing. It's actually a bit rude but in fact, #14250 was about as rude as this.

I see two other ways to solve last problem with tests:

  • change length of directory name from 2 to 3, so there will be 4096 possible prefixes (conceptually, same hack but from different pov);
  • change the way of culling: delete exactly _num_entries_cull_frequency FILES, not directories (bad thing, cause it will not scale with big cache sizes).

Waiting for review.

comment:5 Changed 2 years ago by anonymous

This seems to have been forgotten. Has the patch rotted?

comment:6 Changed 22 months ago by susan

I ran the applied patch against the full test suite, which produced a failure. Stack trace here:

FAIL: test_cull (cache.tests.FileBasedCacheTests)


Traceback (most recent call last):

File "../django/tests/cache/tests.py", line 1085, in test_cull

self.perform_cull_test(35, 24)

File "../django/tests/cache/tests.py", line 495, in perform_cull_test

self.assertEqual(count, final_count)

AssertionError: 23 != 24

Last edited 22 months ago by susan (previous) (diff)

comment:7 Changed 22 months ago by jaap3

  • Cc jaap3 added

comment:8 Changed 20 months ago by jaap3

Created a pull request that addresses this issue: https://github.com/django/django/pull/1514 (also #20536)

comment:9 Changed 18 months ago by Anssi Kääriäinen <akaariai@…>

  • Resolution set to fixed
  • Status changed from new to closed

In 7be638390e18fcbfaaed638f9908673360c280d3:

Fixed #20536 -- rewrite of the file based cache backend

  • Safer for use in multiprocess environments
  • Better random culling
  • Cache files use less disk space
  • Safer delete behavior

Also fixed #15806, fixed #15825.

Note: See TracTickets for help on using tickets.
Back to Top