Django

Code

Ticket #6413 (closed: fixed)

Opened 11 months ago

Last modified 4 months ago

Deadlock while culling of cache using locmem backend

Reported by: Joe Holloway <jholloway7+django@gmail.com> Assigned to: permon
Milestone: Component: Cache system
Version: SVN Keywords:
Cc: Triage Stage: Ready for checkin
Has patch: 1 Needs documentation: 0
Needs tests: 0 Patch needs improvement: 0

Description

I'm playing around with the locmem cache backend and I set my 'max_entries' to a particularly low value such that the culling algorithm will kick in and prune my cache.

ORG_CACHE = cache.get_cache("locmem:///?max_entries=20&timeout=3600")

When the cull logic actually fires, I experience what appears to be a deadlock condition, on this line of the locmem backend:

def _cull(self):
    if self._cull_frequency == 0:
        self._cache.clear()
        self._expire_info.clear()
    else:
        doomed = [k for (i, k) in enumerate(self._cache) if i % self._cull_frequency == 0]
        for k in doomed:
            self.delete(k)  

def delete (self, key):
    self._lock.writer_enters()  # <------ deadlocks here
    try:
        self._delete(key)
    finally:
        self._lock.writer_leaves()

Based on the way the surrounding code is written, I think the '_cull' method should actually be calling the '_delete' method instead of 'delete' as the lock has already been acquired earlier in the call stack. Indeed, when I change it to call the '_delete' method, the deadlock condition no longer occurs.

Attachments

locmem_6413.diff (0.6 kB) - added by Joe Holloway <jholloway7+django@gmail.com> on 01/18/08 11:21:47.
Patch for ticket #6413

Change History

01/18/08 11:21:47 changed by Joe Holloway <jholloway7+django@gmail.com>

  • attachment locmem_6413.diff added.

Patch for ticket #6413

01/20/08 18:44:40 changed by SmileyChris

  • needs_better_patch changed.
  • needs_tests changed.
  • needs_docs changed.

You're probably right... I'll pass to a design decision to let the smarty-heads look at it.

Perhaps bring this up on django-dev to get some confirmation.

01/20/08 18:44:44 changed by SmileyChris

  • stage changed from Unreviewed to Design decision needed.

01/21/08 00:36:03 changed by Joe Holloway <jholloway7+django@gmail.com>

Here's a pretty simple test case.

>>> from django.core.cache import get_cache
>>> cache = get_cache ('locmem:///?max_entries=20')
>>> for x in range (25):
...    print x
...    cache.set (x, 'foo')  # loop will hang on iteration x=20
... 

I think my patch will fix the locmem cache, but this also indicates non-intuitive behavior in django.utils.synch.RWLock. I'm not a foremost expert on threading libraries, but in my experience with similar locking mechanisms, they typically don't deadlock a thread that tries to acquire the same lock more than once.

(follow-up: ↓ 5 ) 02/14/08 04:50:49 changed by fredd4@gmail.com

I don't know, why it is in stage "design decision needed". This is no-brainer. It is just a typo.

(in reply to: ↑ 4 ; follow-up: ↓ 6 ) 06/25/08 04:49:39 changed by permon

Replying to fredd4@gmail.com:

I don't know, why it is in stage "design decision needed". This is no-brainer. It is just a typo.

I agree, probably this ticket should be closed and patched. Furthermore open a design decision discussion about changing RWLock semantics. I personally think that Joe Holloway is correct.

(in reply to: ↑ 5 ) 07/05/08 17:51:13 changed by permon

  • owner changed from nobody to permon.
  • status changed from new to assigned.
  • stage changed from Design decision needed to Ready for checkin.

Replying to permon:

Replying to fredd4@gmail.com:

I don't know, why it is in stage "design decision needed". This is no-brainer. It is just a typo.

I agree, probably this ticket should be closed and patched. Furthermore open a design decision discussion about changing RWLock semantics. I personally think that Joe Holloway is correct.

Design was enlightened in http://groups.google.cz/group/django-developers/browse_thread/thread/a8e8d9b84f525d7/d3603fcc4fd388cf?lnk=gst&q=RWLock#d3603fcc4fd388cf So I think, that patch i sufficient.

07/26/08 00:28:02 changed by mtredinnick

  • status changed from assigned to closed.
  • resolution set to fixed.

(In [8090]) Fixed #6413 -- Fixed a deadlock situation in the locmem culling implementation. Thanks, Joe Holloway.


Add/Change #6413 (Deadlock while culling of cache using locmem backend)




Change Properties
Action