Django

Code

Ticket #10470 (closed: fixed)

Opened 1 year ago

Last modified 1 year ago

Core handlers are not threadsafe

Reported by: tdterry Assigned to: tdterry
Milestone: Component: Core framework
Version: 1.0 Keywords: wsgi, basehandler, mod_python, threadsafe, threading
Cc: Triage Stage: Accepted
Has patch: 1 Needs documentation: 0
Needs tests: 0 Patch needs improvement: 0

Description

I have found an incomplete-initialization bug the WSGIHandler when running under Apache mpm_worker with mod_wsgi. Code review shows this is a potential problem with all of the handlers.

On the first request the WSGIHandler.__call__(...), we set up the middleware. There is a lock wrapping the load of the middleware, but the logic still allows incomplete initialization on other threads. Below is the block of code in question from django/core/handlers/wsgi.py (line 226):

if self._request_middleware is None:
    self.initLock.acquire()
    # Check that middleware is still uninitialised.
    if self._request_middleware is None:
        self.load_middleware()
    self.initLock.release()

Example:

  1. Initial start or restart of the Apache instance
  2. Thread T1 - a request comes in. self._request_middleware is None
  3. Thread T1 - acquires the self.initLock, calls self.load_middleware()
  4. Thread T2 - a request comes in (T1 is part of the way through self.load_middleware())
  5. Thread T2 - self._request_middleware is not None, but it's not completely loaded either, continues with the request
  6. Thread T1 - completes self.load_middleware(), release the lock

The attached patch changes BaseHandler.load_middleware() to do an atomic set of all four middleware lists. self._request_middleware is set last to signal the completion of middleware loading.

Attachments

thread-safe-handlers.diff (2.4 kB) - added by tdterry on 03/11/09 11:02:35.
Patch against r10030 to solve multi-thread issues with the core handlers
thread-safe-handlers_minimal-patch.diff (1.4 kB) - added by mrts on 03/11/09 12:40:32.
Minimal changes that assure thread-safety
omission.diff (0.6 kB) - added by mrts on 03/12/09 02:34:10.
Omitted append to right list

Change History

03/11/09 11:02:35 changed by tdterry

  • attachment thread-safe-handlers.diff added.

Patch against r10030 to solve multi-thread issues with the core handlers

03/11/09 11:05:32 changed by tdterry

  • needs_better_patch changed.
  • needs_tests changed.
  • needs_docs changed.

Also, testing requires a multi-threaded setup and a good load test. These are not available in the framework, so I have not included a specific test for this patch.

03/11/09 11:18:07 changed by tdterry

  • owner changed from nobody to tdterry.
  • status changed from new to assigned.

03/11/09 11:50:37 changed by shai

May I suggest a minor improvement to the patch: Instead of assigning to the _request_middleware last, just use tuple assignment to assign them all together, i.e.

 self._request_middleware, self._view_middleware, self._response_middleware, self._exception_middleware = \
      _request_middleware,      _view_middleware,      _response_middleware,      _exception_middleware

This way you don't need to make assumptions about the users.

03/11/09 12:34:59 changed by mrts

  • keywords changed from wsgi, basehandler, mod_python, threadsafe to wsgi, basehandler, mod_python, threadsafe, threading.
  • stage changed from Unreviewed to Accepted.

shai, tuple assignment is not atomic, so the suggestion does not help.

The patch looks OK, although, strictly speaking, temporary list is only required for _request_middleware as only that is checked in client code (i.e. only that serves as initialization guard, not the other lists).

It would make things a bit less opaque if a _middleware_not_loaded boolean property was added to BaseHandler (that would do the very same return self._request_middleware is None) and used throughout client code. It may make sense to avoid this for optimization though, method call would add quite a few bytecode instructions.

03/11/09 12:40:32 changed by mrts

  • attachment thread-safe-handlers_minimal-patch.diff added.

Minimal changes that assure thread-safety

03/11/09 12:58:38 changed by mrts

  • stage changed from Accepted to Unreviewed.

On second thought, shouldn't initLock guard against this? tdterry, were you hit by this or did you just think that this a possible scenario?

AFAIK:

  1. Initial start or restart of the Apache instance
  2. Thread T1 - a request comes in. self._request_middleware is None
  3. Thread T1 - acquires the self.initLock, calls self.load_middleware()
  4. Thread T2 - a request comes in (T1 is part of the way through self.load_middleware())
  5. Thread T2 - waits for the initLock shared by all instances
  6. Thread T1 - completes self.load_middleware(), releases the lock
  7. Thread T2 - lock released, all middleware properly loaded

Looks good to me, pushing back to unreviewed.

03/11/09 13:43:38 changed by mrts

IMHO, can be closed as invalid.

(follow-up: ↓ 16 ) 03/11/09 13:45:13 changed by shai

mrts, I will take your word on tuple assignment as I can't find any reference in a quick search.

However, your threading analysis is wrong: In your line 5, T2 doesn't wait for the initLock. This is because without the (original) patch, self._request_middleware is not None (django/core/handlers/wsgi.py line 226).

03/11/09 13:47:37 changed by Alex

Right so it gets past the first check, but then when it goes to acquite the lock it has to wait until the other thread is done, and then it fails the second test.

03/11/09 13:54:01 changed by shai

The point is that it doesn't wait for the lock, doesn't do the second check. It thinks that the middleware is already initialized, and just happily goes on to use it.

03/11/09 13:57:56 changed by Alex

That's not consistant with the documented behavior of the lock: http://docs.python.org/library/threading.html#threading.Lock.acquire

03/11/09 14:01:31 changed by shai

The code does not attempt to acquire the lock. It only does that when self._request_middleware is None. In the described scenario, it isn't. The whole block is skipped.

03/11/09 14:03:40 changed by shai

(FWIW: the Python disassembler agrees with mrts; tuple assignment is not atomic. Sorry about that one).

03/11/09 14:11:18 changed by Alex

Ok look at the code: http://code.djangoproject.com/browser/django/trunk/django/core/handlers/wsgi.py#L221

1) Is request middleware none if not go on 2) if it is acquire the lock -- block until we can 3) check if request middleware is none if not go on 4) load the middlware 5) release the lock

03/11/09 14:22:33 changed by Alex

Ok I think we've been arguing past each other, the issue isn't anything that occurs in that block of code, the issue is that once that block of code begins exceuting for real in the first thread if a second thread comes in it won't block on that, while it should.

03/11/09 14:26:32 changed by Alex

  • stage changed from Unreviewed to Accepted.

(in reply to: ↑ 7 ) 03/12/09 02:23:38 changed by mrts

Replying to shai:

However, your threading analysis is wrong: In your line 5, T2 doesn't wait for the initLock. This is because without the (original) patch, self._request_middleware is not None (django/core/handlers/wsgi.py line 226).

Indeed, my mistake, should have been more careful.

03/12/09 02:27:47 changed by mtredinnick

  • status changed from assigned to closed.
  • resolution set to fixed.

(In [10036]) Fixed #10470 -- Fixed a race condition in middleware initialization.

Thanks to Travis Terry and mrts.

03/12/09 02:34:10 changed by mrts

  • attachment omission.diff added.

Omitted append to right list

03/12/09 02:35:46 changed by mrts

  • status changed from closed to reopened.
  • resolution deleted.

Malcolm, it shows that it is quite early in your timezone :) -- there's an accidental omission in r10036, please see the attached patch.

03/12/09 02:37:08 changed by mtredinnick

  • status changed from reopened to closed.
  • resolution set to fixed.

(In [10038]) [1.0.X] Fixed #10470 -- Fixed a race condition in middleware initialization.

Thanks to Travis Terry and mrts.

Backport of r10036 from trunk.

03/12/09 02:46:28 changed by mrts

  • status changed from closed to reopened.
  • resolution deleted.

03/12/09 03:00:47 changed by mtredinnick

  • status changed from reopened to closed.
  • resolution set to fixed.

(In [10041]) [1.0.X] Fixed a problem from r10038. Fixed #10470.

Backport of r10040 from trunk.


Add/Change #10470 (Core handlers are not threadsafe)




Change Properties
Action