Django

Code

Ticket #463 (closed: duplicate)

Opened 3 years ago

Last modified 1 year ago

[patch] new mysql.DatabaseWrapper - solving multiple MySQL problems

Reported by: eugene@lazutkin.com Assigned to: adrian
Milestone: Component: Core framework
Version: Keywords:
Cc: Triage Stage: Unreviewed
Has patch: 1 Needs documentation: 0
Needs tests: 0 Patch needs improvement: 0

Description

This patch solves a problem of multiple errors while hosting Django app with FastCGI and MySQL. I was running it for 24 hours without any problems.

Some highlights:

1) Django uses the same connection for all threads. It breaks MySQL leading to numerous random CR_SERVER_GONE_ERROR and CR_SERVER_LOST errors. Every independently talking entity should have its own connection. I've implemented mysql.DatabaseWrapper using dictionary to keep one connection per thread.

2) During request for new connection, old connections are examined. If thread is gone, its connection is closed and garbage-collected.

3) MySQL connection can time-out depending on MySQL server settings. The standard practice is to ping() stored connections before use. My implementation does it for every retrieved connection.

Some potential problems:

1) I rename threads, which request connections, to make them unique. If some other code relies on thread names, it may be broken.

2) 24 hour testing is not enough for production quality code. Changes are very small and compact. They can be verified manually. But still it is not a full blown system test under different hosting scenarios.

3) Please take a look at the code and verify that it is optimal --- my python experience is limited, I could implement something in sub-optimal way.

The patch:

Index: mysql.py
===================================================================
--- mysql.py	(revision 629)
+++ mysql.py	(working copy)
@@ -11,6 +11,9 @@
 from MySQLdb.constants import FIELD_TYPE
 import types
 
+import thread, threading
+from sets import Set
+
 DatabaseError = Database.DatabaseError
 
 django_conversions = conversions.copy()
@@ -23,32 +26,78 @@
 
 class DatabaseWrapper:
     def __init__(self):
-        self.connection = None
+        self.connections = {}
+        self.threads = Set()
+        self.lock = thread.allocate_lock()
         self.queries = []
+        
+    def _get_connection(self):
+        self.lock.acquire()
+        try:
+            # find existing connection
+            id = threading.currentThread().getName()
+            if id in self.connections:
+                connection = self.connections[id]
+                connection.ping()
+                return connection
+            # normalize thread name
+            if id != 'MainThread':
+                id = str(thread.get_ident())
+                threading.currentThread().setName(id)
+            # remove deadwood
+            dead = self.threads - Set([x.getName() for x in threading.enumerate()])
+            for name in dead:
+                self.connections[name].close()
+                del self.connections[name]
+            self.threads -= dead
+            # create new connection
+            from django.conf.settings import DATABASE_USER, DATABASE_NAME, DATABASE_HOST, DATABASE_PASSWORD
+            connection = Database.connect(user=DATABASE_USER, db=DATABASE_NAME,
+                    passwd=DATABASE_PASSWORD, host=DATABASE_HOST, conv=django_conversions)
+            self.connections[id] = connection
+            self.threads.add(id)
+            return connection
+        finally:
+            self.lock.release()
 
     def cursor(self):
-        from django.conf.settings import DATABASE_USER, DATABASE_NAME, DATABASE_HOST, DATABASE_PASSWORD, DEBUG
-        if self.connection is None:
-            self.connection = Database.connect(user=DATABASE_USER, db=DATABASE_NAME,
-                passwd=DATABASE_PASSWORD, host=DATABASE_HOST, conv=django_conversions)
+        connection = self._get_connection()
+        from django.conf.settings import DEBUG
         if DEBUG:
-            return base.CursorDebugWrapper(self.connection.cursor(), self)
-        return self.connection.cursor()
+            return base.CursorDebugWrapper(connection.cursor(), self)
+        return connection.cursor()
 
     def commit(self):
-        self.connection.commit()
+        self.lock.acquire()
+        try:
+            id = threading.currentThread().getName()
+            if id in self.connections:
+                self.connections[id].commit()
+        finally:
+            self.lock.release()
 
     def rollback(self):
-        if self.connection:
-            try:
-                self.connection.rollback()
-            except Database.NotSupportedError:
-                pass
+        self.lock.acquire()
+        try:
+            id = threading.currentThread().getName()
+            if id in self.connections:
+                try:
+                    self.connections[id].rollback()
+                except Database.NotSupportedError:
+                    pass
+        finally:
+            self.lock.release()
 
     def close(self):
-        if self.connection is not None:
-            self.connection.close()
-            self.connection = None
+        self.lock.acquire()
+        try:
+            id = threading.currentThread().getName()
+            if id in self.connections:
+                connection = self.connections[id]
+                connection.close()
+                del self.connections[id]
+        finally:
+            self.lock.release()
 
 def get_last_insert_id(cursor, table_name, pk_name):
     cursor.execute("SELECT LAST_INSERT_ID()")

Attachments

mysql.patch (3.8 kB) - added by eugene@lazutkin.com on 09/06/05 01:38:56.
patch as a file
mysql.diff (4.0 kB) - added by Nebojša Đorđević - nesh <nesh@studioquattro.co.yu> on 11/25/05 11:58:52.
new patch
mysql_0.91.diff (4.3 kB) - added by edgars@way.lv on 02/14/06 11:13:19.
patch against release 0.91
mysql_rev2307.diff (4.1 kB) - added by edgars.jekabsons@gmail.com on 02/14/06 14:31:07.
patch against trunk (rev. 2307)
mysql_trunk_rev2360.diff (2.4 kB) - added by greg@abbas.no-spam.org on 02/18/06 22:26:03.
Patch for trunk 2360, using thread-local storage
mysql_magic-removal_rev2360.diff (2.3 kB) - added by greg@abbas.no-spam.org on 02/18/06 22:26:51.
Patch for magic-removal 2360, using thread-local storage
mysql_trunk_rev2360-1.diff (3.7 kB) - added by greg@abbas.no-spam.org on 02/18/06 23:31:37.
Patch for trunk 2360, using thread-local storage and python 2.3 support
mysql_magic-removal_rev2360-1.diff (3.6 kB) - added by greg@abbas.no-spam.org on 02/18/06 23:32:38.
Patch for magic-removal 2360, using thread-local storage and python 2.3 support

Change History

09/06/05 01:38:56 changed by eugene@lazutkin.com

  • attachment mysql.patch added.

patch as a file

09/06/05 06:50:54 changed by garthk

I lack the time to check out the solution, but I'm all for fixing the problems described.

09/09/05 00:33:16 changed by anonymous

  • owner changed from adrian to anonymous.

09/09/05 06:57:26 changed by Boffbowsh

  • owner changed from anonymous to adrian.

09/25/05 17:34:15 changed by jacob

  • milestone set to Version 1.0.

09/25/05 20:44:44 changed by eugene@lazutkin.com

I ran this patch for 3 weeks now. No problems so far.

11/24/05 08:36:06 changed by Maniac <Maniac@SoftwareManiacs.Org>

For anyone interesting. There is a similar issue with Postgres for which I filed ticket 900.

11/25/05 11:58:29 changed by Nebojša Đorđević - nesh <nesh@studioquattro.co.yu>

I'm rewrote your patch for latest [1432] revision. There are some differences:

  • connection.ping() is always checked before returning cursor, and if connection is invalid then connection is recreated
  • for locking I'm using django.utils.synch.RWLock
  • instead renaming threads (witch will interfere with i18n stuff) I'm using currentThread() object as a thread id (like Hugo in i18n)

11/25/05 11:58:52 changed by Nebojša Đorđević - nesh <nesh@studioquattro.co.yu>

  • attachment mysql.diff added.

new patch

11/25/05 13:53:02 changed by eugene@lazutkin.com

Checking connections while pinging -- good catch. Using RWLock --- good refresh, I forgot to update the patch with RWLock, when it became available.

Regarding currentThread() -- I had some reservations about it. I tried to experiment with currentThread() and:

  1. "If the caller's thread of control was not created through the threading module, a dummy thread object with limited functionality is returned." -- from http://www.python.org/doc/2.4.2/lib/module-threading.html. Web server can create its own threads (or thread pools, if we want to be realistic). WSGI can create threads without threading module, e.g., it can use thread module directly.
  2. currentThread() is not guaranteed to return unique objects, e.g., they can be repeatedly reused, when threads are recycled.

Basically I was concerned about potential clash because uniqueness is not guaranteed. Maybe it is unfounded. E.g., I cannot remember why (2) was important. It doesn't look dangerous now.

(1) depends on implementation. If we assume that all Python implementations return unique dummy thread objects, it's probably OK. Theoretically it can return the same static object.

OTOH "thread renaming" may suffer from similar problems depending on implementation. I want a sage advice on that. :-)

11/26/05 03:26:38 changed by Nebojša Đorđević - nesh <nesh@studioquattro.co.yu>

Is this per thread stuff is really required? As far as I can see real problem is in not checking for connection validity and not threading stuff - in my test setup (FastCGI & mod_fcgi) all problems are gone when I only check with connection.ping and reconnect if connection is invalid.

Maybe someone more knowledgeable in django database internals may shed some light on this.

Second solution is using some type of database connection pool instead thread local connections.

11/26/05 05:45:17 changed by hugo

Actually currentThread() will return unique objects - they can't be reused, the thread names might be reused, but the object ID won't. And that's what is used when you use objects as dictionary keys. The dummy object created might have limited functionality - but we don't use the functionality, we only use the identity, so that's enough.

11/26/05 16:10:37 changed by eugene@lazutkin.com

Originally I had problems with multithreaded FCGI under load. It was not related to pings --- MySQL got confused when SQL statements from different threads came using one connection. I implemented "per thread stuff" after advices of MySQL experts. I added pings to prevent time-outs, which happened after some idle time.

INHO, database connection pool is a good idea.

11/26/05 16:20:12 changed by eugene@lazutkin.com

It's good to know that currentThread() is unique. In my original implementation I used get_ident() as a source of certifiably unique names with silly renaming of threads. It is the only thread-related method, which is explicitly documented as returning unique values, which are suitable as dictionary index (http://www.python.org/doc/2.4.2/lib/module-thread.html). No other thread/treading-related method is documented as such.

Unfortunately I didn't find TLS in python. That's what is used usually for stuff like that.

12/02/05 14:45:18 changed by Maniac <Maniac@SoftwareManiacs.Org>

I'm about to port this new patch for Postgres and have a question. Why is "remove deadwood" part needed? As far as I understand connections are always deleted on every request calling db.close().

12/02/05 20:35:54 changed by eugene@lazutkin.com

"Remove deadwood" is just an extra security measure against potential sloppiness in closing connections and handling exceptions. After watching MySQL connections die randomly I decided to take no chances.

02/14/06 11:13:19 changed by edgars@way.lv

  • attachment mysql_0.91.diff added.

patch against release 0.91

02/14/06 11:16:28 changed by edgars@way.lv

  • priority changed from high to highest.
  • severity changed from major to critical.

Please include this patch ASAP. Otherwise Django is no use in production as fcgi using MySQL database.

02/14/06 14:31:07 changed by edgars.jekabsons@gmail.com

  • attachment mysql_rev2307.diff added.

patch against trunk (rev. 2307)

02/14/06 14:59:12 changed by Tom Tobin <korpios@gmail.com>

  • priority changed from highest to high.
  • severity changed from critical to major.

Bumping priority and severity down one rank each; while this may be significant for this particular use-case (MySQL and FastCGI), that use-case has at least two obvious workarounds — switching to another database (preferably PostgreSQL), or switching to mod_python. I'm not saying it shouldn't get fixed; I'm just saying that marking it "highest" and "critical" seems nonsensical.

02/15/06 12:24:56 changed by Maniac <Maniac@SoftwareManiacs.Org>

Tom, Postgres' backend in Django suffers from same problems :-). And it also have the patch waiting in #900

02/15/06 16:46:38 changed by Tom Tobin <korpios@gmail.com>

After reading through #900, the alternatives I'd suggest become "switch to mod_python" or "use the preforking mpm". Again — I never said it shouldn't be fixed, or that it wasn't a PITA if your situation falls under these cases . . . just that there are production-ready workarounds at the moment. :)

02/16/06 00:43:54 changed by Maniac <Maniac@SoftwareManiacs.Org>

I doidn't mean to disagree with your resolution. I think it would be critical for 1.0 but not now.

But to clarify: all these bad things happen with mod_python too. Switching to preforking worker would help but many people don't control their server (shared hosting, evil admin, other projects requiring threads etc.)

02/18/06 22:26:03 changed by greg@abbas.no-spam.org

  • attachment mysql_trunk_rev2360.diff added.

Patch for trunk 2360, using thread-local storage

02/18/06 22:26:51 changed by greg@abbas.no-spam.org

  • attachment mysql_magic-removal_rev2360.diff added.

Patch for magic-removal 2360, using thread-local storage

02/18/06 23:31:37 changed by greg@abbas.no-spam.org

  • attachment mysql_trunk_rev2360-1.diff added.

Patch for trunk 2360, using thread-local storage and python 2.3 support

02/18/06 23:32:38 changed by greg@abbas.no-spam.org

  • attachment mysql_magic-removal_rev2360-1.diff added.

Patch for magic-removal 2360, using thread-local storage and python 2.3 support

02/19/06 00:19:08 changed by anonymous

If I may, let me suggest a slightly different implementation that I've had good luck with. Instead of using the current thread (or a value derived from the current thread) as a key into a dict, you can use Python's built-in thread-local storage mechanism. That way you don't need to worry about synchronization or removing "deadwood" (I like the name :-). There's one minor hitch, which is that threading.local() isn't available before Python 2.4. So I included an implementation of it for earlier interpreters (e.g. 2.3) that don't have this functionality built-in.

I've attached patches for the trunk and the magic-removal branch, as of revision 2360. The ones to look at are mysql_trunk_rev2360-1.diff and mysql_magic-removal_rev2360-1.diff. Please ignore mysql_trunk_rev2360.diff and mysql_magic-removal_rev2360.diff; Trac reported an internal error when I tried to replace the first versions that I uploaded with newer versions that support Python 2.3, so I had to change the names to get it to work. Sorry for the extra cruft.

02/19/06 00:19:47 changed by greg@abbas.no-spam.org

Oops, I didn't mean to leave the last comment anonymous... that was me.

02/19/06 01:04:29 changed by greg@abbas.no-spam.org

Oh... or, we could include a copy of _thread_local.py from python 2.4 like #1268 does (in current_user.diff).

03/08/06 01:33:39 changed by greg@abbas.no-spam.org

Eugene's code in #1442 is more elegant... IMHO we should use his fix, and close this as a duplicate.

03/10/06 11:54:08 changed by eugene@lazutkin.com

  • status changed from new to closed.
  • resolution set to duplicate.

Superseded by #1442

01/17/07 16:12:17 changed by

  • milestone deleted.

Milestone Version 1.0 deleted


Add/Change #463 ([patch] new mysql.DatabaseWrapper - solving multiple MySQL problems)




Change Properties
Action