Opened 19 years ago

Closed 18 years ago

Last modified 17 years ago

#463 closed defect (duplicate)

[patch] new mysql.DatabaseWrapper - solving multiple MySQL problems

Reported by: eugene@… Owned by: Adrian Holovaty
Component: Core (Other) Version:
Severity: major Keywords:
Cc: Triage Stage: Unreviewed
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

This patch solves a problem of multiple errors while hosting Django app with FastCGI and MySQL. I was running it for 24 hours without any problems.

Some highlights:

1) Django uses the same connection for all threads. It breaks MySQL leading to numerous random CR_SERVER_GONE_ERROR and CR_SERVER_LOST errors. Every independently talking entity should have its own connection. I've implemented mysql.DatabaseWrapper using dictionary to keep one connection per thread.

2) During request for new connection, old connections are examined. If thread is gone, its connection is closed and garbage-collected.

3) MySQL connection can time-out depending on MySQL server settings. The standard practice is to ping() stored connections before use. My implementation does it for every retrieved connection.

Some potential problems:

1) I rename threads, which request connections, to make them unique. If some other code relies on thread names, it may be broken.

2) 24 hour testing is not enough for production quality code. Changes are very small and compact. They can be verified manually. But still it is not a full blown system test under different hosting scenarios.

3) Please take a look at the code and verify that it is optimal --- my python experience is limited, I could implement something in sub-optimal way.

The patch:

Index: mysql.py
===================================================================
--- mysql.py	(revision 629)
+++ mysql.py	(working copy)
@@ -11,6 +11,9 @@
 from MySQLdb.constants import FIELD_TYPE
 import types
 
+import thread, threading
+from sets import Set
+
 DatabaseError = Database.DatabaseError
 
 django_conversions = conversions.copy()
@@ -23,32 +26,78 @@
 
 class DatabaseWrapper:
     def __init__(self):
-        self.connection = None
+        self.connections = {}
+        self.threads = Set()
+        self.lock = thread.allocate_lock()
         self.queries = []
+        
+    def _get_connection(self):
+        self.lock.acquire()
+        try:
+            # find existing connection
+            id = threading.currentThread().getName()
+            if id in self.connections:
+                connection = self.connections[id]
+                connection.ping()
+                return connection
+            # normalize thread name
+            if id != 'MainThread':
+                id = str(thread.get_ident())
+                threading.currentThread().setName(id)
+            # remove deadwood
+            dead = self.threads - Set([x.getName() for x in threading.enumerate()])
+            for name in dead:
+                self.connections[name].close()
+                del self.connections[name]
+            self.threads -= dead
+            # create new connection
+            from django.conf.settings import DATABASE_USER, DATABASE_NAME, DATABASE_HOST, DATABASE_PASSWORD
+            connection = Database.connect(user=DATABASE_USER, db=DATABASE_NAME,
+                    passwd=DATABASE_PASSWORD, host=DATABASE_HOST, conv=django_conversions)
+            self.connections[id] = connection
+            self.threads.add(id)
+            return connection
+        finally:
+            self.lock.release()
 
     def cursor(self):
-        from django.conf.settings import DATABASE_USER, DATABASE_NAME, DATABASE_HOST, DATABASE_PASSWORD, DEBUG
-        if self.connection is None:
-            self.connection = Database.connect(user=DATABASE_USER, db=DATABASE_NAME,
-                passwd=DATABASE_PASSWORD, host=DATABASE_HOST, conv=django_conversions)
+        connection = self._get_connection()
+        from django.conf.settings import DEBUG
         if DEBUG:
-            return base.CursorDebugWrapper(self.connection.cursor(), self)
-        return self.connection.cursor()
+            return base.CursorDebugWrapper(connection.cursor(), self)
+        return connection.cursor()
 
     def commit(self):
-        self.connection.commit()
+        self.lock.acquire()
+        try:
+            id = threading.currentThread().getName()
+            if id in self.connections:
+                self.connections[id].commit()
+        finally:
+            self.lock.release()
 
     def rollback(self):
-        if self.connection:
-            try:
-                self.connection.rollback()
-            except Database.NotSupportedError:
-                pass
+        self.lock.acquire()
+        try:
+            id = threading.currentThread().getName()
+            if id in self.connections:
+                try:
+                    self.connections[id].rollback()
+                except Database.NotSupportedError:
+                    pass
+        finally:
+            self.lock.release()
 
     def close(self):
-        if self.connection is not None:
-            self.connection.close()
-            self.connection = None
+        self.lock.acquire()
+        try:
+            id = threading.currentThread().getName()
+            if id in self.connections:
+                connection = self.connections[id]
+                connection.close()
+                del self.connections[id]
+        finally:
+            self.lock.release()
 
 def get_last_insert_id(cursor, table_name, pk_name):
     cursor.execute("SELECT LAST_INSERT_ID()")

Attachments (8)

mysql.patch (3.8 KB ) - added by eugene@… 19 years ago.
patch as a file
mysql.diff (4.0 KB ) - added by Nebojša Đorđević - nesh <nesh@…> 18 years ago.
new patch
mysql_0.91.diff (4.3 KB ) - added by edgars@… 18 years ago.
patch against release 0.91
mysql_rev2307.diff (4.1 KB ) - added by edgars.jekabsons@… 18 years ago.
patch against trunk (rev. 2307)
mysql_trunk_rev2360.diff (2.4 KB ) - added by greg@… 18 years ago.
Patch for trunk 2360, using thread-local storage
mysql_magic-removal_rev2360.diff (2.3 KB ) - added by greg@… 18 years ago.
Patch for magic-removal 2360, using thread-local storage
mysql_trunk_rev2360-1.diff (3.7 KB ) - added by greg@… 18 years ago.
Patch for trunk 2360, using thread-local storage and python 2.3 support
mysql_magic-removal_rev2360-1.diff (3.6 KB ) - added by greg@… 18 years ago.
Patch for magic-removal 2360, using thread-local storage and python 2.3 support

Download all attachments as: .zip

Change History (33)

by eugene@…, 19 years ago

Attachment: mysql.patch added

patch as a file

comment:1 by garthk, 19 years ago

I lack the time to check out the solution, but I'm all for fixing the problems described.

comment:2 by anonymous, 19 years ago

Owner: changed from Adrian Holovaty to anonymous

comment:3 by Boffbowsh, 19 years ago

Owner: changed from anonymous to Adrian Holovaty

comment:4 by Jacob, 19 years ago

milestone: Version 1.0

comment:5 by eugene@…, 19 years ago

I ran this patch for 3 weeks now. No problems so far.

comment:6 by Maniac <Maniac@…>, 18 years ago

For anyone interesting. There is a similar issue with Postgres for which I filed ticket 900.

comment:7 by Nebojša Đorđević - nesh <nesh@…>, 18 years ago

I'm rewrote your patch for latest [1432] revision. There are some differences:

  • connection.ping() is always checked before returning cursor, and if connection is invalid then connection is recreated
  • for locking I'm using django.utils.synch.RWLock
  • instead renaming threads (witch will interfere with i18n stuff) I'm using currentThread() object as a thread id (like Hugo in i18n)

by Nebojša Đorđević - nesh <nesh@…>, 18 years ago

Attachment: mysql.diff added

new patch

comment:8 by eugene@…, 18 years ago

Checking connections while pinging -- good catch. Using RWLock --- good refresh, I forgot to update the patch with RWLock, when it became available.

Regarding currentThread() -- I had some reservations about it. I tried to experiment with currentThread() and:

  1. "If the caller's thread of control was not created through the threading module, a dummy thread object with limited functionality is returned." -- from http://www.python.org/doc/2.4.2/lib/module-threading.html. Web server can create its own threads (or thread pools, if we want to be realistic). WSGI can create threads without threading module, e.g., it can use thread module directly.
  2. currentThread() is not guaranteed to return unique objects, e.g., they can be repeatedly reused, when threads are recycled.

Basically I was concerned about potential clash because uniqueness is not guaranteed. Maybe it is unfounded. E.g., I cannot remember why (2) was important. It doesn't look dangerous now.

(1) depends on implementation. If we assume that all Python implementations return unique dummy thread objects, it's probably OK. Theoretically it can return the same static object.

OTOH "thread renaming" may suffer from similar problems depending on implementation. I want a sage advice on that. :-)

comment:9 by Nebojša Đorđević - nesh <nesh@…>, 18 years ago

Is this per thread stuff is really required? As far as I can see real problem is in not checking for connection validity and not threading stuff - in my test setup (FastCGI & mod_fcgi) all problems are gone when I only check with connection.ping and reconnect if connection is invalid.

Maybe someone more knowledgeable in django database internals may shed some light on this.

Second solution is using some type of database connection pool instead thread local connections.

comment:10 by hugo, 18 years ago

Actually currentThread() will return unique objects - they can't be reused, the thread names might be reused, but the object ID won't. And that's what is used when you use objects as dictionary keys. The dummy object created might have limited functionality - but we don't use the functionality, we only use the identity, so that's enough.

comment:11 by eugene@…, 18 years ago

Originally I had problems with multithreaded FCGI under load. It was not related to pings --- MySQL got confused when SQL statements from different threads came using one connection. I implemented "per thread stuff" after advices of MySQL experts. I added pings to prevent time-outs, which happened after some idle time.

INHO, database connection pool is a good idea.

comment:12 by eugene@…, 18 years ago

It's good to know that currentThread() is unique. In my original implementation I used get_ident() as a source of certifiably unique names with silly renaming of threads. It is the only thread-related method, which is explicitly documented as returning unique values, which are suitable as dictionary index (http://www.python.org/doc/2.4.2/lib/module-thread.html). No other thread/treading-related method is documented as such.

Unfortunately I didn't find TLS in python. That's what is used usually for stuff like that.

comment:13 by Maniac <Maniac@…>, 18 years ago

I'm about to port this new patch for Postgres and have a question. Why is "remove deadwood" part needed? As far as I understand connections are always deleted on every request calling db.close().

comment:14 by eugene@…, 18 years ago

"Remove deadwood" is just an extra security measure against potential sloppiness in closing connections and handling exceptions. After watching MySQL connections die randomly I decided to take no chances.

by edgars@…, 18 years ago

Attachment: mysql_0.91.diff added

patch against release 0.91

comment:15 by edgars@…, 18 years ago

priority: highhighest
Severity: majorcritical

Please include this patch ASAP. Otherwise Django is no use in production as fcgi using MySQL database.

by edgars.jekabsons@…, 18 years ago

Attachment: mysql_rev2307.diff added

patch against trunk (rev. 2307)

comment:16 by Tom Tobin <korpios@…>, 18 years ago

priority: highesthigh
Severity: criticalmajor

Bumping priority and severity down one rank each; while this may be significant for this particular use-case (MySQL and FastCGI), that use-case has at least two obvious workarounds — switching to another database (preferably PostgreSQL), or switching to mod_python. I'm not saying it shouldn't get fixed; I'm just saying that marking it "highest" and "critical" seems nonsensical.

comment:17 by Maniac <Maniac@…>, 18 years ago

Tom, Postgres' backend in Django suffers from same problems :-). And it also have the patch waiting in #900

comment:18 by Tom Tobin <korpios@…>, 18 years ago

After reading through #900, the alternatives I'd suggest become "switch to mod_python" or "use the preforking mpm". Again — I never said it shouldn't be fixed, or that it wasn't a PITA if your situation falls under these cases . . . just that there are production-ready workarounds at the moment. :)

comment:19 by Maniac <Maniac@…>, 18 years ago

I doidn't mean to disagree with your resolution. I think it would be critical for 1.0 but not now.

But to clarify: all these bad things happen with mod_python too. Switching to preforking worker would help but many people don't control their server (shared hosting, evil admin, other projects requiring threads etc.)

by greg@…, 18 years ago

Attachment: mysql_trunk_rev2360.diff added

Patch for trunk 2360, using thread-local storage

by greg@…, 18 years ago

Patch for magic-removal 2360, using thread-local storage

by greg@…, 18 years ago

Attachment: mysql_trunk_rev2360-1.diff added

Patch for trunk 2360, using thread-local storage and python 2.3 support

by greg@…, 18 years ago

Patch for magic-removal 2360, using thread-local storage and python 2.3 support

comment:20 by anonymous, 18 years ago

If I may, let me suggest a slightly different implementation that I've had good luck with. Instead of using the current thread (or a value derived from the current thread) as a key into a dict, you can use Python's built-in thread-local storage mechanism. That way you don't need to worry about synchronization or removing "deadwood" (I like the name :-). There's one minor hitch, which is that threading.local() isn't available before Python 2.4. So I included an implementation of it for earlier interpreters (e.g. 2.3) that don't have this functionality built-in.

I've attached patches for the trunk and the magic-removal branch, as of revision 2360. The ones to look at are mysql_trunk_rev2360-1.diff and mysql_magic-removal_rev2360-1.diff. Please ignore mysql_trunk_rev2360.diff and mysql_magic-removal_rev2360.diff; Trac reported an internal error when I tried to replace the first versions that I uploaded with newer versions that support Python 2.3, so I had to change the names to get it to work. Sorry for the extra cruft.

comment:21 by greg@…, 18 years ago

Oops, I didn't mean to leave the last comment anonymous... that was me.

comment:22 by greg@…, 18 years ago

Oh... or, we could include a copy of _thread_local.py from python 2.4 like #1268 does (in current_user.diff).

comment:23 by greg@…, 18 years ago

Eugene's code in #1442 is more elegant... IMHO we should use his fix, and close this as a duplicate.

comment:24 by eugene@…, 18 years ago

Resolution: duplicate
Status: newclosed

Superseded by #1442

comment:25 by (none), 17 years ago

milestone: Version 1.0

Milestone Version 1.0 deleted

Note: See TracTickets for help on using tickets.
Back to Top