#463 closed defect (duplicate)
[patch] new mysql.DatabaseWrapper - solving multiple MySQL problems
Reported by: | Owned by: | Adrian Holovaty | |
---|---|---|---|
Component: | Core (Other) | Version: | |
Severity: | major | Keywords: | |
Cc: | Triage Stage: | Unreviewed | |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
This patch solves a problem of multiple errors while hosting Django app with FastCGI and MySQL. I was running it for 24 hours without any problems.
Some highlights:
1) Django uses the same connection for all threads. It breaks MySQL leading to numerous random CR_SERVER_GONE_ERROR and CR_SERVER_LOST errors. Every independently talking entity should have its own connection. I've implemented mysql.DatabaseWrapper using dictionary to keep one connection per thread.
2) During request for new connection, old connections are examined. If thread is gone, its connection is closed and garbage-collected.
3) MySQL connection can time-out depending on MySQL server settings. The standard practice is to ping() stored connections before use. My implementation does it for every retrieved connection.
Some potential problems:
1) I rename threads, which request connections, to make them unique. If some other code relies on thread names, it may be broken.
2) 24 hour testing is not enough for production quality code. Changes are very small and compact. They can be verified manually. But still it is not a full blown system test under different hosting scenarios.
3) Please take a look at the code and verify that it is optimal --- my python experience is limited, I could implement something in sub-optimal way.
The patch:
Index: mysql.py =================================================================== --- mysql.py (revision 629) +++ mysql.py (working copy) @@ -11,6 +11,9 @@ from MySQLdb.constants import FIELD_TYPE import types +import thread, threading +from sets import Set + DatabaseError = Database.DatabaseError django_conversions = conversions.copy() @@ -23,32 +26,78 @@ class DatabaseWrapper: def __init__(self): - self.connection = None + self.connections = {} + self.threads = Set() + self.lock = thread.allocate_lock() self.queries = [] + + def _get_connection(self): + self.lock.acquire() + try: + # find existing connection + id = threading.currentThread().getName() + if id in self.connections: + connection = self.connections[id] + connection.ping() + return connection + # normalize thread name + if id != 'MainThread': + id = str(thread.get_ident()) + threading.currentThread().setName(id) + # remove deadwood + dead = self.threads - Set([x.getName() for x in threading.enumerate()]) + for name in dead: + self.connections[name].close() + del self.connections[name] + self.threads -= dead + # create new connection + from django.conf.settings import DATABASE_USER, DATABASE_NAME, DATABASE_HOST, DATABASE_PASSWORD + connection = Database.connect(user=DATABASE_USER, db=DATABASE_NAME, + passwd=DATABASE_PASSWORD, host=DATABASE_HOST, conv=django_conversions) + self.connections[id] = connection + self.threads.add(id) + return connection + finally: + self.lock.release() def cursor(self): - from django.conf.settings import DATABASE_USER, DATABASE_NAME, DATABASE_HOST, DATABASE_PASSWORD, DEBUG - if self.connection is None: - self.connection = Database.connect(user=DATABASE_USER, db=DATABASE_NAME, - passwd=DATABASE_PASSWORD, host=DATABASE_HOST, conv=django_conversions) + connection = self._get_connection() + from django.conf.settings import DEBUG if DEBUG: - return base.CursorDebugWrapper(self.connection.cursor(), self) - return self.connection.cursor() + return base.CursorDebugWrapper(connection.cursor(), self) + return connection.cursor() def commit(self): - self.connection.commit() + self.lock.acquire() + try: + id = threading.currentThread().getName() + if id in self.connections: + self.connections[id].commit() + finally: + self.lock.release() def rollback(self): - if self.connection: - try: - self.connection.rollback() - except Database.NotSupportedError: - pass + self.lock.acquire() + try: + id = threading.currentThread().getName() + if id in self.connections: + try: + self.connections[id].rollback() + except Database.NotSupportedError: + pass + finally: + self.lock.release() def close(self): - if self.connection is not None: - self.connection.close() - self.connection = None + self.lock.acquire() + try: + id = threading.currentThread().getName() + if id in self.connections: + connection = self.connections[id] + connection.close() + del self.connections[id] + finally: + self.lock.release() def get_last_insert_id(cursor, table_name, pk_name): cursor.execute("SELECT LAST_INSERT_ID()")
Attachments (8)
Change History (33)
by , 19 years ago
Attachment: | mysql.patch added |
---|
comment:1 by , 19 years ago
I lack the time to check out the solution, but I'm all for fixing the problems described.
comment:2 by , 19 years ago
Owner: | changed from | to
---|
comment:3 by , 19 years ago
Owner: | changed from | to
---|
comment:4 by , 19 years ago
milestone: | → Version 1.0 |
---|
comment:6 by , 19 years ago
For anyone interesting. There is a similar issue with Postgres for which I filed ticket 900.
comment:7 by , 19 years ago
I'm rewrote your patch for latest [1432] revision. There are some differences:
- connection.ping() is always checked before returning cursor, and if connection is invalid then connection is recreated
- for locking I'm using django.utils.synch.RWLock
- instead renaming threads (witch will interfere with i18n stuff) I'm using currentThread() object as a thread id (like Hugo in i18n)
comment:8 by , 19 years ago
Checking connections while pinging -- good catch. Using RWLock --- good refresh, I forgot to update the patch with RWLock, when it became available.
Regarding currentThread() -- I had some reservations about it. I tried to experiment with currentThread() and:
- "If the caller's thread of control was not created through the threading module, a dummy thread object with limited functionality is returned." -- from http://www.python.org/doc/2.4.2/lib/module-threading.html. Web server can create its own threads (or thread pools, if we want to be realistic). WSGI can create threads without threading module, e.g., it can use thread module directly.
- currentThread() is not guaranteed to return unique objects, e.g., they can be repeatedly reused, when threads are recycled.
Basically I was concerned about potential clash because uniqueness is not guaranteed. Maybe it is unfounded. E.g., I cannot remember why (2) was important. It doesn't look dangerous now.
(1) depends on implementation. If we assume that all Python implementations return unique dummy thread objects, it's probably OK. Theoretically it can return the same static object.
OTOH "thread renaming" may suffer from similar problems depending on implementation. I want a sage advice on that. :-)
comment:9 by , 19 years ago
Is this per thread stuff is really required? As far as I can see real problem is in not checking for connection validity and not threading stuff - in my test setup (FastCGI & mod_fcgi) all problems are gone when I only check with connection.ping and reconnect if connection is invalid.
Maybe someone more knowledgeable in django database internals may shed some light on this.
Second solution is using some type of database connection pool instead thread local connections.
comment:10 by , 19 years ago
Actually currentThread() will return unique objects - they can't be reused, the thread names might be reused, but the object ID won't. And that's what is used when you use objects as dictionary keys. The dummy object created might have limited functionality - but we don't use the functionality, we only use the identity, so that's enough.
comment:11 by , 19 years ago
Originally I had problems with multithreaded FCGI under load. It was not related to pings --- MySQL got confused when SQL statements from different threads came using one connection. I implemented "per thread stuff" after advices of MySQL experts. I added pings to prevent time-outs, which happened after some idle time.
INHO, database connection pool is a good idea.
comment:12 by , 19 years ago
It's good to know that currentThread()
is unique. In my original implementation I used get_ident()
as a source of certifiably unique names with silly renaming of threads. It is the only thread-related method, which is explicitly documented as returning unique values, which are suitable as dictionary index (http://www.python.org/doc/2.4.2/lib/module-thread.html). No other thread/treading-related method is documented as such.
Unfortunately I didn't find TLS in python. That's what is used usually for stuff like that.
comment:13 by , 19 years ago
I'm about to port this new patch for Postgres and have a question. Why is "remove deadwood" part needed? As far as I understand connections are always deleted on every request calling db.close().
comment:14 by , 19 years ago
"Remove deadwood" is just an extra security measure against potential sloppiness in closing connections and handling exceptions. After watching MySQL connections die randomly I decided to take no chances.
comment:15 by , 19 years ago
priority: | high → highest |
---|---|
Severity: | major → critical |
Please include this patch ASAP. Otherwise Django is no use in production as fcgi using MySQL database.
comment:16 by , 19 years ago
priority: | highest → high |
---|---|
Severity: | critical → major |
Bumping priority and severity down one rank each; while this may be significant for this particular use-case (MySQL and FastCGI), that use-case has at least two obvious workarounds — switching to another database (preferably PostgreSQL), or switching to mod_python. I'm not saying it shouldn't get fixed; I'm just saying that marking it "highest" and "critical" seems nonsensical.
comment:17 by , 19 years ago
Tom, Postgres' backend in Django suffers from same problems :-). And it also have the patch waiting in #900
comment:18 by , 19 years ago
After reading through #900, the alternatives I'd suggest become "switch to mod_python" or "use the preforking mpm". Again — I never said it shouldn't be fixed, or that it wasn't a PITA if your situation falls under these cases . . . just that there are production-ready workarounds at the moment. :)
comment:19 by , 19 years ago
I doidn't mean to disagree with your resolution. I think it would be critical for 1.0 but not now.
But to clarify: all these bad things happen with mod_python too. Switching to preforking worker would help but many people don't control their server (shared hosting, evil admin, other projects requiring threads etc.)
by , 19 years ago
Attachment: | mysql_trunk_rev2360.diff added |
---|
Patch for trunk 2360, using thread-local storage
by , 19 years ago
Attachment: | mysql_magic-removal_rev2360.diff added |
---|
Patch for magic-removal 2360, using thread-local storage
by , 19 years ago
Attachment: | mysql_trunk_rev2360-1.diff added |
---|
Patch for trunk 2360, using thread-local storage and python 2.3 support
by , 19 years ago
Attachment: | mysql_magic-removal_rev2360-1.diff added |
---|
Patch for magic-removal 2360, using thread-local storage and python 2.3 support
comment:20 by , 19 years ago
If I may, let me suggest a slightly different implementation that I've had good luck with. Instead of using the current thread (or a value derived from the current thread) as a key into a dict, you can use Python's built-in thread-local storage mechanism. That way you don't need to worry about synchronization or removing "deadwood" (I like the name :-). There's one minor hitch, which is that threading.local() isn't available before Python 2.4. So I included an implementation of it for earlier interpreters (e.g. 2.3) that don't have this functionality built-in.
I've attached patches for the trunk and the magic-removal branch, as of revision 2360. The ones to look at are mysql_trunk_rev2360-1.diff and mysql_magic-removal_rev2360-1.diff. Please ignore mysql_trunk_rev2360.diff and mysql_magic-removal_rev2360.diff; Trac reported an internal error when I tried to replace the first versions that I uploaded with newer versions that support Python 2.3, so I had to change the names to get it to work. Sorry for the extra cruft.
comment:21 by , 19 years ago
Oops, I didn't mean to leave the last comment anonymous... that was me.
comment:22 by , 19 years ago
Oh... or, we could include a copy of _thread_local.py from python 2.4 like #1268 does (in current_user.diff).
comment:23 by , 19 years ago
Eugene's code in #1442 is more elegant... IMHO we should use his fix, and close this as a duplicate.
patch as a file