#900 closed defect (duplicate)
[patch] Premature close of postgres connection with multiple threads
Reported by: | | Owned by: | Adrian Holovaty |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | |
Severity: | major | Keywords: | |
Cc: | Maniac@… | Triage Stage: | Unreviewed |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
The Postgres back-end shares one connection across many requests. This is generally okay since psycopg handles it correctly. It looks like somewhere (I haven't found the exact place yet) a user of the shared connection closes it explicitly while other threads are still working with it. This makes those other threads crash with the exception "already closed".
I'm seeing this in action on a page with many images, where each image is generated by a view that queries the database. When running under the development server everything is okay since it's single-process. But when running under Apache, some of the images sometimes turn blank on each reload. Exceptions of course aren't shown inside 'img' tags, but I replaced them with <object data="..." type="text/plain"> and got the messages :-)
I use apache2, mod_python, postgresql 7.4
For now, as a workaround, I just commented out the whole of DatabaseWrapper.close(), so the connection is destroyed when the Apache child dies and kills the interpreter (as I'm guessing). As a result my Postgres complains periodically about dropped connections. Is there any place where the destruction can be hooked (I'm not familiar with WSGI and all this handler stuff)?
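The workaround described above can be sketched roughly as follows. This is an illustration, not the actual Django code of the time: `_FakeConnection` stands in for a real psycopg connection, and the class and method names are assumptions mirroring the backend's `DatabaseWrapper`.

```python
# Sketch of the workaround: DatabaseWrapper.close() becomes a no-op, so the
# shared connection is only destroyed when the Apache child process (and its
# interpreter) dies. _FakeConnection is a stand-in for a psycopg connection.

class _FakeConnection:
    def __init__(self):
        self.closed = False

    def cursor(self):
        if self.closed:
            # This is the "already closed" error other threads were hitting.
            raise Exception("already closed")
        return "cursor"

    def close(self):
        self.closed = True


class DatabaseWrapper:
    def __init__(self):
        self.connection = None  # lazily created shared connection

    def cursor(self):
        if self.connection is None:
            self.connection = _FakeConnection()
        return self.connection.cursor()

    def close(self):
        # Workaround: never close the shared connection mid-flight; another
        # thread may still be using it. The cost, as noted above, is that
        # Postgres logs a dropped connection when the process exits.
        pass
```

With `close()` a no-op, a second request's view can still fetch a cursor after an earlier request "closed" the wrapper, which is exactly why the blank images disappear.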
Attachments (4)
Change History (16)
comment:1 by , 19 years ago
Which Apache worker module are you using? I've seen bugs like this when using the threaded worker that didn't show up after switching to the preforking one (which seems to be faster for serving Django anyway).
comment:2 by , 19 years ago
Yes, threaded one.
I use it because it's the one that is bundled with Ubuntu packages (and described as faster than preforked). But people may have their reasons too.
I believe keeping the connection open would work for the preforking model anyway, if, of course, the problem with graceful exit can be solved :-). The other way is to adopt Eugene's patch from ticket #463, which keeps a list of connections, opening a new connection for every cursor. But then all those necessary thread locks won't help performance...
comment:3 by , 19 years ago
In reality, locks don't hurt performance too much, because the underlying systems usually use something like a thread pool, which reuses existing "free" threads instead of killing old ones and creating new ones. I didn't see any measurable performance degradation.
My biggest beef with my patch is that I don't like the practice of renaming threads; it feels inelegant. But so far I haven't found an alternative solution, and it works fine with Django.
comment:4 by , 19 years ago
This performance degradation increases with load, and much faster than linearly (more like exponentially, though I'm not sure). More threads simultaneously requiring access to an exclusive section of code means longer waiting for that section, and longer waiting in turn means more threads waiting in the queue. Anyway, it's better not to add another thread-exclusion mechanism on top of the psycopg connection, which already has one.
comment:5 by , 19 years ago
True. But the locked area is pretty small, because in most cases it reuses the connection; it locks for a longer time only when adding a new connection. Connection reuse is even faster, and virtually without locking, when an RWLock is used instead of a mutex. See nash's revision of the patch: #463.
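To illustrate the read/write split being discussed: many threads can take the fast "reuse" path concurrently under a read lock, and only the rare "add a connection" path needs exclusive access. This toy reader/writer lock is a sketch for illustration only (no writer preference, can starve writers), not Django's actual `django.utils.synch.RWLock`, though the method names are modeled on that style.

```python
import threading

# Toy reader/writer lock: readers enter and leave freely; a writer waits
# until the reader count drains to zero and then holds the lock exclusively.

class RWLock:
    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0

    def reader_enters(self):
        with self._cond:
            self._readers += 1

    def reader_leaves(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()  # wake any waiting writer

    def writer_enters(self):
        self._cond.acquire()             # blocks new readers from entering
        while self._readers:
            self._cond.wait()            # releases the lock so readers can leave

    def writer_leaves(self):
        self._cond.release()
```

The connection pool would call `reader_enters()`/`reader_leaves()` around the common reuse path and `writer_enters()`/`writer_leaves()` only when creating a new connection, which is why reuse is "virtually without locking".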
comment:6 by , 19 years ago
I researched the mod_python docs and found a hook for child exit. I made a preliminary patch for gracefully closing the postgres connection. But! I introduced a new concept in addition to db.close() -- db.cleanup(). So now:
- db.close(): finishes db access on each request
- db.cleanup(): actually a destructor for DatabaseWrapper
Adrian, Jacob, is it okay architecturally for Django? If yes, I could finish the patch with these changes:
- add cleanup() for all existing db backends as noop
- investigate whether WSGI has such a thing as mod_python's register_cleanup and make use of it too
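The close()/cleanup() split above might look something like the following sketch. The method names come from the comment; everything else is an assumption for illustration: `_FakeConnection` stands in for psycopg, and stdlib `atexit` is used where mod_python would use `req.server.register_cleanup(req, db.cleanup)`.

```python
import atexit

class _FakeConnection:
    """Stand-in for a real psycopg connection (illustration only)."""

    def __init__(self):
        self.closed = False

    def cursor(self):
        return "cursor"

    def close(self):
        self.closed = True


class DatabaseWrapper:
    def __init__(self):
        self.connection = None

    def cursor(self):
        if self.connection is None:
            self.connection = _FakeConnection()
        return self.connection.cursor()

    def close(self):
        # Per-request hook: finish this request's db access but keep the
        # shared connection open for reuse by later requests.
        pass

    def cleanup(self):
        # Real destructor: close the underlying connection exactly once,
        # at child-process exit (mod_python) or interpreter exit (atexit).
        if self.connection is not None:
            self.connection.close()
            self.connection = None


db = DatabaseWrapper()
# mod_python equivalent (assumption): req.server.register_cleanup(req, db.cleanup)
atexit.register(db.cleanup)
```

This keeps the two verbs cleanly separated: per-request bookkeeping in close(), process-lifetime teardown in cleanup(), which is what avoids the periodic dropped-connection complaints from Postgres.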
BTW, with the connection not being closed on each request, things seem a good bit faster. I didn't measure it, though; it just looks that way.
comment:7 by , 19 years ago
Cc: | added |
---|
I've redone the whole thing. The idea of reusing a single connection was nice performance-wise, but since it requires semantic changes to the backend (splitting close() into two different verbs), I understand it's not very desirable now that we're this close to 1.0.
So I took the idea from the twin ticket for MySQL (#463) and added thread-safety through Django's RWLock on all shared operations (namely cursor(), close() and rollback()). However, unlike with MySQL, there is no need to actually create a new connection each time, since the connections themselves are thread-safe here. Instead I just keep a list of all threads that access a connection, and in db.close() the connection is closed only when the last thread leaves it.
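The approach described above can be sketched as follows. This is not the actual patch: the class name is invented, and a plain mutex stands in for Django's RWLock to keep the sketch short. The core idea is the same: track the threads using the shared connection and really close it only when the last one leaves.

```python
import threading

class SharedConnection:
    """Shared db connection, really closed only by the last thread using it."""

    def __init__(self, connect):
        self._connect = connect        # factory, e.g. a psycopg connect call
        self._connection = None
        self._users = set()            # threads currently using the connection
        self._lock = threading.Lock()  # the patch uses Django's RWLock instead

    def cursor(self):
        with self._lock:
            if self._connection is None:
                self._connection = self._connect()
            self._users.add(threading.current_thread())
            return self._connection.cursor()

    def close(self):
        with self._lock:
            self._users.discard(threading.current_thread())
            if not self._users and self._connection is not None:
                # Last thread out: now it is safe to close for real.
                self._connection.close()
                self._connection = None
```

With this bookkeeping, a thread calling close() while others still hold the connection only removes itself from the user set, so nobody else hits "already closed".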
This also has a nice effect on performance, since simultaneous threads still share a connection. But it doesn't live forever and doesn't get forgotten (is that English? :-)).
I've been running it for some time with my monster app, where a browser loads hundreds of images, each accessing the db on the server. Requests are always successful, the connection count always goes to 0 after the work is done, and Postgres never complains about dropped connections.
Patch follows.
comment:8 by , 19 years ago
Summary: | Premature close of postgres connection with multiple threads → [patch] Premature close of postgres connection with multiple threads |
---|
comment:9 by , 19 years ago
I finally got to the new magicless branch and found that renaming "db" to "connection" didn't break things as much as I was afraid it would :-). I've made a version of my patch for the branch. It looks pretty much the same except for filenames and maybe some line-count jiggle...
comment:10 by , 19 years ago
Component: | Admin interface → Database wrapper |
---|
comment:11 by , 19 years ago
Adrian, Jacob, is there anything wrong with the patch that prevents it from being checked in? Or is it just a matter of time?