Opened 15 years ago

Closed 15 years ago

Last modified 11 years ago

#9878 closed Uncategorized (invalid)

Database backend leaves idling connections when used in multiple threads

Reported by: Fredde Owned by: nobody
Component: Database layer (models, ORM) Version: 1.0
Severity: Normal Keywords:
Cc: Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

When creating and saving model instances in multiple threads, the database backend leaves idling connections. This will finally cause a "Too many connections"-exception.

I'm using Django 1.0 with a MySQL database

This script will leave ~10-20 open connections

def createArticle():
    a = Article(title = '...', text = '...')
    a.save()

threads = []
for i in xrange(200):
    t = threading.Thread(target = createArticle)
    t.start()
    threads.append(t)

for t in threads:
    t.join()

To see the active connections in the mysql client

mysql> show full processlist;
+--------+------+-----------+------+---------+------+-------+-----------------------+
| Id     | User | Host      | db   | Command | Time | State | Info                  |
+--------+------+-----------+------+---------+------+-------+-----------------------+
| 463987 | ---- | localhost | NULL | Query   |    0 | NULL  | show full processlist |
| 463994 | ---- | localhost | ---- | Sleep   |   44 |       | NULL                  |
| 464020 | ---- | localhost | ---- | Sleep   |   85 |       | NULL                  |
| 464042 | ---- | localhost | ---- | Sleep   |   85 |       | NULL                  |
| 464045 | ---- | localhost | ---- | Sleep   |   85 |       | NULL                  |
| 464048 | ---- | localhost | ---- | Sleep   |   85 |       | NULL                  |
| 464067 | ---- | localhost | ---- | Sleep   |   85 |       | NULL                  |
| 464093 | ---- | localhost | ---- | Sleep   |   85 |       | NULL                  |
| 464105 | ---- | localhost | ---- | Sleep   |   85 |       | NULL                  |
| 464136 | ---- | localhost | ---- | Sleep   |   85 |       | NULL                  |
| 464139 | ---- | localhost | ---- | Sleep   |   85 |       | NULL                  |
| 464165 | ---- | localhost | ---- | Sleep   |   85 |       | NULL                  |
| 464166 | ---- | localhost | ---- | Sleep   |   85 |       | NULL                  |
| 464183 | ---- | localhost | ---- | Sleep   |   85 |       | NULL                  |
+--------+------+-----------+------+---------+------+-------+-----------------------+
14 rows in set (0.00 sec)

Change History (7)

comment:1 by Malcolm Tredinnick, 15 years ago

Resolution: invalid
Status: newclosed

The particular example you give is not the way Django is used, however. So, yes, if you do something to deliberately create lots of connections, lot of connections will be created. However, Django closes its connection to the database at the end of each request/response cycle, so there is only one connection in operation per thread or process handling requests and responses. If you're not using the HTTP layer, it's still only one connection per thread of execution and you are in complete control of the number of threads you create.

So there's not really any Django bug here. The connection usage is predictable and controlled.

comment:2 by norvegh, 15 years ago

Hi,

You are correct in that we are not using the HTTP layer. We have a back-end application and it was very convenient to reuse Django's model API that we use for the front-end Webapp. However there is clearly a bug in Django, as the connections are not closed even when the threads using those connections are terminated. Those DB connections are just hanging there for a long time (or forever, don't really know that).
We know that this is not exactly the way Django is supposed to be used, but Django has grown too powerful and useful to limit it only to the front-end.

norvegh

in reply to:  1 comment:3 by anonymous, 15 years ago

Resolution: invalid
Status: closedreopened

Replying to mtredinnick:

However, Django closes its connection to the database at the end of each request/response cycle

This is not true even when using the HTTP layer (only tested with the development server). After all the threads have joined, there is still idling connections left behind. Try the code I posted erlier and you will see.

comment:4 by James Bennett, 15 years ago

Resolution: invalid
Status: reopenedclosed

What Malcolm means is that the database connection Django itself created automatically for the ORM in the current request/thread will be closed. Additional connections that you yourself cause to be opened are your responsibility to clean up (just as files you open are your responsibility to close, mail-server connections you open are your responsibility to close, etc., etc.).

comment:5 by James Bennett, 15 years ago

(and to see why this is so for the case of DB connections, note that Django's automatic connection-closing happens on the request_finished signal, which fires only in a thread that's processing an HTTP request, not in any other threads you may have spawned on your own. Django takes care of its thread, and you should take care of yours)

comment:6 by Fredde, 15 years ago

So at the end of each thread django.db.connection.close() should be called? But still, not every thread leaves an open connection. Only about 10% of them, so it looks like a race condition to me.

comment:7 by cameel2+django@…, 11 years ago

Easy pickings: unset
Severity: Normal
Type: Uncategorized
UI/UX: unset

I just got bitten by this after migration from Django 1.3.4 to 1.4.2. Test runner started to hang during cleanup (on an instruction that issues 'drop database test_<appname>' query). Turns out that's because the threads one of the methods under test is spawning were not closing the database connection (even though they themselves seem exit correctly) and server was waiting for them to do so. show full processlist; in mysql client was showing the threads as sleeping with no command to execute.

I understand that this is a bug in my code, but if the connections need to be closed explicitly, wouldn't it be better if I was also required to open them explicitly? I wasn't even aware that I'm using the database in those threads. They don't explicitly import any models or database-related code from Django. The fact that they're using it is the actual bug and it would be better if it caused an error instead of failing silently.

Another thing is that the test runner should be prepared to handle misbehaving code. Wouldn't it make sense for it check if there are connections from other threads still open (is it possible if the threads have already exited and the runner did not have any references to them?) and if so, print an error and forcefully close them? Letting it just hang like that makes debugging harder.

Note: See TracTickets for help on using tickets.
Back to Top