#31134 closed Bug (invalid)
ASGI does not closing DB connections properly.
Reported by: | Wojciech Bartosiak | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 3.0 |
Severity: | Normal | Keywords: | asgi database |
Cc: | Andrew Godwin, Carlton Gibson | Triage Stage: | Accepted |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
I’m testing Django 3.0.1 async features and ASGI performance and I have noticed that when in the configuration:
DATABASES = { 'default': { 'ENGINE': 'django.db.backends.postgresql', 'NAME': example, 'USER': example, 'PASSWORD': example, 'HOST': 'localhost', 'PORT': 5432, 'ATOMIC_REQUESTS': True } }
Or when ATOMIC_REQUESTS
is set to False
but in View
I have something like this:
from django.http import HttpResponse import datetime from django.db import transaction def current_datetime(request): with transaction.atomic(): now = datetime.datetime.now() html = "<html><body>It is now %s.</body></html>" % now return HttpResponse(html)
Connections/sessions are not being closed and every new request opens a new connection causing:
django.db.utils.OperationalError: FATAL: sorry, too many clients already
This problem does not exist in WSGI mode and/or with/without django_postgrespool2 (with WSGI) this does not exist.
When a server in ASGI mode attacker can use this vector to cut off service.
ASGI command:
uvicorn --workers 5 --interface asgi3 --loop asyncio --lifespan off project.asgi:application
WSGI command:
uvicorn --workers 5 --interface wsgi --lifespan off project.wsgi:application
Change History (12)
comment:1 by , 5 years ago
Cc: | added |
---|---|
Severity: | Normal → Release blocker |
Summary: | ASGI does not closing DB connections properly → ASGI does not closing DB connections properly. |
Triage Stage: | Unreviewed → Accepted |
follow-up: 3 comment:2 by , 5 years ago
Hmmm. This isn't as straight-forward as it looks.
Connections/sessions are not being closed and every new request opens a new connection
Testing locally, this isn't true for me. I'm seeing connections go up to the 10-15 range, but then drop again at a later point. It's certainly not the case that connection numbers are rising monotonically with the request count.
Thus, is it PostgreSQL config that's in play? Wojciech, could you help me pin down the exact reproduce case?
comment:3 by , 5 years ago
machine x86_64 release 19.2.0 sysname Darwin version Darwin Kernel Version 19.2.0: Sat Nov 9 03:47:04 PST 2019; root:xnu-6153.61.1~20/RELEASE_X86_64 python 3.7.3
My docker-compose.yml
file:
version: "3" services: my_db: image: postgres container_name: "my_db" environment: - POSTGRES_USER= example - POSTGRES_PASSWORD= example - POSTGRES_DB= example expose: - "5432" ports: - "5432:5432" volumes: - ./persistence/postgres-data:/var/lib/postgresql/data
PostgreSQL version:
# select version(); PostgreSQL 12.0 (Debian 12.0-2.pgdg100+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 8.3.0-6) 8.3.0, 64-bit
Command for testing:
$ ab -c 10 -n 100 http://localhost:8000/a/
comment:4 by , 5 years ago
follow-up: 6 comment:5 by , 5 years ago
It looks like connections could be closed better — yes, once the server is idle open connections are lingering (I suspect they're not closed until Django tries to re-use them in subsequent requests) — but (as yet) it's not looking like there's an in issue with a runaway number of connections. Upping -c
to 200 and higher, the relation to the number of open connections seems to be dependent on the number of workers, rather than the request count.
Example: finishing a test with 4 workers and 200 concurrent requests, I have open DB connections maxing out (and stable) at around 70. From there making single requests one at a time sees the connection count drop by one on most requests — some it skips, presumably because the connection is reused and still valid.
However, CONN_MAX_AGE=0
(the default) should imply the connection being closed each request. (Not necessarily the ideal combination with atomic requests for performance testing but...) This is normally done in the request_finished
signal, so I'll look into why that's not happening.
Thanks for the follow-up!
comment:6 by , 5 years ago
Replying to Carlton Gibson:
It looks like connections could be closed better
When I have switched to django-postgrespool2
I have noticed error described here: https://docs.sqlalchemy.org/en/13/errors.html#error-3o7r
Maybe this information will give you some hint about connections.
HINT: Error 3o7r
exists for WSGI and ASGI connections.
This is my configuration to reproduce this error:
DATABASES = { 'default': { 'ENGINE': 'django_postgrespool2', 'NAME': 'example', 'USER': 'example', 'PASSWORD': 'example', 'HOST': 'localhost', 'PORT': 5432, 'ATOMIC_REQUESTS': False, 'DISABLE_SERVER_SIDE_CURSORS': True } } DATABASE_POOL_CLASS = 'sqlalchemy.pool.QueuePool' DATABASE_POOL_ARGS = { 'max_overflow': 5, 'pool_size': 1, 'recycle': 5 }
Uvicorn start commands:
ASGI:
uvicorn --workers 5 --interface asgi3 --loop asyncio --lifespan off project.asgi:application
WSGI:
uvicorn --workers 5 --interface wsgi --lifespan off project.wsgi:application
Testing command:
% ab -c 20 -n 1000 -s 120 http://localhost:8000/a/ This is ApacheBench, Version 2.3 <$Revision: 1843412 $> [...] Percentage of the requests served within a certain time (ms) 50% 7 66% 8 75% 8 80% 9 90% 10 95% 14 98% 28 99% 79 100% 30038 (longest request)
Error:
sqlalchemy.exc.TimeoutError: QueuePool limit of size 1 overflow 5 reached, connection timed out, timeout 30 (Background on this error at: http://sqlalche.me/e/3o7r)
comment:7 by , 5 years ago
Ok. I must admit that this ticket can be closed.
I finally understood DB connection lifecycle and uvicorn workers. First of all uvicorn for every worker starts 10 threads so for 2 workers there will be 20 opened connections, for 10 workers it will be 100 connections.
Django requires to have non-transactional requests to close the connection after CONN_MAX_AGE
and because when I set ATOMIC_REQUESTS
to True
, Django will always leave the open connection but firstly when the connection was open longer than CONN_MAX_AGE
, Django will reestablish that connection once again.
follow-up: 9 comment:8 by , 5 years ago
Resolution: | → invalid |
---|---|
Severity: | Release blocker → Normal |
Status: | new → closed |
OK, thanks for the follow-up: that's saved me a cycle tracking that down. (Note to self: always try with uvicorn and Daphne to make sure it's not the protocol server...)
Please do come back if you see follow-up issues.
comment:9 by , 5 years ago
Replying to Carlton Gibson:
OK, thanks for the follow-up: that's saved me a cycle tracking that down. (Note to self: always try with uvicorn and Daphne to make sure it's not the protocol server...)
Please do come back if you see follow-up issues.
I'm trying to understand the implementation and when trying to use WGSI with QueuePool with CONN_MAX_AGE=0
Django trying to close the connection but QueuePool
just returning it to the pool.
This works fine even when I have the sum of pool_size
and max_overflow
less than 10 (threads per worker) and ab -c30 -n10000
but when I change to ASGI, then after 20-30 requests problem with Queue pool limit size pops out, so this made me think that the closing connection in ASGI is buggy.
Andrew Godwin wrote:
[...]more investigation would be needed to determine if this is the ASGI handler not calling close_old_connections correctly (there's a chance it gets called in the wrong thread, due to the way signals interact with async).
I'm not good in core Django implementations but maybe some more experienced developer would like to investigated this?
follow-up: 11 comment:10 by , 5 years ago
Having looked at it at some length today, it seems it's working as expected. (Not sure about the QueuePool issue, but that looks unrelated—not the issue reported here—but also expected: you've overwhelmed the QueuePool, if I read that SQLAlchemy page correctly...?) Very happy though to look a minimal projects demonstrating a concrete issue with Django.
comment:11 by , 5 years ago
Replying to Carlton Gibson:
you've overwhelmed the QueuePool, if I read that SQLAlchemy page correctly...?
With WSGI and this simple transaction example, the connections are being returned correctly, even I'm overwhelming the Queue. With ASGI returning (closing) the DB connection somehow works wrong, maybe signals are not working properly with ASGI?
comment:12 by , 5 years ago
Finally, after hours, hours diving into the code, I finally understood how Django sets the connection, release connection and all that signal handlers. I can confirm a couple of things:
- ASGI and WSGI works fine with connections
- When ASGI enabled, the
PgBouncer
should be used withCONN_MAX_AGE = 0
andDISABLE_SERVER_SIDE_CURSORS = True
django_postgrespool2
has a bug with connection pooling because accessing to the pool is not thread-safe and adding simplethreading.Lock
fixed issue - so this needs to be reported- The best performance I have with simple
Gunicorn + WSGI
with--workers X
,--threads 1
,CONN_MAX_AGE
is not0
X
is (2 x $num_cores) + 1
Thanks for this report.