Opened 10 months ago

Last modified 3 days ago

#33497 new New feature

Database persistent connections do not work with ASGI in 4.0

Reported by: Stenkar Owned by: nobody
Component: Database layer (models, ORM) Version: 4.0
Severity: Normal Keywords: ASGI, Database, async
Cc: Carlton Gibson, Florian Apolloner, Andrew Godwin, Anders Kaseorg, Patryk Zawadzki, Mikail, Alex, joeli Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Hello!

I've discovered that after upgrading Django to ver. 4 (currently 4.0.2), I started to see database FATAL: sorry, too many clients already errors in the Sentry.

For a database, I'm using containerized Postgres 14.1 and the connection between Django and Postgres is done by Unix socket.

Database settings look like this:

DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": environ.get("POSTGRES_DB"),
        "USER": environ.get("POSTGRES_USER"),
        "PASSWORD": environ.get("POSTGRES_PASSWORD"),
        "HOST": environ.get("POSTGRES_HOST"),
        "PORT": environ.get("POSTGRES_PORT"),
        "CONN_MAX_AGE": 3600
    }
}

In production, I'm using ASGI (Uvicorn 0.17.4) to run the Django application (4 workers).

When everything is deployed and I have surfed around the Django admin site, then checking Postgres active connections, using SELECT * FROM pg_stat_activity; command, I see that there are 30+ Idle connections made from Django.

After surfing more around the admin site, I can see that more Idle connections have been made by Django.

It looks like the database connections are not reused. At one point some of the Idle connections are closed, but then again more connections have been made when more DB queries are made by Django.

I have one Django 3.2.11 project running on production and all the settings are the same, there are always max 10 persistent connections with the database and everything works fine.

Should that be like this in version 4.0?

Change History (16)

comment:1 Changed 10 months ago by Stenkar

Component: UncategorizedDatabase layer (models, ORM)

comment:2 Changed 10 months ago by Stenkar

Summary: Database persistent connection do not work with ASGI in 4.0Database persistent connections do not work with ASGI in 4.0

comment:3 Changed 10 months ago by Mariusz Felisiak

Cc: Carlton Gibson added
Resolution: needsinfo
Status: newclosed

Thanks for the report. ​Django has a routine to clean up old connections that is tied into the request-response life-cycle, so idle connections should be closed. However, I don't think you've explained the issue in enough detail to confirm a bug in Django. This can be an issue in psycopg2, uvicorn, or in custom middlewares (see #31905) it's hard to say without a reproduce.

Please reopen the ticket if you can debug your issue and provide details about why and where Django is at fault, or if you can provide a sample project with reproducible scenario.

Last edited 10 months ago by Mariusz Felisiak (previous) (diff)

comment:4 Changed 10 months ago by Carlton Gibson

Cc: Florian Apolloner Andrew Godwin added

Hi Stenkar.

Would you be able to put together a minimal test-project here, so that folks can reproduce quickly.

This may be due to Django 4.0 having per-request contexts for the thread sensitivity of sync_to_async() — See #32889.
If so, that's kind-of a good thing, in that too many open resources is what you'd expect in async code, and up to now, we've not been hitting that, as we've essentially been running serially.

Immediate thought for a mitigation would be to use a connection pool.

Equally, can we limit the number of threads in play using asgiref's `AGSI_THREADS` environment variable? (But see the discussion on the related Daphne issue about whether that's the right place for that at all.)

This is likely a topic we'll need to deal with (eventually) in Django: once you start getting async working, you soon hit resource limits, and handling that with structures for sequencing and maximum parallelism is one of those hard-batteries™ that we maybe should provide. 🤔

comment:5 Changed 10 months ago by Florian Apolloner

I think https://github.com/django/asgiref/pull/306#issuecomment-991959863 might play into this as well. By using a single thread per connection, persistent connections will never get clean up.

EDIT:// By Design I do not think that persistent connections currently work in Django's async mode. I think this should be reproducible with any async view that does a database connection.

Last edited 10 months ago by Florian Apolloner (previous) (diff)

comment:6 Changed 10 months ago by Stenkar

Resolution: needsinfo
Status: closednew

Thank you for the comments.

I created a project where it's possible to spin up a minimal project with docker-compose.

https://github.com/KStenK/django-ticket-33497

I'm not sure that I can find where or what goes wrong in more detail, but I'll give it a try.

comment:7 Changed 10 months ago by Carlton Gibson

Triage Stage: UnreviewedAccepted
Type: BugNew feature

OK, thanks Stenkar.

I'm going to accept this as a New Feature. It's a change in behaviour from 3.2, but it's precisely in allowing multiple executors for sync_to_async() that it comes up. (In 3.2 it's essentially single-threaded, with only a single connection actually being used.) We need to improve the story here, but it's not a bug in #32889 that we don't have async compatible persistent DB connections yet. (I hope that makes sense.)

A note to the docs about this limitation may be worthwhile.

comment:8 Changed 10 months ago by Florian Apolloner

Thinking more about this I do not think the problem is new. We have the same problem when persistent connections are used and a new thread is generated per request (for instance in runserver.py). Usually (ie with gunicorn etc) one has a rather "stable" pool of processes or requests; as soon as you switch to new threads per connection this will break. In ASGI this behavior is probably more pronounced since by definition every request is in it's own async task context which then propagates down to the db backend as new connection per request (which in turn will also never reuse connections because the "thread" ids change).

All in all I think we are finally at the point where we need a connection pool in Django. I'd strongly recommend to use something like https://github.com/psycopg/psycopg/tree/master/psycopg_pool/psycopg_pool but abstracted to work for all databases in Django.

comment:9 Changed 8 months ago by Anders Kaseorg

Cc: Anders Kaseorg added

Possibly related: #33625 (for memcached connections).

comment:10 Changed 3 months ago by Patryk Zawadzki

Cc: Patryk Zawadzki added

comment:11 Changed 3 months ago by Mikail

Cc: Mikail added

comment:12 Changed 3 months ago by Patryk Zawadzki

This is marked as a "new feature," but it's an undocumented breaking change between 3.2 and 4.0. Connections that were previously reused and terminated are now just left to linger.

The request_finished signal does not terminate them as they are not idle for longer than MAX_CONN_AGE.

The request_started signal does not terminate them as it never sees those connections due to the connection state being asgiref.local and discarded after every request.

Allowing parallel execution of requests is a great change, but I feel Django should outright refuse to start if MAX_CONN_AGE is combined with ASGI.

comment:13 Changed 3 weeks ago by Carlton Gibson

Keywords: async added

comment:14 Changed 7 days ago by Alex

Cc: Alex added

comment:15 in reply to:  12 Changed 3 days ago by joeli

Replying to Patryk Zawadzki:

This is marked as a "new feature," but it's an undocumented breaking change between 3.2 and 4.0. Connections that were previously reused and terminated are now just left to linger.

The request_finished signal does not terminate them as they are not idle for longer than MAX_CONN_AGE.

The request_started signal does not terminate them as it never sees those connections due to the connection state being asgiref.local and discarded after every request.

Allowing parallel execution of requests is a great change, but I feel Django should outright refuse to start if MAX_CONN_AGE is combined with ASGI.

I agree. I would even go as far as calling this a regression, not just an undocumented breaking change. No matter the reasons behind it or the technical superiority of the new solution, fact of the matter stands that in 3.2 ASGI mode our code worked fine and reused connections. In 4.x it is broken unless using MAX_CONN_AGE = 0, which disables a feature in Django that used to work.

comment:16 Changed 3 days ago by joeli

Cc: joeli added
Note: See TracTickets for help on using tickets.
Back to Top