#34914 closed Bug (duplicate)
Postgres connection is not restarted after it's dropped by the server, even if we use CONN_HEALTH_CHECKS=True
| Reported by: | Vitor Menegat | Owned by: | nobody |
|---|---|---|---|
| Component: | Database layer (models, ORM) | Version: | 4.2 |
| Severity: | Normal | Keywords: | |
| Cc: | Vitor Menegat, Marcelo Frantz | Triage Stage: | Unreviewed |
| Has patch: | no | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description
I'm using a Django management command to start a container that executes DB queries (without any transaction scope): python manage.py <mycommand>
These are the relevant database configs I'm using:
DATABASES = {
"default": env.db("DATABASE_URL"),
}
DATABASES["default"]["ATOMIC_REQUESTS"] = True
DATABASES["default"]["CONN_HEALTH_CHECKS"] = True
DATABASES["default"]["CONN_MAX_AGE"] = None ### I've also tried setting this to 0. Same issue
When Django loses the DB connection for whatever reason, I see the error server closed the connection unexpectedly, but then the error psycopg2.InterfaceError: connection already closed happens over and over again, every time it tries to execute a SQL query. This is very annoying because the container needs to be re-created in other for the app to function properly.
Investigating this I found here that the connection health check is SKIPPED if (https://github.com/django/django/blob/main/django/db/backends/base/base.py#L576-L580):
- self.connection is None; OR
- self.health_checks_enabled = False; OR
- self.health_check_done = True
I created a test command to understand what was happening with the connection, added some logs to see the connection attributes, and found that the connection health check is always skipped because self.health_check_done is never set to False.
My question is: shouldn't the connection be health checked when self.errors_occurred flips to True?
Here's the code I used to test:
import logging
import time
from django.core.management import BaseCommand
from django.db import connection
logger = logging.getLogger(__name__)
class Command(BaseCommand):
help = ""
def handle(self, *args, **kwargs):
while True:
time.sleep(1)
try:
logger.info(f"health_check_enabled: {connection.health_check_enabled}")
logger.info(f"health_check_done: {connection.health_check_done}")
logger.info(f"errors_occurred: {connection.errors_occurred}")
with connection.cursor() as cursor:
cursor.execute("SELECT 1")
logger.info("query executed")
except Exception as e:
logger.exception(e)
Here's the relevant part of the logs:
health_check_enabled: False ## the connect() method was not yet called
health_check_done: False ## the connect() method was not yet called
errors_occurred: False
query executed
health_check_enabled: True
health_check_done: True
errors_occurred: False
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
Traceback (most recent call last):
File "/usr/local/lib/python3.11/site-packages/django/db/backends/utils.py", line 87, in _execute
return self.cursor.execute(sql)
^^^^^^^^^^^^^^^^^^^^^^^^
psycopg2.OperationalError: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
[... Traceback omitted ...]
health_check_enabled: True
health_check_done: True
errors_occurred: True ## This is set to True now, as expected
Traceback (most recent call last):
File "/usr/local/lib/python3.11/site-packages/django/db/backends/base/base.py", line 308, in _cursor
return self._prepare_cursor(self.create_cursor(name))
^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/django/utils/asyncio.py", line 26, in inner
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/django/db/backends/postgresql/base.py", line 330, in create_cursor
cursor = self.connection.cursor()
^^^^^^^^^^^^^^^^^^^^^^^^
psycopg2.InterfaceError: connection already closed
[...Traceback omitted ...]
health_check_enabled: True
health_check_done: True
errors_occurred: True
Traceback (most recent call last):
File "/usr/local/lib/python3.11/site-packages/django/db/backends/base/base.py", line 308, in _cursor
return self._prepare_cursor(self.create_cursor(name))
^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/django/utils/asyncio.py", line 26, in inner
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/django/db/backends/postgresql/base.py", line 330, in create_cursor
cursor = self.connection.cursor()
^^^^^^^^^^^^^^^^^^^^^^^^
psycopg2.InterfaceError: connection already closed
[... Traceback omitted ...]
health_check_enabled: True
health_check_done: True
errors_occurred: True
Traceback (most recent call last):
File "/usr/local/lib/python3.11/site-packages/django/db/backends/base/base.py", line 308, in _cursor
return self._prepare_cursor(self.create_cursor(name))
^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/django/utils/asyncio.py", line 26, in inner
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/django/db/backends/postgresql/base.py", line 330, in create_cursor
cursor = self.connection.cursor()
^^^^^^^^^^^^^^^^^^^^^^^^
psycopg2.InterfaceError: connection already closed
[... Traceback omitted ...]
Change History (5)
comment:1 by , 2 years ago
| Cc: | added |
|---|
comment:2 by , 2 years ago
| Cc: | added |
|---|
comment:4 by , 2 years ago
| Resolution: | → duplicate |
|---|---|
| Status: | new → closed |
comment:5 by , 21 months ago
I am not executing raw sql instead I am using orm based queries.
Can I skip this line ?
with connection.cursor() as cursor:
cursor.execute("SELECT 1")
will a new connection always be available for orm inside the handle method to make execute queries after using close_old_connections() or anything else needed.
I think this is closely related to #32589 and #14845.
Django being mainly a web framework most of its connection management is based around HTTP request lifecycle handling. For example, Django automatically checks whether database connections should be closed when starting and finishing HTTP requests.
The way management commands were designed though didn't account for lifecycle events for the framework to hook into and manage connections appropriately. In other words,
BaseCommand.handleis a blackbox to the framework and it cannot safely determine that a unit of work is completed and that is is safe to close a connection.In the case of long-running / demonized commands that are meant to run for longer than
CONN_MAX_AGE(or the TTL of the underlying server connection) it is the responsibility of the user to callclose_old_connections. This is something that is unfortunately not documented but given how common this use case and the lack of progress on #14845 in the past years I think we should consider a documentation admonition that focuses solely on the long-runner command use case if that means documenting this pattern more broadly.If you feel so inclined to get this documented it appears to be a good opportunity to resolve a 10 years old issue ;)
TL;DR the following should address your issue but it's unfortunately not documented