Opened 13 months ago

Closed 13 months ago

Last modified 9 months ago

#34914 closed Bug (duplicate)

Postgres connection is not restarted after it's dropped by the server, even if we use CONN_HEALTH_CHECKS=True

Reported by: Vitor Menegat Owned by: nobody
Component: Database layer (models, ORM) Version: 4.2
Severity: Normal Keywords:
Cc: Vitor Menegat, Marcelo Frantz Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

I'm using a Django management command to start a container that executes DB queries (without any transaction scope): python manage.py <mycommand>

These are the relevant database configs I'm using:

DATABASES = {
    "default": env.db("DATABASE_URL"),
}

DATABASES["default"]["ATOMIC_REQUESTS"] = True
DATABASES["default"]["CONN_HEALTH_CHECKS"] = True
DATABASES["default"]["CONN_MAX_AGE"] = None                  ### I've also tried setting this to 0. Same issue

When Django loses the DB connection for whatever reason, I see the error server closed the connection unexpectedly, but then the error psycopg2.InterfaceError: connection already closed happens over and over again, every time it tries to execute a SQL query. This is very annoying because the container needs to be re-created in other for the app to function properly.

Investigating this I found here that the connection health check is SKIPPED if (https://github.com/django/django/blob/main/django/db/backends/base/base.py#L576-L580):

  • self.connection is None; OR
  • self.health_checks_enabled = False; OR
  • self.health_check_done = True

I created a test command to understand what was happening with the connection, added some logs to see the connection attributes, and found that the connection health check is always skipped because self.health_check_done is never set to False.

My question is: shouldn't the connection be health checked when self.errors_occurred flips to True?

Here's the code I used to test:

import logging
import time 

from django.core.management import BaseCommand
from django.db import connection


logger = logging.getLogger(__name__)


class Command(BaseCommand):
    help = ""

    def handle(self, *args, **kwargs):
        while True:
            time.sleep(1)
            try:
                logger.info(f"health_check_enabled: {connection.health_check_enabled}")
                logger.info(f"health_check_done: {connection.health_check_done}")
                logger.info(f"errors_occurred: {connection.errors_occurred}")
                with connection.cursor() as cursor:
                    cursor.execute("SELECT 1")
                logger.info("query executed")
            except Exception as e:
                logger.exception(e)

Here's the relevant part of the logs:

health_check_enabled: False   ## the connect() method was not yet called
health_check_done: False      ## the connect() method was not yet called
errors_occurred: False

query executed

health_check_enabled: True
health_check_done: True
errors_occurred: False

server closed the connection unexpectedly
	This probably means the server terminated abnormally
	before or while processing the request.
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/django/db/backends/utils.py", line 87, in _execute
    return self.cursor.execute(sql)
           ^^^^^^^^^^^^^^^^^^^^^^^^
psycopg2.OperationalError: server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.
[... Traceback omitted ...]

health_check_enabled: True
health_check_done: True
errors_occurred: True         ## This is set to True now, as expected

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/django/db/backends/base/base.py", line 308, in _cursor
    return self._prepare_cursor(self.create_cursor(name))
                                ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/django/utils/asyncio.py", line 26, in inner
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/django/db/backends/postgresql/base.py", line 330, in create_cursor
    cursor = self.connection.cursor()
             ^^^^^^^^^^^^^^^^^^^^^^^^
psycopg2.InterfaceError: connection already closed
[...Traceback omitted ...]

health_check_enabled: True
health_check_done: True
errors_occurred: True

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/django/db/backends/base/base.py", line 308, in _cursor
    return self._prepare_cursor(self.create_cursor(name))
                                ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/django/utils/asyncio.py", line 26, in inner
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/django/db/backends/postgresql/base.py", line 330, in create_cursor
    cursor = self.connection.cursor()
             ^^^^^^^^^^^^^^^^^^^^^^^^
psycopg2.InterfaceError: connection already closed
[... Traceback omitted ...]

health_check_enabled: True
health_check_done: True
errors_occurred: True

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/django/db/backends/base/base.py", line 308, in _cursor
    return self._prepare_cursor(self.create_cursor(name))
                                ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/django/utils/asyncio.py", line 26, in inner
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/django/db/backends/postgresql/base.py", line 330, in create_cursor
    cursor = self.connection.cursor()
             ^^^^^^^^^^^^^^^^^^^^^^^^
psycopg2.InterfaceError: connection already closed
[... Traceback omitted ...]

Change History (5)

comment:1 by Vitor Menegat, 13 months ago

Cc: Vitor Menegat added

comment:2 by Marcelo Frantz, 13 months ago

Cc: Marcelo Frantz added

comment:3 by Simon Charette, 13 months ago

I think this is closely related to #32589 and #14845.

Django being mainly a web framework most of its connection management is based around HTTP request lifecycle handling. For example, Django automatically checks whether database connections should be closed when starting and finishing HTTP requests.

The way management commands were designed though didn't account for lifecycle events for the framework to hook into and manage connections appropriately. In other words, BaseCommand.handle is a blackbox to the framework and it cannot safely determine that a unit of work is completed and that is is safe to close a connection.

In the case of long-running / demonized commands that are meant to run for longer than CONN_MAX_AGE (or the TTL of the underlying server connection) it is the responsibility of the user to call close_old_connections. This is something that is unfortunately not documented but given how common this use case and the lack of progress on #14845 in the past years I think we should consider a documentation admonition that focuses solely on the long-runner command use case if that means documenting this pattern more broadly.

If you feel so inclined to get this documented it appears to be a good opportunity to resolve a 10 years old issue ;)

TL;DR the following should address your issue but it's unfortunately not documented

import logging
import time 

from django.core.management import BaseCommand
from django.db import close_old_connections, connection


logger = logging.getLogger(__name__)


class Command(BaseCommand):
    def handle(self, *args, **kwargs):
        while True:
            close_old_connections()  # <- this ensures DATABASES_CON* settings are honoured.
            time.sleep(1)
            try:
                logger.info(f"health_check_enabled: {connection.health_check_enabled}")
                logger.info(f"health_check_done: {connection.health_check_done}")
                logger.info(f"errors_occurred: {connection.errors_occurred}")
                with connection.cursor() as cursor:
                    cursor.execute("SELECT 1")
                logger.info("query executed")
            except Exception as e:
                logger.exception(e)
Last edited 13 months ago by Simon Charette (previous) (diff)

comment:4 by Mariusz Felisiak, 13 months ago

Resolution: duplicate
Status: newclosed

Agreed, this can be closed as a duplicate of #14845 (just like #32589).

comment:5 by utkarshpandey12, 9 months ago

I am not executing raw sql instead I am using orm based queries.

Can I skip this line ?

with connection.cursor() as cursor:
   cursor.execute("SELECT 1")

will a new connection be available from orm to make execute queries after using close_old_connections() or anything else needed.

Version 0, edited 9 months ago by utkarshpandey12 (next)
Note: See TracTickets for help on using tickets.
Back to Top