Opened 3 years ago

Closed 3 years ago

Last modified 3 years ago

#33297 closed Bug (duplicate)

ASGI: Dead persistent postgres connections are not closed sometimes

Reported by: Matt Davis Owned by: nobody
Component: Uncategorized Version: 3.2
Severity: Normal Keywords:
Cc: Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

I have a two node cluster running a container with ASGI and Django backend for the ORM with uvicorn and fastapi--after 3 days one of the two nodes has resulted in a permanent Internal Server Error due to the DB connection being closed, the other node is still running fine. I found a similar closed ticket (https://code.djangoproject.com/ticket/31905) but it was specifically about middleware, and this issue is not in the middleware. I apologize if this issue ends up being one of the other libraries at play (DRF, uvicorn, fastapi), but I think that the actual cause of the stack trace is the closed Django DB connection which is contained in the top of the stack trace, django/utils/asyncio.py", line 26, in inner is always calling for the same stale connection once the error condition happens, as in nothing ever refreshes it.

Django versions:
Django==3.2.8

django.db.utils.OperationalError: SSL SYSCALL error: EOF detected
INFO:     10.244.0.195:46240 - "GET /blog/posts HTTP/1.1" 500 Internal Server Error
ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/django/db/backends/base/base.py", line 237, in _cursor
    return self._prepare_cursor(self.create_cursor(name))
  File "/usr/local/lib/python3.10/site-packages/django/utils/asyncio.py", line 26, in inner
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/django/db/backends/postgresql/base.py", line 236, in create_cursor
    cursor = self.connection.cursor()
psycopg2.InterfaceError: connection already closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/uvicorn/protocols/http/httptools_impl.py", line 375, in run_asgi
    result = await app(self.scope, self.receive, self.send)
  File "/usr/local/lib/python3.10/site-packages/uvicorn/middleware/proxy_headers.py", line 75, in __call__
    return await self.app(scope, receive, send)
  File "/usr/local/lib/python3.10/site-packages/fastapi/applications.py", line 208, in __call__
    await super().__call__(scope, receive, send)
  File "/usr/local/lib/python3.10/site-packages/starlette/applications.py", line 112, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/usr/local/lib/python3.10/site-packages/starlette/middleware/errors.py", line 181, in __call__
    raise exc
  File "/usr/local/lib/python3.10/site-packages/starlette/middleware/errors.py", line 159, in __call__
    await self.app(scope, receive, _send)
  File "/usr/local/lib/python3.10/site-packages/starlette/exceptions.py", line 82, in __call__
    raise exc
  File "/usr/local/lib/python3.10/site-packages/starlette/exceptions.py", line 71, in __call__
    await self.app(scope, receive, sender)
  File "/usr/local/lib/python3.10/site-packages/starlette/routing.py", line 656, in __call__
    await route.handle(scope, receive, send)
  File "/usr/local/lib/python3.10/site-packages/starlette/routing.py", line 259, in handle
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.10/site-packages/starlette/routing.py", line 61, in app
    response = await func(request)
  File "/usr/local/lib/python3.10/site-packages/fastapi/routing.py", line 226, in app
    raw_response = await run_endpoint_function(
  File "/usr/local/lib/python3.10/site-packages/fastapi/routing.py", line 161, in run_endpoint_function
    return await run_in_threadpool(dependant.call, **values)
  File "/usr/local/lib/python3.10/site-packages/starlette/concurrency.py", line 39, in run_in_threadpool
    return await anyio.to_thread.run_sync(func, *args)
  File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 28, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(func, *args, cancellable=cancellable,
  File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 805, in run_sync_in_worker_thread
    return await future
  File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 743, in run
    result = func(*args)
  File "/code/whitewhale/./fapi/blog.py", line 47, in blog_posts
    return serializer.data
  File "/usr/local/lib/python3.10/site-packages/rest_framework/serializers.py", line 745, in data
    ret = super().data
  File "/usr/local/lib/python3.10/site-packages/rest_framework/serializers.py", line 246, in data
    self._data = self.to_representation(self.instance)
  File "/usr/local/lib/python3.10/site-packages/rest_framework/serializers.py", line 663, in to_representation
    return [
  File "/usr/local/lib/python3.10/site-packages/django/db/models/query.py", line 280, in __iter__
    self._fetch_all()
  File "/usr/local/lib/python3.10/site-packages/django/db/models/query.py", line 1324, in _fetch_all
    self._result_cache = list(self._iterable_class(self))
  File "/usr/local/lib/python3.10/site-packages/django/db/models/query.py", line 51, in __iter__
    results = compiler.execute_sql(chunked_fetch=self.chunked_fetch, chunk_size=self.chunk_size)
  File "/usr/local/lib/python3.10/site-packages/django/db/models/sql/compiler.py", line 1173, in execute_sql
    cursor = self.connection.cursor()
  File "/usr/local/lib/python3.10/site-packages/django/utils/asyncio.py", line 26, in inner
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/django/db/backends/base/base.py", line 259, in cursor
    return self._cursor()
  File "/usr/local/lib/python3.10/site-packages/django/db/backends/base/base.py", line 236, in _cursor
    with self.wrap_database_errors:
  File "/usr/local/lib/python3.10/site-packages/django/db/utils.py", line 90, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/usr/local/lib/python3.10/site-packages/django/db/backends/base/base.py", line 237, in _cursor
    return self._prepare_cursor(self.create_cursor(name))
  File "/usr/local/lib/python3.10/site-packages/django/utils/asyncio.py", line 26, in inner
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/django/db/backends/postgresql/base.py", line 236, in create_cursor
    cursor = self.connection.cursor()
django.db.utils.InterfaceError: connection already closed

Only a restart of the application fixes the problem.

Change History (4)

comment:1 by Carlton Gibson, 3 years ago

Resolution: needsinfo
Status: newclosed

... if this issue ends up being one of the other libraries at play (DRF, uvicorn, fastapi)...

Impossible to say without a reproduce, but I suspect this is the case. I'm going to close as needsinfo but if you can reduce it to an example just using Django, then I'm very happy to take another look!

...the actual cause of the stack trace is the closed Django DB connection ...

Django has a routine to clean up old connections that is tied into the request-response life-cycle. This is triggered by Django's HTTP handlers once the response is sent. The suspicion would be that your alternate stack isn't triggering that cleanup, and you'll need to add that yourself.

comment:2 by Johannes Maron, 3 years ago

Resolution: needsinfoduplicate

I can second this. We're experiencing the same issue. Seems to be the same issue as #24810 thought.

comment:3 by Carlton Gibson, 3 years ago

Hey Joe — any chance you can work on a reproduce?

comment:4 by Azzonith, 3 years ago

Hi there,

Can confirm issue exists.
Reproduced 100% times on the following steps:

  1. Create models.
  2. Import models by setting up Django in any python module
os.environ['DJANGO_SETTINGS_MODULE'] = 'ServiceName.settings'
django.setup()
from ServiceName.models import ModelName

  1. Use models via django ORM to fetch some data from DB on service startup(lets call it step 1) and later on by some triggers(lets call it step 2). In my case data is fetched once on startup, and all the subsequent times on certain Kafka event.

e.g.

query_results = ModelName.objects.filter(your_filter=some_value)
  1. Shutdown PostgreSQL instance after step one, but before step 2

Not sure if related to the issue, but we are using pg-bouncer as a connection point.

  1. Start PostgreSQL, trigger step2.

First attemp to fetch the data fails with the following trace:

Traceback (most recent call last):
File "/opt/services/_init_.py", line 113, in observe_events
if callback(event):
File "services_logparser_core.py", line 193, in process_event
r_handler.update_rules()
File "services_logparser_core.py", line 48, in update_rules
self.rules_list = self._fetch_enabled_rules()
File "services_logparser_core.py", line 69, in _fetch_enabled_rules
for sr in enabled_search_rules:
File "/usr/local/lib/python3.8/dist-packages/django/db/models/query.py", line 280, in _iter_
self._fetch_all()
File "/usr/local/lib/python3.8/dist-packages/django/db/models/query.py", line 1324, in _fetch_all
self._result_cache = list(self._iterable_class(self))
File "/usr/local/lib/python3.8/dist-packages/django/db/models/query.py", line 51, in _iter_
results = compiler.execute_sql(chunked_fetch=self.chunked_fetch, chunk_size=self.chunk_size)
File "/usr/local/lib/python3.8/dist-packages/django/db/models/sql/compiler.py", line 1175, in execute_sql
cursor.execute(sql, params)
File "/usr/local/lib/python3.8/dist-packages/django/db/backends/utils.py", line 66, in execute
return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
File "/usr/local/lib/python3.8/dist-packages/django/db/backends/utils.py", line 75, in _execute_with_wrappers
return executor(sql, params, many, context)
File "/usr/local/lib/python3.8/dist-packages/django/db/backends/utils.py", line 84, in _execute
return self.cursor.execute(sql, params)
File "/usr/local/lib/python3.8/dist-packages/django/db/utils.py", line 90, in _exit_
raise dj_exc_value.with_traceback(traceback) from exc_value
File "/usr/local/lib/python3.8/dist-packages/django/db/backends/utils.py", line 84, in _execute
return self.cursor.execute(sql, params)
django.db.utils.OperationalError: SSL SYSCALL error: EOF detected

All the subsequent attempts to fetch the data fail with:

Traceback (most recent call last):
  File "/opt/services/__init__.py", line 113, in observe_events
    if callback(event):
  File "services_logparser_core.py", line 193, in process_event
    r_handler.update_rules()
  File "services_logparser_core.py", line 48, in update_rules
    self.rules_list = self._fetch_enabled_rules()
  File "services_logparser_core.py", line 69, in _fetch_enabled_rules
    for sr in enabled_search_rules:
  File "/usr/local/lib/python3.8/dist-packages/django/db/models/query.py", line 280, in __iter__
    self._fetch_all()
  File "/usr/local/lib/python3.8/dist-packages/django/db/models/query.py", line 1324, in _fetch_all
    self._result_cache = list(self._iterable_class(self))
  File "/usr/local/lib/python3.8/dist-packages/django/db/models/query.py", line 51, in __iter__
    results = compiler.execute_sql(chunked_fetch=self.chunked_fetch, chunk_size=self.chunk_size)
  File "/usr/local/lib/python3.8/dist-packages/django/db/models/sql/compiler.py", line 1173, in execute_sql
    cursor = self.connection.cursor()
  File "/usr/local/lib/python3.8/dist-packages/django/utils/asyncio.py", line 26, in inner
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/django/db/backends/base/base.py", line 259, in cursor
    return self._cursor()
  File "/usr/local/lib/python3.8/dist-packages/django/db/backends/base/base.py", line 237, in _cursor
    return self._prepare_cursor(self.create_cursor(name))
  File "/usr/local/lib/python3.8/dist-packages/django/db/utils.py", line 90, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/usr/local/lib/python3.8/dist-packages/django/db/backends/base/base.py", line 237, in _cursor
    return self._prepare_cursor(self.create_cursor(name))
  File "/usr/local/lib/python3.8/dist-packages/django/utils/asyncio.py", line 26, in inner
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/django/db/backends/postgresql/base.py", line 236, in create_cursor
    cursor = self.connection.cursor()
django.db.utils.InterfaceError: connection already closed
  1. Application restart fixes the problem as it was previously stated.

Is there any django/psycopg setting can be configured inside the project so that new connection is established each time db query is required?

Last edited 3 years ago by Azzonith (previous) (diff)
Note: See TracTickets for help on using tickets.
Back to Top