Code

Opened 7 months ago

Closed 11 days ago

Last modified 11 days ago

#21202 closed Cleanup/optimization (fixed)

Atomic masks fatal database errors and instead propagates "connection already closed"

Reported by: intgr Owned by: aaugustin
Component: Database layer (models, ORM) Version: 1.6-beta-1
Severity: Normal Keywords:
Cc: depaolim@… Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

I noticed this problem with the new Atomic code in Django 1.6: when the PostgreSQL server raises a FATAL error (meaning that the server closes the connection after the error), then the Atomic block will swallow the original error and propagates a meaningless "InterfaceError: connection already closed" instead. I don't know whether this also affects other database backends.

I find that it's quite annoying to debug problems when the original cause of an error is hidden by subsequent error recovery code. It's not a new problem with Atomic, I believe commit_on_success (and Christophe Pettus's xact) had the same problem.

The easiest way to reproduce this is to terminate your own connection. (In PostgreSQL 9.1 and earlier you need to be superuser to do this):

from django.db import connection
from django.db.transaction import atomic

with atomic():
    cur = connection.cursor()
    cur.execute("select pg_terminate_backend(pg_backend_pid())")
    cur.fetchall()

With current Django 1.6 this raises an error like:

Traceback (most recent call last):
  File "term1.py", line 7, in <module>
    cur.fetchall()
  File "/home/marti/src/django/django/db/transaction.py", line 351, in __exit__
    connection.set_autocommit(True)
  File "/home/marti/src/django/django/db/backends/__init__.py", line 335, in set_autocommit
    self._set_autocommit(autocommit)
  File "/home/marti/src/django/django/db/backends/postgresql_psycopg2/base.py", line 184, in _set_autocommit
    self.connection.autocommit = autocommit
psycopg2.InterfaceError: connection already closed

It tells you that the connection is closed, but not why. And there's a second problem too: a psycopg2 exception is propagated into user code, without using Django's exception wrappers.

To solve this problem, I submitted pull request 1696: https://github.com/django/django/issues/1696

  1. Changed BaseDatabaseWrapper.set_autocommit to wrap backend-specific errors.
  2. Wrapped Atomic.__exit__ in another layer of try/except and re-raising the original exception in case of an InterfaceError.

Now the original exception is properly raised:

django.db.utils.OperationalError: terminating connection due to administrator command

I can also write unit tests if someone can help me figure out how to test this (the example I provided above requires PostgreSQL, with either superuser privs or version 9.2+).

Ran all 'transactions' and 'db_backends' tests on these changes with both SQLite (OK (skipped=42)) and psycopg2 (OK (skipped=1))

Attachments (0)

Change History (13)

comment:1 Changed 7 months ago by aaugustin

  • Needs documentation unset
  • Needs tests unset
  • Owner changed from nobody to aaugustin
  • Patch needs improvement set
  • Status changed from new to assigned
  • Triage Stage changed from Unreviewed to Accepted
  • Type changed from Uncategorized to Cleanup/optimization

comment:2 Changed 7 months ago by intgr

Hi, I see you set the "Patch needs improvement" flag, but I can't see a review anywhere. Am I missing something?

comment:3 Changed 7 months ago by aaugustin

The report only covers SQLite and PostgreSQL. We need to check and possibly make changes for MySQL and Oracle.

The diff in atomic is very scary, but maybe that's just a side effect of indenting everything.

comment:4 Changed 7 months ago by intgr

Yeah, the change in atomic is not really scary at all, here's a whitespace-independent diff for 2nd commit:

  • django/db/transaction.py

    diff --git a/django/db/transaction.py b/django/db/transaction.py
    index 86a357f..2a4f16d 100644
    a b from functools import wraps 
    1818 
    1919from django.db import ( 
    2020        connections, DEFAULT_DB_ALIAS, 
    21         DatabaseError, ProgrammingError) 
     21        DatabaseError, ProgrammingError, InterfaceError) 
    2222from django.utils.decorators import available_attrs 
    2323 
    2424 
    class Atomic(object): 
    312312            connection.in_atomic_block = False 
    313313 
    314314        try: 
     315            try: 
    315316                if exc_type is None and not connection.needs_rollback: 
    316317                    if connection.in_atomic_block: 
    317318                        # Release savepoint if there is one 
    class Atomic(object): 
    354355                elif not connection.savepoint_ids and not connection.commit_on_exit: 
    355356                    connection.in_atomic_block = False 
    356357 
     358        except InterfaceError: 
     359            if exc_type is not None: 
     360                # We tried to clean up from an exception, but ran into an 
     361                # InterfaceError, meaning the connection is unusable from now 
     362                # on. Propagate up the original error instead of masking it 
     363                # with InterfaceError ("connection already closed") 
     364                return 
     365 
    357366    def __call__(self, func): 
    358367        @wraps(func, assigned=available_attrs(func)) 
    359368        def inner(*args, **kwargs): 
Last edited 7 months ago by intgr (previous) (diff)

comment:5 Changed 2 months ago by dji

Whatever happened to this? Seeing this happen a lot on production with 1.6 and postgres. Even worse, the connection then doesn't get closed, and so all future requests error as well. So should we also be closing the bad connection... somewhere?

Longer explanation: As it is with this patch applied (to 1.6, I can't seem to get dev to run), django.db.close_old_connections() fails on conn.rollback() (which is called by conn.abort()) with the same InterfaceError. This means that conn.close_if_unusable_or_obsolete() never gets called, and the broken connection is reused for later requests, causing all future requests on that connection to error.

You can replicate this behavior pretty easily as described above, by enabling connection pooling and terminating your own connection in a request, and then trying to make a subsequent request. I suspect this is still the behavior in dev as well, nothing appears to have changed.

My "solution" is to squelch InterfaceError as well as DatabaseError in close_old_connections, but I'm new to the Django internals so I'm not sure if this is best, or if it would be better to just close the connection straight away here in __exit__ before we return, or what. Happy to help sort this but I'm not sure of the correct approach.

Last edited 2 months ago by dji (previous) (diff)

comment:6 Changed 4 weeks ago by aaugustin

  • Patch needs improvement unset

https://github.com/django/django/pull/2468

I believe that patch works for the case when the database closes the connection and also adresses comment 5.

It also prevents further queries until the exit of the outermost atomic block, thus preserving atomicity.

comment:7 Changed 13 days ago by depaolim

  • Cc depaolim@… added

comment:8 Changed 11 days ago by Aymeric Augustin <aymeric.augustin@…>

In 3becac84845cee8f12a5fb7f68c87cbaf029c6a0:

Fixed #22321 -- Wrapped exceptions in _set_autocommit.

Refs #21202.

comment:9 Changed 11 days ago by Aymeric Augustin <aymeric.augustin@…>

  • Resolution set to fixed
  • Status changed from assigned to closed

In 8176150850b2e34b2afe1dc107e184eb4c6cd668:

Fixed #21202 -- Maintained atomicity when the server disconnects.

Thanks intgr for the report.

This commit doesn't include a test because I don't know how to emulate a
database disconnection in a cross-database compatible way.

Also simplified a 'backends' test that was constrained by this problem.

comment:10 Changed 11 days ago by Aymeric Augustin <aymeric.augustin@…>

In 746cded010c7ba3e2bf2df4cde32dcc9b75cb0d3:

[1.6.x] Fixed #22321 -- Wrapped exceptions in _set_autocommit.

Refs #21202.

Backport of 3becac84 from master

comment:11 Changed 11 days ago by Aymeric Augustin <aymeric.augustin@…>

In 1d3d2b9a2458cbf11b7e828fb59b3f2cb73d14fe:

[1.6.x] Fixed #21202 -- Maintained atomicity when the server disconnects.

Thanks intgr for the report.

This commit doesn't include a test because I don't know how to emulate a
database disconnection in a cross-database compatible way.

Also simplified a 'backends' test that was constrained by this problem.

Backport of 81761508 from master

comment:12 Changed 11 days ago by Aymeric Augustin <aymeric.augustin@…>

In 5f22bda3827f32289793b90a387d8c77d602a859:

[1.7.x] Fixed #22321 -- Wrapped exceptions in _set_autocommit.

Refs #21202.

Backport of 3becac84 from master

comment:13 Changed 11 days ago by Aymeric Augustin <aymeric.augustin@…>

In 2ad0bc132a6f70a128d8e1fec689c6a474ddde89:

[1.7.x] Fixed #21202 -- Maintained atomicity when the server disconnects.

Thanks intgr for the report.

This commit doesn't include a test because I don't know how to emulate a
database disconnection in a cross-database compatible way.

Also simplified a 'backends' test that was constrained by this problem.

Backport of 81761508 from master

Add Comment

Modify Ticket

Change Properties
<Author field>
Action
as closed
as The resolution will be set. Next status will be 'closed'
The resolution will be deleted. Next status will be 'new'
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.