Opened 17 months ago

Closed 10 months ago

Last modified 9 months ago

#21553 closed Bug (fixed)

InterfaceError in Postgres

Reported by: anonymous Owned by: aaugustin
Component: Database layer (models, ORM) Version: 1.6
Severity: Release blocker Keywords:
Cc: depaolim@…, shai Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

I think this is a hole in the fix for #15901 -- in Postgres, connection.is_usable accesses the cursor directly, with the result that base exceptions (eg. InterfaceError) are thrown without the normal wrapper and thus not caught:

    def is_usable(self):
        try:
            # Use a psycopg cursor directly, bypassing Django's utilities.
            self.connection.cursor().execute("SELECT 1")
        except DatabaseError:
            return False
        else:
            return True

Change History (24)

comment:1 Changed 17 months ago by aaugustin

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset

InterfaceError means that something is wrong with the database adapter on the Python side. I'm not convinced it's safe or useful to attempt automatically reconnecting to the database in that case.

That's why I chose to catch DatabaseError and not InterfaceError.

Could you explain your use case? In what circumstances do you receive InterfaceError?

comment:2 Changed 17 months ago by aaugustin

  • Resolution set to needsinfo
  • Status changed from new to closed

Since you reported that ticket anonymously, I'm not even sure you'll see my answer. I'm closing the ticket as "needing more information". Please reopen it with the answer to my questions above.

comment:3 Changed 17 months ago by adam-iris

  • Resolution needsinfo deleted
  • Status changed from closed to new

We are getting this error when the connection has been closed by Postgres:

InterfaceError: connection already closed

There was an earlier ticket (#15802) about this error in particular; from looking at the code, it seems that the fix for the earlier ticket involved catching Database.Error in close() but in our case it looks like the error is being raised from is_usable() which only catches Database.DatabaseError.

I would argue that is_usable() should never throw; regardless of what error occurs, the result is that the database is not usable.

comment:4 Changed 15 months ago by tehasdf@…

The use case for this fix is picking up a new connection after restarting postgresql (or in my case, pgpool2, but this doesn't seem to matter).
To replicate this error, simply run your app, bring the database down, send one more request, then bring the database back up. The backend will start throwing psycopg2.InterfaceError.
To my understanding, both DatabaseError and InterfaceError are reasonable there, one means you couldn't connect, the other means you couldn't _re_connect.
FWIW, the fix to this is simply change the line to be except (DatabaseError, Database.InterfaceError):

comment:5 Changed 15 months ago by timo

Is this a duplicate of #21202? There is a patch there that may be helpful.

comment:6 Changed 15 months ago by KyleMac

I just ran into this as well.

For some reason postmaster killed a process and so all of postgres restarted to avoid corruption. Postgres was down for only a few seconds and normal service should have resumed.

However, all the persistent db connections in Django were now dead and needed to reconnect. But is_usable was throwing "InterfaceError: connection already closed" and so Django would never reconnect and only served error 500 pages until I manually restarted it.

comment:7 Changed 15 months ago by timo

#22060 was a duplicate.

comment:8 Changed 15 months ago by Connor23

The fix proposed in comment ticket:21553#comment:4 works great.

I was having the exact same issue but ONLY when using persistent connections with Postgres.

comment:9 Changed 15 months ago by aaugustin

  • Owner changed from nobody to aaugustin
  • Status changed from new to assigned

comment:10 Changed 14 months ago by timo

  • Triage Stage changed from Unreviewed to Accepted
  • Type changed from Uncategorized to Bug

comment:11 Changed 13 months ago by depaolim

  • Cc depaolim@… added

comment:12 follow-up: Changed 13 months ago by maikhoepfel

This issue took down my production sites twice until I manually restarted. I'd argue that persistent connections with PostgreSQL (and potentially MySQL) have to be considered broken and that this should be escalated to release blocker.

I'd follow adam-iris's argument that is_usable should never throw an exception, and default to returning False. I'd argue that catching all exceptions is reasonable in this case, but am aware it's considered bad style. Either way, InterfaceError needs to be caught at least in the PostgreSQL backend. I'm not sure if MySQL's backend's ping() can throw an InterfaceError, that would be worth a look.

I tried to reproduce this locally, but did not manage so far. The downtimes were caused by Heroku restarting the PostgreSQL service, from which Django should be able to recover gracefully.

comment:13 Changed 13 months ago by aaugustin

  • Severity changed from Normal to Release blocker

comment:14 Changed 13 months ago by Aymeric Augustin <aymeric.augustin@…>

  • Resolution set to fixed
  • Status changed from assigned to closed

In 5f2f47fdfcc063a60ad1c3688e0c6d44b066d549:

Fixed #21553 -- Ensured unusable database connections get closed.

comment:15 Changed 13 months ago by Aymeric Augustin <aymeric.augustin@…>

In 6fa7d7c594597ab7072ffa740da5b6f4a5a6cd26:

[1.6.x] Fixed #21553 -- Ensured unusable database connections get closed.

Backport of 5f2f47f from master

comment:16 Changed 13 months ago by Aymeric Augustin <aymeric.augustin@…>

In f6f188ffc7d48f7f38edea35234f23f2cfefda0b:

[1.7.x] Fixed #21553 -- Ensured unusable database connections get closed.

Backport of 5f2f47f from master

comment:17 Changed 13 months ago by shai

  • Cc shai added
  • Patch needs improvement set
  • Resolution fixed deleted
  • Status changed from closed to new

The test introduced in 1.6.x is broken with Python 2.7 -- it uses a cursor as a context manager, and on 2.7 that only works with Django 1.7.

The block that uses it is empty:

with connection.cursor():
    pass

the with here doesn't really handle any exceptions; I think that on Django>=1.7, this block is equivalent to connection.cursor().close(), which would also work on 1.6.

comment:18 Changed 13 months ago by Aymeric Augustin <aymeric.augustin@…>

In e68c084ed17d185f19658e1d7fe8c7047d59aea6:

Fixed a broken test introduced in 6fa7d7c5. Refs #21553.

Thanks Shai.

comment:19 Changed 13 months ago by aaugustin

  • Patch needs improvement unset
  • Resolution set to fixed
  • Status changed from new to closed

comment:20 in reply to: ↑ 12 Changed 13 months ago by aaugustin

Replying to maikhoepfel:

I tried to reproduce this locally, but did not manage so far. The downtimes were caused by Heroku restarting the PostgreSQL service, from which Django should be able to recover gracefully.

If you tested with runserver, it opens a new connection for each request, because connections are thread local and it uses a new thread for each connection. You must use ./manage.py runserver --nothreading to test this feature.

comment:21 Changed 10 months ago by max@…

We are not using persistent connections, but every once in a while we are getting InterfaceError.
This occasionally happens just after PostgreSQL server is restarted and sometimes it just happends during regular operations.

is_usable is supposed to check if connection is alive: return false if it's not.
InterfaceError clearly says it's not usable, why would we raise an exception instead of returning false?

Fix is easy: https://github.com/mkorenkov/django/commit/ddb3083c3959e3b30f2c0821d0ba1fec7d34a834 and users are suffering.

comment:22 Changed 10 months ago by max@…

  • Resolution fixed deleted
  • Status changed from closed to new
  • Triage Stage changed from Accepted to Someday/Maybe

reopening based on the previous comment.

we were getting InterfaceError in is_usable method. Please check the fix above.

comment:23 Changed 10 months ago by aaugustin

  • Resolution set to fixed
  • Status changed from new to closed
  • Triage Stage changed from Someday/Maybe to Accepted

The fix isn't made at the right level.

The problem you're seeing is a new issue: an unwrapped exception is reaching the Django layers. It may be a duplicate of a variant of #22879.

Can you open a new ticket with the full stack trace? That will allow us to address the root cause rather than the symptom in one particular place (leaving other places vulnerable). Thank you!

comment:24 Changed 9 months ago by anonymous

The problem you're seeing is a new issue: an unwrapped exception is reaching the Django layers.

I am quite OK, if InterfaceError would be handled somewhere at connection.cursor().execute and re-raised as a DatabaseError.

Sorry, I would have to switch the production code to the version without the fix to get the stack trace, which is not really an option.

Note: See TracTickets for help on using tickets.
Back to Top