Opened 5 years ago

Closed 5 years ago

#13215 closed (fixed)

Random OperationalError when using FastCGI prefork + PostgreSQL

Reported by: hcarvalhoalves Owned by: jbronn
Component: Core (Other) Version: 1.1
Severity: Keywords:
Cc: Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: UI/UX:

Description

To reproduce:

  1. Setup a Django project with FCGI in prefork mode

python manage.py runfcgi method=prefork

  1. Use psycopg2 as the database handler
  1. Setup an async FCGI web server to use your daemon (Cherokee, Lighttpd, Nginx)

Randomly, some requests will result in the following traceback:

File "/usr/lib/python2.6/site-packages/django/core/handlers/base.py", line 86, in get_response
   response = callback(request, *callback_args, **callback_kwargs)

 File "/usr/lib/python2.6/site-packages/django/contrib/admin/sites.py", line 140, in root
   if not self.has_permission(request):

 File "/usr/lib/python2.6/site-packages/django/contrib/admin/sites.py", line 99, in has_permission
   return request.user.is_authenticated() and request.user.is_staff

 File "/usr/lib/python2.6/site-packages/django/contrib/auth/middleware.py", line 5, in __get__
   request._cached_user = get_user(request)

 File "/usr/lib/python2.6/site-packages/django/contrib/auth/__init__.py", line 83, in get_user
   user_id = request.session[SESSION_KEY]

 File "/usr/lib/python2.6/site-packages/django/contrib/sessions/backends/base.py", line 46, in __getitem__
   return self._session[key]

 File "/usr/lib/python2.6/site-packages/django/contrib/sessions/backends/base.py", line 172, in _get_session
   self._session_cache = self.load()

 File "/usr/lib/python2.6/site-packages/django/contrib/sessions/backends/db.py", line 16, in load
   expire_date__gt=datetime.datetime.now()

 File "/usr/lib/python2.6/site-packages/django/db/models/manager.py", line 93, in get
   return self.get_query_set().get(*args, **kwargs)

 File "/usr/lib/python2.6/site-packages/django/db/models/query.py", line 304, in get
   num = len(clone)

 File "/usr/lib/python2.6/site-packages/django/db/models/query.py", line 160, in __len__
   self._result_cache = list(self.iterator())

 File "/usr/lib/python2.6/site-packages/django/db/models/query.py", line 275, in iterator
   for row in self.query.results_iter():

 File "/usr/lib/python2.6/site-packages/django/db/models/sql/query.py", line 206, in results_iter
   for rows in self.execute_sql(MULTI):

 File "/usr/lib/python2.6/site-packages/django/db/models/sql/query.py", line 1734, in execute_sql
   cursor.execute(sql, params)

OperationalError: server closed the connection unexpectedly
       This probably means the server terminated abnormally
       before or while processing the request.

The bug doesn't appear to happend when using threaded mode, but it's hard to tell as it's completly random.

For the record, had this issue with: Django 1.1, Cherokee 0.44.19, python-2.6, PostgreSQL-8.3

Another report with the same issue, but different setup: http://stackoverflow.com/questions/393637/

Attachments (2)

gis-connection-close-trunk.diff (534 bytes) - added by jbronn 5 years ago.
gis-connection-close-1.1.X.diff (497 bytes) - added by jbronn 5 years ago.

Download all attachments as: .zip

Change History (8)

comment:1 follow-up: Changed 5 years ago by russellm

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Triage Stage changed from Unreviewed to Accepted

I'll accept this, but I wouldn't hold my breath on getting a solution unless you can narrow down the cause of the problem. Does it matter what the view is doing? Does it happen consistently on the same view? Does it only occur under certain load conditions (CPU, memory, or HTTP access load)? Does it only occur on long-lived connections? Is the stack trace always the same? If it isn't, what is the common element?

In short, we need *anything* that can make this less random, and more repeatable.

comment:2 Changed 5 years ago by hcarvalhoalves

- Does it matter what the view is doing?

The shown traceback triggers the bug in admin code, but I have the same behaviour in middleware code, before any view is processed. So, no.

- Does it happen consistently on the same view?

It will happen randomly, and for any view/pre-view code that tries to access the database. But it always in the first code trying hit the DB.

- Does it occur under certain load conditions?

No, this doesn't seem to affect. You can get the same behaviour in a local server.

- Does it only occur on long-lived connections?

I have it happening randomly and regardless of my keep-alive configuration, request timeout or view processing time.

- Is the stack trace always the same? If it isn't, what is the common element?

Always caused in the first code trying hit the DB. This might indicate that the DB cursor is dead from request start.


In some way, this bug is similar to #9437

comment:3 in reply to: ↑ 1 Changed 5 years ago by jbronn

  • Owner changed from nobody to jbronn
  • Status changed from new to assigned

Replying to russellm:

In short, we need *anything* that can make this less random, and more repeatable.

Russell, this problem is (unfortunately) easy to demonstrate and reproduce with GeoDjango. It is much easier to do this on Linux than Mac OS X.

  1. Use WorldBorders model from the GeoDjango tutorial
  1. Create a very simple view that conducts and configure a URL to go to it. Here's what I did in world.views (note that it assumes you loaded the spatial data, see the tutorial):
    from django.http import HttpResponse                                                                                                                                                      
    from world.models import WorldBorders                                                                                                                                                     
                                                                                                                                                                                              
    def world_index(request):                                                                                                                                                                 
        sm = WorldBorders.objects.get(name='San Marino')                                                                                                                                      
        return HttpResponse(sm.mpoly.centroid.wkt)          
    
    
  1. Use the following Lighttpd configuration (modify paths accordingly):
    server.modules = (                                                                                                                                                                        
      "mod_access",                                                                                                                                                                           
      "mod_rewrite",                                                                                                                                                                          
      "mod_fastcgi",                                                                                                                                                                          
      "mod_accesslog",                                                                                                                                                                        
    )                                                                                                                                                                                         
    server.document-root = "/home/geo/geodjango/media"                                                                                                                                        
                                                                                                                                                                                              
    server.port = 8000                                                                                                                                                                        
    server.bind = "0.0.0.0"                                                                                                                                                                   
    server.username = "geo"                                                                                                                                                                   
    server.groupname = "geo"                                                                                                                                                                  
    server.pid-file = "/home/geo/geodjango/var/lighttpd-fcgi.pid"                                                                                                                             
    static-file.exclude-extensions = (".fcgi", ".py", "~", )                                                                                                                                  
                                                                                                                                                                                              
                                                                                                                                                                                              
    fastcgi.server = (                                                                                                                                                                        
        "/geodjango.fcgi" => (                                                                                                                                                                
            "main" => (                                                                                                                                                                       
                "socket" => "/home/geo/geodjango/var/geodjango.sock",                                                                                                                         
                    "check-local" => "disable",                                                                                                                                               
            )                                                                                                                                                                                 
        ),                                                                                                                                                                                    
    )                                                                                                                                                                                         
    url.rewrite-once = (                                                                                                                                                                      
         "^(/.*)$" => "/geodjango.fcgi$1",                                                                                                                                                    
    )                                                                                                                                                                                         
                                                                                                                                                                                              
    accesslog.filename = "/home/geo/geodjango/var/geodjango.http.log"
    
  1. Execute the runfcgi command, leave running in a spare terminal:
    python manage.py runfcgi maxchildren=1 maxrequests=1 socket=var/geodjango.sock daemonize=false debug=true
    
  1. Visit the url for the view defined earlier (may work first time, but reload again). You should get a psycopg2 OperationalError:
     /django/trunk/django/db/backends/__init__.py  in _rollback(self=<django.contrib.gis.db.backends.postgis.base.DatabaseWrapper object at 0x10ce788>)
       34     def _rollback(self):
       35         if self.connection is not None:
       36             return self.connection.rollback()
       37 
       38     def _enter_transaction_management(self, managed):
    self = <django.contrib.gis.db.backends.postgis.base.DatabaseWrapper object at 0x10ce788>, self.connection = None, self.connection.rollback undefined
    <class 'psycopg2.OperationalError'>: no connection to the server
          args = ('no connection to the server\n',)
          cursor = None
          message = 'no connection to the server\n'
          pgcode = None
          pgerror = None 
    

comment:4 Changed 5 years ago by jbronn

So, this is very similar to #9437. I suspect it's related to an interaction between the close_connection signal called when a request is finished.

comment:5 Changed 5 years ago by jbronn

I think I know what's causing the problem in my case, but I think it's a distinct issue from the reporter of the ticket (because he doesn't appear to be using GeoDjango). Just like the root cause of #9437, I need to close the connection after getting the !PostGIS version, and not the cursor, to prevent the issue. I'll attach my patches here for posterity (as reporter could be doing something similar in application code), but may just reopen #9437 as it has regressed.

Curiously, SpatiaLite isn't affected, even though it just closes its cursor like PostGIS -- I wonder if this is a due to a difference in the psycopg2 and pysqlite2 bindings.

Changed 5 years ago by jbronn

Changed 5 years ago by jbronn

comment:6 Changed 5 years ago by jbronn

  • Resolution set to fixed
  • Status changed from assigned to closed

Fixed in r12948; forgot to close ticket.

Note: See TracTickets for help on using tickets.
Back to Top