Opened 15 years ago
Closed 14 years ago
#13215 closed (fixed)
Random OperationalError when using FastCGI prefork + PostgreSQL
Reported by: | Henrique C. Alves | Owned by: | jbronn |
---|---|---|---|
Component: | Core (Other) | Version: | 1.1 |
Severity: | Keywords: | ||
Cc: | Triage Stage: | Accepted | |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
To reproduce:
- Setup a Django project with FCGI in prefork mode
python manage.py runfcgi method=prefork
- Use psycopg2 as the database handler
- Setup an async FCGI web server to use your daemon (Cherokee, Lighttpd, Nginx)
Randomly, some requests will result in the following traceback:
File "/usr/lib/python2.6/site-packages/django/core/handlers/base.py", line 86, in get_response response = callback(request, *callback_args, **callback_kwargs) File "/usr/lib/python2.6/site-packages/django/contrib/admin/sites.py", line 140, in root if not self.has_permission(request): File "/usr/lib/python2.6/site-packages/django/contrib/admin/sites.py", line 99, in has_permission return request.user.is_authenticated() and request.user.is_staff File "/usr/lib/python2.6/site-packages/django/contrib/auth/middleware.py", line 5, in __get__ request._cached_user = get_user(request) File "/usr/lib/python2.6/site-packages/django/contrib/auth/__init__.py", line 83, in get_user user_id = request.session[SESSION_KEY] File "/usr/lib/python2.6/site-packages/django/contrib/sessions/backends/base.py", line 46, in __getitem__ return self._session[key] File "/usr/lib/python2.6/site-packages/django/contrib/sessions/backends/base.py", line 172, in _get_session self._session_cache = self.load() File "/usr/lib/python2.6/site-packages/django/contrib/sessions/backends/db.py", line 16, in load expire_date__gt=datetime.datetime.now() File "/usr/lib/python2.6/site-packages/django/db/models/manager.py", line 93, in get return self.get_query_set().get(*args, **kwargs) File "/usr/lib/python2.6/site-packages/django/db/models/query.py", line 304, in get num = len(clone) File "/usr/lib/python2.6/site-packages/django/db/models/query.py", line 160, in __len__ self._result_cache = list(self.iterator()) File "/usr/lib/python2.6/site-packages/django/db/models/query.py", line 275, in iterator for row in self.query.results_iter(): File "/usr/lib/python2.6/site-packages/django/db/models/sql/query.py", line 206, in results_iter for rows in self.execute_sql(MULTI): File "/usr/lib/python2.6/site-packages/django/db/models/sql/query.py", line 1734, in execute_sql cursor.execute(sql, params) OperationalError: server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request.
The bug doesn't appear to happend when using threaded mode, but it's hard to tell as it's completly random.
For the record, had this issue with: Django 1.1, Cherokee 0.44.19, python-2.6, PostgreSQL-8.3
Another report with the same issue, but different setup: http://stackoverflow.com/questions/393637/
Attachments (2)
Change History (8)
follow-up: 3 comment:1 by , 15 years ago
Triage Stage: | Unreviewed → Accepted |
---|
comment:2 by , 15 years ago
- Does it matter what the view is doing?
The shown traceback triggers the bug in admin code, but I have the same behaviour in middleware code, before any view is processed. So, no.
- Does it happen consistently on the same view?
It will happen randomly, and for any view/pre-view code that tries to access the database. But it always in the first code trying hit the DB.
- Does it occur under certain load conditions?
No, this doesn't seem to affect. You can get the same behaviour in a local server.
- Does it only occur on long-lived connections?
I have it happening randomly and regardless of my keep-alive configuration, request timeout or view processing time.
- Is the stack trace always the same? If it isn't, what is the common element?
Always caused in the first code trying hit the DB. This might indicate that the DB cursor is dead from request start.
In some way, this bug is similar to #9437
comment:3 by , 15 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
Replying to russellm:
In short, we need *anything* that can make this less random, and more repeatable.
Russell, this problem is (unfortunately) easy to demonstrate and reproduce with GeoDjango. It is much easier to do this on Linux than Mac OS X.
- Use WorldBorders model from the GeoDjango tutorial
- Create a very simple view that conducts and configure a URL to go to it. Here's what I did in
world.views
(note that it assumes you loaded the spatial data, see the tutorial):from django.http import HttpResponse from world.models import WorldBorders def world_index(request): sm = WorldBorders.objects.get(name='San Marino') return HttpResponse(sm.mpoly.centroid.wkt)
- Use the following Lighttpd configuration (modify paths accordingly):
server.modules = ( "mod_access", "mod_rewrite", "mod_fastcgi", "mod_accesslog", ) server.document-root = "/home/geo/geodjango/media" server.port = 8000 server.bind = "0.0.0.0" server.username = "geo" server.groupname = "geo" server.pid-file = "/home/geo/geodjango/var/lighttpd-fcgi.pid" static-file.exclude-extensions = (".fcgi", ".py", "~", ) fastcgi.server = ( "/geodjango.fcgi" => ( "main" => ( "socket" => "/home/geo/geodjango/var/geodjango.sock", "check-local" => "disable", ) ), ) url.rewrite-once = ( "^(/.*)$" => "/geodjango.fcgi$1", ) accesslog.filename = "/home/geo/geodjango/var/geodjango.http.log"
- Execute the
runfcgi
command, leave running in a spare terminal:python manage.py runfcgi maxchildren=1 maxrequests=1 socket=var/geodjango.sock daemonize=false debug=true
- Visit the url for the view defined earlier (may work first time, but reload again). You should get a psycopg2
OperationalError
:/django/trunk/django/db/backends/__init__.py in _rollback(self=<django.contrib.gis.db.backends.postgis.base.DatabaseWrapper object at 0x10ce788>) 34 def _rollback(self): 35 if self.connection is not None: 36 return self.connection.rollback() 37 38 def _enter_transaction_management(self, managed): self = <django.contrib.gis.db.backends.postgis.base.DatabaseWrapper object at 0x10ce788>, self.connection = None, self.connection.rollback undefined <class 'psycopg2.OperationalError'>: no connection to the server args = ('no connection to the server\n',) cursor = None message = 'no connection to the server\n' pgcode = None pgerror = None
comment:4 by , 15 years ago
So, this is very similar to #9437. I suspect it's related to an interaction between the close_connection
signal called when a request is finished.
comment:5 by , 15 years ago
I think I know what's causing the problem in my case, but I think it's a distinct issue from the reporter of the ticket (because he doesn't appear to be using GeoDjango). Just like the root cause of #9437, I need to close the connection after getting the !PostGIS version, and not the cursor, to prevent the issue. I'll attach my patches here for posterity (as reporter could be doing something similar in application code), but may just reopen #9437 as it has regressed.
Curiously, SpatiaLite isn't affected, even though it just closes its cursor like PostGIS -- I wonder if this is a due to a difference in the psycopg2 and pysqlite2 bindings.
by , 15 years ago
Attachment: | gis-connection-close-trunk.diff added |
---|
by , 15 years ago
Attachment: | gis-connection-close-1.1.X.diff added |
---|
comment:6 by , 14 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
Fixed in r12948; forgot to close ticket.
I'll accept this, but I wouldn't hold my breath on getting a solution unless you can narrow down the cause of the problem. Does it matter what the view is doing? Does it happen consistently on the same view? Does it only occur under certain load conditions (CPU, memory, or HTTP access load)? Does it only occur on long-lived connections? Is the stack trace always the same? If it isn't, what is the common element?
In short, we need *anything* that can make this less random, and more repeatable.