Opened 16 years ago

Closed 14 years ago

Last modified 7 years ago

#5133 closed Uncategorized (fixed)

Memcached connections get left open (in certain circumstances).

Reported by: Jacob Owned by: nobody
Component: Core (Cache system) Version: dev
Severity: Normal Keywords: memcached
Cc: Michael Malone, rajesh.dhawan@…, harm.verhagen+django@… Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description

In a few cases Django under mod_python can leave dangling connections to memcached open. We've had trouble tracking down the circumstances under which this happens, but when it does it can lead to memcached servers hitting their connection limits, which means caching stops. Nasty.

The attached patch should take care of this, but I'm not 100% sure it's the right approach.

Attachments (1)

memcached-cleanup-connections.patch (1.3 KB) - added by Jacob 16 years ago.

Download all attachments as: .zip

Change History (23)

Changed 16 years ago by Jacob

comment:1 Changed 16 years ago by Simon G. <dev@…>

Patch needs improvement: set
Triage Stage: UnreviewedAccepted

comment:2 Changed 16 years ago by h.rauch@…

I have/had the following problem:

With memcached enabled, our site (dms.bildung.hessen.de) does not respond anymore after 10 - 20 minutes. Restarting memcached solves this problem for the next minutes. Therefor it's not a problem with apache or the program logic.

Therefore I tried the patch above. But it not solve our problem.

In the end I use the file-cache with no problems.

comment:3 Changed 16 years ago by hans.rauch@…

I'm sorry, my real email-address is not h.rauch@…, it's hans.rauch@…

comment:4 Changed 16 years ago by mcroydon

Resolution: invalid
Status: newclosed

Please try the latest version of python-memcached (which is 1.4.0 as of this writing). I believe that the error was there and that it has been resolved between 1.3.9 and 1.4.0.

comment:5 Changed 16 years ago by hans.rauch@…

Resolution: invalid
Status: closedreopened

I'm sorry, but I've been using the latest version of memcached (1.2.2) and python-memcached (1.4.0), when I watch these problems.

comment:6 Changed 16 years ago by mcroydon

hans: Indeed, my apologies. I looked at 1.4.0 during the sprint and must have missed it.

here's a patch against python-memcached 1.4.0 that removes threadlocals and doesn't leave any dangling connections for us. The threadlocals bit appeared somewhere in 1.3.x and has been the source for problems for us too. You might also look at cmemcache. This ticket should definitely remain open until we figure something out.

comment:7 Changed 16 years ago by hans.rauch@…

Thanks, your patch seems to solve our problem. Our Site is running for 24 hours and we didn't had to restart memcached or apache.

comment:8 Changed 16 years ago by Ludvig Ericson

Improved patch (really just nitpick): http://dpaste.com/hold/41946/

comment:9 in reply to:  8 Changed 16 years ago by johann@…

Removing thread locals gives me lots of bad errors after running okay for a few minutes:

Traceback (most recent call last):
 File "/usr/lib/python2.5/site-packages/django/core/handlers/base.py", line 82, in get_response
   response = callback(request, *callback_args, **callback_kwargs)
 File "/usr/lib/python2.5/site-packages/shotserver04/websites/views.py", line 93, in details
   preload_foreign_keys(browsers, browser_group=True)
 File "/usr/lib/python2.5/site-packages/shotserver04/common/object_cache.py", line 165, in preload_foreign_keys
   [getattr(instance, field_id) for instance in instances]))
 File "/usr/lib/python2.5/site-packages/shotserver04/common/object_cache.py", line 92, in get_many
   [KEY_FORMAT % (model._meta.db_table, 'id', id) for id in id_list])
 File "/usr/lib/python2.5/site-packages/django/core/cache/backends/memcached.py", line 41, in get_many
   return self._cache.get_multi(map(smart_str,keys))
 File "/usr/lib/python2.5/site-packages/python_memcached-1.41-py2.5.egg/memcache.py", line 717, in get_multi
   retvals[prefixed_to_orig_key[rkey]] = val   # un-prefix returned key.
KeyError: 'platforms_operatingsystem:id=6'

However, memcached-cleanup-connections.patch seems to fix the open connections problem on my system.

  • Django trunk r7519
  • memcached 1.1.12-1
  • python-memcached-1.41

comment:10 Changed 15 years ago by Malcolm Tredinnick

I just saw this happen on a client's production site after we updated the memcache client from 1.34 to 1.43 (server is 1.1.12). We ended up applying Jacob's original patch because it looked the least intrusive, but it feels like something that python-memcache should be fixing rather than Django having to do manual connection management (my only concern with Jacob's patch). We elected not to go with Matt's patch because it basically undoes something that was included in python-memcache 1.36, which was included to fix some (other) multi-threading issues, according to their ChangeLog.

I've filed a bug upstream with the python memcache providers, so we'll see if anything happens.

comment:11 Changed 15 years ago by Michael Malone

Cc: Michael Malone added

comment:12 Changed 15 years ago by Malcolm Tredinnick

milestone: 1.0

Just heard back from the python-memcached maintainer and current behaviour is intended, to allow for more efficient pooling. This problem may well end up being because we're not getting the same thread-id each time inside the same mod_python process or something similar. Since we probably don't want to do inter-thread pooling (for the same reason we don't do database connection pooling), I'd suggest we go with Jacob's original patch for now as a way to manage the connection lifetimes in a well-defined fashion.

Not going to apply this immediately, in case anybody else on the list has bright ideas, but barring any good objections, I think this would be worth applying before 1.0. It's a "bug" in the sense that we're not cooperating well with memcache and problems that only appear when you're wildly successful and/or under high load are not a lot of fun to see.

comment:13 Changed 15 years ago by Graham Dumpleton

If python-memcached can't handle being used in a multithreaded environment, which I recollect as the case when I looked at the code many moons ago, it isn't just a mod_python issue, but would also be an issue were mod_wsgi, fastcgi, paste server, cherrypy wsgi server were used in a multithreaded configuration. I suspect that python-memcached may also have a lot of issues if any of those hosting mechanisms were to kill off threads when they feel like they are no longer needed and then create new ones later on. This would possibly be an issue as not sure it would do cleanup of state when a thread no longer becomes used. Anyway, that later point is an issue for python-memcached and not Django unless Django had similar issues with retaining state for threads beyond their lifetime.

comment:14 Changed 15 years ago by Rajesh Dhawan

Cc: rajesh.dhawan@… added

comment:15 Changed 15 years ago by Malcolm Tredinnick

Resolution: fixed
Status: reopenedclosed

(In [8418]) Fixed #5133 -- Explicitly close memcached connections after each request
(similar to database connection management). We can't effectively manage the
lifecycle by pooling connections and recent versions of python-memcache can
lead to connection exhaustion in some quite reasonable setups.

comment:16 Changed 14 years ago by booink@…

Resolution: fixed
Status: closedreopened

The patch takes care of connections kept open, but it introduces another problem - the need to open one or more tcp connections every single request.

With a simple loop, you can make a system run out of sockets easily - after a socket is closed, that port cannot be reused for an eternity, ranging from 1 minute to 4 depending on OS. If enough sockets get stuck in TIME_WAIT state, the server simply fails to connect to memcached and start serving everything from db again - that's not something you want to see on a site with sufficient traffic to need a memcached installation.

In my opinion, the cure is worse than the disease. There's an easy workaround available for the original problem: restart workers after a certain amount of requests. With max-request=500 on a 5 threads deamon process (mod_wsgi, times 20 processes), we never go over 100 connections on our memcached server, started with the default cap of 1024 connections. If you run mod_python, use MaxRequestsPerChild.

My current solution is to just noop the whole fix with one line in any .py: django.core.cache.backends.memcached.CacheClass.close = lambda x: None. It might be an idea to make it configurable so people can choose between disconnect after every request and keep it open until process restart.

comment:17 Changed 14 years ago by Alex Gaynor

Resolution: fixed
Status: reopenedclosed

This bug was fixed many months ago, please file a newbug if you believe there is a new issue.

comment:18 Changed 13 years ago by harm

Cc: harm.verhagen+django@… added
Easy pickings: unset
Severity: Normal
Type: Uncategorized

comment:19 Changed 12 years ago by Jacob

milestone: 1.0

Milestone 1.0 deleted

comment:20 Changed 7 years ago by Tim Graham <timograham@…>

In 3e935ae:

Refs #5133 -- Tested memcached connection closing after each request.

comment:21 Changed 7 years ago by Tim Graham <timograham@…>

In f02dbbe1:

Fixed #11331 -- Stopped closing pylibmc connections after each request.

libmemcached manages its own connections, so isn't affected by refs #5133.

comment:22 Changed 7 years ago by Tim Graham <timograham@…>

In 82f8996:

Refs #5133 -- Isolated test_close() cache test.

Note: See TracTickets for help on using tickets.
Back to Top