Opened 3 years ago
Closed 5 weeks ago
#33092 closed Bug (worksforme)
PyMemcacheCache backend fails when running as a wsgi application with gevent worker class
Reported by: | Martijn van der Blom | Owned by: | nobody |
---|---|---|---|
Component: | Core (Cache system) | Version: | 3.2 |
Severity: | Normal | Keywords: | |
Cc: | Andrew Godwin, Nick Pope | Triage Stage: | Accepted |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description (last modified by )
For thread-safety when using pymemcache
the option 'use_pooling': True
can be passed via OPTIONS
which will make pymemcache.HashClient
use pymemcache.PooledClient
instead of pymemcache.Client
internally.
Some documentation should be added or improved.
In our application we were using the MemcachedCache backend to connect to a memcached server. Since this backend will be removed in Django 4.1 we thought we'd migrate to the alternative PyMemcacheCache backend as suggested in the Django documentation at: https://docs.djangoproject.com/en/3.2/topics/cache/
After upgrading we encountered several errors when running the application in gunicorn with the gevent worker class.
Summary of errors:
- gevent._socketcommon.cancel_wait_ex: [Errno 9] File descriptor was closed in another greenlet
- gevent.exceptions.ConcurrentObjectUseError: This socket is already used by another greenlet: <bound method Waiter.switch of <gevent._gevent_c_waiter.Waiter object at 0x7f8e77b00a40>>
- OSError: [Errno 9] Bad file descriptor
These errors seem to be related to either the Django backend implementation or Pymemcache not handling multi-threading/thread-safety properly.
There is a related bug for the Pymemcache library where a member of that team states that is up the application using Pymemcache to handle thread-safety (https://github.com/pinterest/pymemcache/issues/195#issuecomment-452523524). In this case the Django framework. This comment in Django's BaseMemcachedCache implementation indicates that that was the original intent: https://github.com/django/django/blob/main/django/core/cache/backends/memcached.py#L38
so i think PyMemcacheCache should handle this.
Example project that reproduces the error:
https://github.com/mvanderblom/django-memcached-bugreport
For us, this error prevents us from using the PyMemcacheCache backend and thus from upgrading to Django 4.1 when it gets released.
Attachments (4)
Change History (19)
by , 3 years ago
Attachment: | example.log added |
---|
comment:1 by , 3 years ago
Cc: | added |
---|
Thanks for the report. This can be also an issue with asgiref.local.Local
. Can you confirm that this issue still exists with asgiref==3.4.1
and on the Django's main
branch?
comment:2 by , 3 years ago
Description: | modified (diff) |
---|---|
Summary: | PyMemcacheCache backend fails when running as a wsgi application with gevent worker class → Add note regarding thread-safety when using PyMemcacheCache. |
Triage Stage: | Unreviewed → Accepted |
Type: | Bug → Cleanup/optimization |
So, yes, pymemcache.Client
is not thread-safe. We are using pymemcache.HashClient
so that we can support connections to multiple servers.
I note that pymemcache.PooledClient
is thread-safe according to the documentation. We can pass the use_pooling
flag to HashClient
. Unfortunately pymemcache
's documentation is a little sparse!
If I add 'OPTIONS': {'use_pooling': True}
to your CACHES
configuration in your reproducer the problem goes away for me.
Would you be prepared to open a PR with a tweak to the documentation? I already mentioned use_pooling
in at the end of the cache arguments section, so maybe we just need to amend the sentence before?
by , 3 years ago
Attachment: | use_pooling_true.log added |
---|
Logfile of a run with use_pooling set to true
comment:3 by , 3 years ago
Thanks for the quick reply! I've tried your suggested solution, but although les frequent (3 vs 31 times), the error still occurs.
I've verified that the config actually causes the HashClient to use a PooledClient and it does.
I've updated the example in the git repo and attached a log to this issue.
Do you guys have any clue as to what's happening here or any hints about what to do next?
comment:4 by , 3 years ago
That's interesting - I'll have to have another try later.
I don't know a great deal about this... Perhaps the issue is that PooledClient
is using threading.Lock
. See here.
As mentioned here some monkey patching needs to be done with gevent.monkey.patch_all()
, but it seems that is already handled by gunicorn for the gevent worker.
Perhaps you can try passing lock_generator
in OPTIONS
with a different lock type, e.g. see https://www.gevent.org/api/gevent.lock.html
comment:5 by , 3 years ago
I've tried passing gevent.lock.BoundedSemaphore as a lock_generator, but unfortunately that didn't make any difference. As you say, the gevent worker already calls monkey.patch_all() that should cause all usages of threading.[R]Lock to use the gevent variant. I'm still not sure how to proceed. Any further help would be much appreciated.
comment:6 by , 3 years ago
Hi Nick, did you get a chance to look at this issue again? I'm still in the dark as to how to resolve this issue.
For now this prevents me from using the suggested alternative to MemcachedCache and when time comes, it will prevent me (and probably others) to update to Django 4.1.
If you want me to investigate something, please let me know.
comment:7 by , 3 years ago
Summary: | Add note regarding thread-safety when using PyMemcacheCache. → PyMemcacheCache backend fails when running as a wsgi application with gevent worker class |
---|---|
Type: | Cleanup/optimization → Bug |
I really don't think this issue can be resolved with only changes to the docs. Again, please let me know if you want me to investigate something further
comment:8 by , 3 years ago
I'm not sure it is related, but I found CacheMiddleware is not thread-safe and provided a solution in #33252.
follow-up: 10 comment:9 by , 3 years ago
I can still reproduce the issue in Django 3.2.10 so it doesn't seem to be fixed by #33252
comment:10 by , 3 years ago
Replying to Martijn van der Blom:
I can still reproduce the issue in Django 3.2.10 so it doesn't seem to be fixed by #33252
That fix will not be backported to the 3.2.x series according to our backport policy.
It would be handy if you can run your test against the master branch to see if #33252 resolves the issue.
comment:11 by , 3 years ago
Will this be fixed in future releases of Django 4? I'm encountering the exact same issue with Django + mod_wsgi
Django==4.0
asgiref==3.4.1
pymemcache==3.5.0
Setting the "use_pooling" option helped, but didn't get rid of all the errors.
comment:12 by , 2 years ago
I've set up a simple project to reproduce this error here.
From my testing: This issue still occurs on django==4.0.7, but appears to be resolved on django==4.1.0. Perhaps this is the first version to include the fix from #33252? (I'm not sure how to tell which release(s) contains the fix made for that ticket).
comment:13 by , 14 months ago
Another issue I recently ran into is that pymemcache.base.client.PooledClient.gets
does not have the same signature as pymemcache.base.client.Client.gets
. The latter accepts default
as a keyword argument whereas the former does not.
When combined with the recommend configuration ignoring exceptions by setting ignore_exc: True
you can run into a situation where calling gets
there is an error that gets swallowed despite the value being in the cache and you get nothing back. This took a while to track down and debug.
comment:14 by , 10 months ago
From my testing: This issue still occurs on django==4.0.7, but appears to be resolved on django==4.1.0. Perhaps this is the first version to include the fix from #33252?
The #33252 merge commit is 3ff7b15bb79f2ee5b7af245c55ae14546243bb77. It is tagged from 4.1a1.
by , 5 weeks ago
Attachment: | 20241018-without-pooling.log added |
---|
by , 5 weeks ago
Attachment: | 20241018-with-pooling.log added |
---|
comment:15 by , 5 weeks ago
Resolution: | → worksforme |
---|---|
Status: | new → closed |
I've just tested this using the following:
python==3.9.20 memcached==1.6.31 django==4.2.16 gevent==24.10.2 gunicorn==23.0.0 pymemcache==4.0.0
I was unable to reproduce the original problem using the provided reproducer on multiple runs, with or without {"use_pooling": True}
, so it seems to have been resolved.
Replying to aseem-md:
Another issue I recently ran into is that
pymemcache.base.client.PooledClient.gets
does not have the same signature aspymemcache.base.client.Client.gets
. The latter acceptsdefault
as a keyword argument whereas the former does not.
When combined with the recommend configuration ignoring exceptions by setting
ignore_exc: True
you can run into a situation where callinggets
there is an error that gets swallowed despite the value being in the cache and you get nothing back. This took a while to track down and debug.
This is unrelated to the original issue. However, I'll note that there have been a bunch of improvements in pymemcache==4.0.0
such that the default is returned if from .get()
if ignore_exc
is True
and no server is available. See the release notes for details.
Log file that contains the log from a run of the example project. The log file contains the full stacktraces for this error.