Opened 4 years ago
Closed 13 months ago
#33092 closed Bug (worksforme)
PyMemcacheCache backend fails when running as a wsgi application with gevent worker class
| Reported by: | Martijn van der Blom | Owned by: | nobody |
|---|---|---|---|
| Component: | Core (Cache system) | Version: | 3.2 |
| Severity: | Normal | Keywords: | |
| Cc: | Andrew Godwin, Nick Pope | Triage Stage: | Accepted |
| Has patch: | no | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description (last modified by )
For thread-safety when using pymemcache the option 'use_pooling': True can be passed via OPTIONS which will make pymemcache.HashClient use pymemcache.PooledClient instead of pymemcache.Client internally.
Some documentation should be added or improved.
In our application we were using the MemcachedCache backend to connect to a memcached server. Since this backend will be removed in Django 4.1 we thought we'd migrate to the alternative PyMemcacheCache backend as suggested in the Django documentation at: https://docs.djangoproject.com/en/3.2/topics/cache/
After upgrading we encountered several errors when running the application in gunicorn with the gevent worker class.
Summary of errors:
- gevent._socketcommon.cancel_wait_ex: [Errno 9] File descriptor was closed in another greenlet
- gevent.exceptions.ConcurrentObjectUseError: This socket is already used by another greenlet: <bound method Waiter.switch of <gevent._gevent_c_waiter.Waiter object at 0x7f8e77b00a40>>
- OSError: [Errno 9] Bad file descriptor
These errors seem to be related to either the Django backend implementation or Pymemcache not handling multi-threading/thread-safety properly.
There is a related bug for the Pymemcache library where a member of that team states that is up the application using Pymemcache to handle thread-safety (https://github.com/pinterest/pymemcache/issues/195#issuecomment-452523524). In this case the Django framework. This comment in Django's BaseMemcachedCache implementation indicates that that was the original intent: https://github.com/django/django/blob/main/django/core/cache/backends/memcached.py#L38
so i think PyMemcacheCache should handle this.
Example project that reproduces the error:
https://github.com/mvanderblom/django-memcached-bugreport
For us, this error prevents us from using the PyMemcacheCache backend and thus from upgrading to Django 4.1 when it gets released.
Attachments (4)
Change History (19)
by , 4 years ago
| Attachment: | example.log added |
|---|
comment:1 by , 4 years ago
| Cc: | added |
|---|
Thanks for the report. This can be also an issue with asgiref.local.Local. Can you confirm that this issue still exists with asgiref==3.4.1 and on the Django's main branch?
comment:2 by , 4 years ago
| Description: | modified (diff) |
|---|---|
| Summary: | PyMemcacheCache backend fails when running as a wsgi application with gevent worker class → Add note regarding thread-safety when using PyMemcacheCache. |
| Triage Stage: | Unreviewed → Accepted |
| Type: | Bug → Cleanup/optimization |
So, yes, pymemcache.Client is not thread-safe. We are using pymemcache.HashClient so that we can support connections to multiple servers.
I note that pymemcache.PooledClient is thread-safe according to the documentation. We can pass the use_pooling flag to HashClient. Unfortunately pymemcache's documentation is a little sparse!
If I add 'OPTIONS': {'use_pooling': True} to your CACHES configuration in your reproducer the problem goes away for me.
Would you be prepared to open a PR with a tweak to the documentation? I already mentioned use_pooling in at the end of the cache arguments section, so maybe we just need to amend the sentence before?
by , 4 years ago
| Attachment: | use_pooling_true.log added |
|---|
Logfile of a run with use_pooling set to true
comment:3 by , 4 years ago
Thanks for the quick reply! I've tried your suggested solution, but although les frequent (3 vs 31 times), the error still occurs.
I've verified that the config actually causes the HashClient to use a PooledClient and it does.
I've updated the example in the git repo and attached a log to this issue.
Do you guys have any clue as to what's happening here or any hints about what to do next?
comment:4 by , 4 years ago
That's interesting - I'll have to have another try later.
I don't know a great deal about this... Perhaps the issue is that PooledClient is using threading.Lock. See here.
As mentioned here some monkey patching needs to be done with gevent.monkey.patch_all(), but it seems that is already handled by gunicorn for the gevent worker.
Perhaps you can try passing lock_generator in OPTIONS with a different lock type, e.g. see https://www.gevent.org/api/gevent.lock.html
comment:5 by , 4 years ago
I've tried passing gevent.lock.BoundedSemaphore as a lock_generator, but unfortunately that didn't make any difference. As you say, the gevent worker already calls monkey.patch_all() that should cause all usages of threading.[R]Lock to use the gevent variant. I'm still not sure how to proceed. Any further help would be much appreciated.
comment:6 by , 4 years ago
Hi Nick, did you get a chance to look at this issue again? I'm still in the dark as to how to resolve this issue.
For now this prevents me from using the suggested alternative to MemcachedCache and when time comes, it will prevent me (and probably others) to update to Django 4.1.
If you want me to investigate something, please let me know.
comment:7 by , 4 years ago
| Summary: | Add note regarding thread-safety when using PyMemcacheCache. → PyMemcacheCache backend fails when running as a wsgi application with gevent worker class |
|---|---|
| Type: | Cleanup/optimization → Bug |
I really don't think this issue can be resolved with only changes to the docs.
comment:8 by , 4 years ago
I'm not sure it is related, but I found CacheMiddleware is not thread-safe and provided a solution in #33252.
follow-up: 10 comment:9 by , 4 years ago
I can still reproduce the issue in Django 3.2.10 so it doesn't seem to be fixed by #33252
comment:10 by , 4 years ago
Replying to Martijn van der Blom:
I can still reproduce the issue in Django 3.2.10 so it doesn't seem to be fixed by #33252
That fix will not be backported to the 3.2.x series according to our backport policy.
It would be handy if you can run your test against the master branch to see if #33252 resolves the issue.
comment:11 by , 4 years ago
Will this be fixed in future releases of Django 4? I'm encountering the exact same issue with Django + mod_wsgi
Django==4.0
asgiref==3.4.1
pymemcache==3.5.0
Setting the "use_pooling" option helped, but didn't get rid of all the errors.
comment:12 by , 3 years ago
I've set up a simple project to reproduce this error here.
From my testing: This issue still occurs on django==4.0.7, but appears to be resolved on django==4.1.0. Perhaps this is the first version to include the fix from #33252? (I'm not sure how to tell which release(s) contains the fix made for that ticket).
comment:13 by , 2 years ago
Another issue I recently ran into is that pymemcache.base.client.PooledClient.gets does not have the same signature as pymemcache.base.client.Client.gets. The latter accepts default as a keyword argument whereas the former does not.
When combined with the recommend configuration ignoring exceptions by setting ignore_exc: True you can run into a situation where calling gets there is an error that gets swallowed despite the value being in the cache and you get nothing back. This took a while to track down and debug.
comment:14 by , 22 months ago
From my testing: This issue still occurs on django==4.0.7, but appears to be resolved on django==4.1.0. Perhaps this is the first version to include the fix from #33252?
The #33252 merge commit is 3ff7b15bb79f2ee5b7af245c55ae14546243bb77. It is tagged from 4.1a1.
by , 13 months ago
| Attachment: | 20241018-without-pooling.log added |
|---|
by , 13 months ago
| Attachment: | 20241018-with-pooling.log added |
|---|
comment:15 by , 13 months ago
| Resolution: | → worksforme |
|---|---|
| Status: | new → closed |
I've just tested this using the following:
python==3.9.20 memcached==1.6.31 django==4.2.16 gevent==24.10.2 gunicorn==23.0.0 pymemcache==4.0.0
I was unable to reproduce the original problem using the provided reproducer on multiple runs, with or without {"use_pooling": True}, so it seems to have been resolved.
Replying to aseem-md:
Another issue I recently ran into is that
pymemcache.base.client.PooledClient.getsdoes not have the same signature aspymemcache.base.client.Client.gets. The latter acceptsdefaultas a keyword argument whereas the former does not.
When combined with the recommend configuration ignoring exceptions by setting
ignore_exc: Trueyou can run into a situation where callinggetsthere is an error that gets swallowed despite the value being in the cache and you get nothing back. This took a while to track down and debug.
This is unrelated to the original issue. However, I'll note that there have been a bunch of improvements in pymemcache==4.0.0 such that the default is returned if from .get() if ignore_exc is True and no server is available. See the release notes for details.
Log file that contains the log from a run of the example project. The log file contains the full stacktraces for this error.