#35166 closed New feature (wontfix)
Return python-memcached support
Reported by: | Matej Spiller Muys | Owned by: | nobody |
---|---|---|---|
Component: | Core (Cache system) | Version: | 5.0 |
Severity: | Normal | Keywords: | memcached performance get_many |
Cc: | Nick Pope | Triage Stage: | Unreviewed |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description (last modified by )
In the past python-memcached was removed from Django (4.1).
https://docs.djangoproject.com/en/5.0/releases/4.1/
However https://github.com/linsomniac/python-memcached/releases was resurected with 3 released in past month solving both hacks needed in Django:
https://github.com/linsomniac/python-memcached/issues/159
https://github.com/linsomniac/python-memcached/issues/170
Commit that deleted python-memcached:
https://github.com/django/django/commit/05f3a6186efefc9fca2204a745b992501c6fd91f
Why is python-memcached interesting today is because of HUUUGE performance degradation:
https://github.com/pinterest/pymemcache/issues/478
Basically pymemcache get_many is very slow because it send the request to first sever wait for response, then to second server waiting for response.
But python-memcached first sends the requests to all servers and then waits for responses making it number of memcached servers times faster.
Change History (8)
comment:1 by , 9 months ago
Description: | modified (diff) |
---|
comment:2 by , 9 months ago
comment:3 by , 9 months ago
Cc: | added |
---|---|
Component: | Uncategorized → Core (Cache system) |
Resolution: | → wontfix |
Status: | new → closed |
Thanks for this ticket, however I don't think we would like to re-add python-memcached
support.
Basically pymemcache get_many is very slow because it send the request to first sever wait for response, then to second server waiting for response.
But python-memcached first sends the requests to all servers and then waits for responses making it number of memcached servers times faster.
This should be fixed in pymemcache
, we cannot add/remove memcached
libraries every 2 releases.
You can follow the triaging guidelines with regards to wontfix tickets and take this to DevelopersMailingList, if you don't agree.
comment:4 by , 9 months ago
The architecture of pymemcache is not allowing such flow without major refactoring since they splited multi server logic into separated logic making it hard to fix it.
That's why the issue was not solved and does not look to be solved in the near future unfourtanetly. I tried refactoring of pymemcache but it is really hard and would take considerable amount of development.
I understand you cannot add/remove memcached libraries every 2 releases but still current implementation of pymemcache is multiple times slower when using multiple servers. It's was a huge regression in Django memcache (we use 4 server so 4x times slower get_many).
The interface of python-memcached interface is now aligned with pymemcache so no longer need for hacks.
comment:5 by , 9 months ago
FYI: I originally added support for pymemcache
and suggested the deprecation and removal of python-memcached
as it was unmaintained for years.
I confess I was curious recently when I saw that python-memcached
had a new release (v1.60) that solved many of the problems we originally had. Unfortunately, it also broke another workaround that Django required which made it inconsistent with other client libraries. I managed to get that fixed earlier this year and it was released as v1.62.
I even knocked up a commit to see what it would look like to reintroduce this backend: https://github.com/ngnpope/django/commit/python-memcached
It's actually very, very little. The question is - is it worth reintroducing given the history of poor maintenance and inconsistency? We wouldn't want to add this again if we're going to have to rip it out again, as Mariusz says... Anyway - I'm on the fence with putting this back in - let's say +0.1!
If you really need to use python-memcached>=1.62
you can write your own backend which is as simple as this:
import pickle from django.core.cache.backends.memcached import BaseMemcachedCache class MemcachedCache(BaseMemcachedCache): """An implementation of a cache binding using python-memcached.""" def __init__(self, server, params): import memcache super().__init__( server, params, library=memcache, value_not_found_exception=ValueError ) self._options = {"pickleProtocol": pickle.HIGHEST_PROTOCOL} | self._options
comment:6 by , 9 months ago
Why is python-memcached interesting today is because of HUUUGE performance degradation:
https://github.com/pinterest/pymemcache/issues/478
I forgot point out that this is not making a persuasive argument when the linked issue that you have written contains no performance comparison.
comment:7 by , 9 months ago
We already have our own backend since Django 4.1 because of performance issues.
I forgot point out that this is not making a persuasive argument when the linked issue that you have written contains no performance comparison.
It is really simple to reproduce/mesaure. We have 4 memcached servers on AWS each having around 1 ms roundtrip.
Now if get_many needs to fetch from all 4, it takes 4 ms for pymemcache (doing 4x loop of send & receive) and 1 ms for python-memcached (4x send then 4x receive ...). I didn't write the measurements at that time because it was clear to the author of pymemcache what the problem is and that is would be solved if refactored the same way as python-memcached logic. Basically you shouldn't wait for response after each send before sending to another server (but hash.py in pymemcache just loops around base.py code making it harder to optimise).
The author of pymemcache is using mcrouter that works the same as python-memcached (first 4x send and then 4x receive). We don't want to use mcrouter, it would help but still adds a bit of extra hop latency or as alternative deploy it as side car with extra deploy complexity.
comment:8 by , 7 months ago
Besides the performance aspect, there is also the silenting of errors that python-memcached does, while the alternatives do not.
Also somewhat related to this issue #33092, which also talks about threading issues and errors due to that.
So at this time there is no drop-in replacement for the former default cache-backend that just works for everyone.
In any case, thanks for the short code for the own backend, now the cache errors are silent again.
We are using it as follows with latest (it should be the minimum to remove all the hacks):
python-memcached==1.62