Opened 3 years ago

Closed 3 months ago

#20892 closed New feature (fixed)

Allow configuring the memcached Client with CACHES 'OPTIONS'

Reported by: alex@… Owned by: Ed Morley
Component: Core (Cache system) Version: 1.5
Severity: Normal Keywords:
Cc: alex@…, shai@…, olavmrk@…, emorley@… Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

This appears to be an issue related to the fact that older releases of memcached insisted on a 1 MB value size limit.

Current releases of memcached have a user-configurable limit, however Django still uses a hard-wired 1 MB limit.

This (undocumented) 'feature' is currently causing errors, see the following two SO questions (one by me):

http://stackoverflow.com/questions/18143159/apparent-bug-storing-large-keys-in-django-memcached
http://stackoverflow.com/questions/11874377/django-caching-a-large-list

A 'solution' is proposed by the second SO question:

also edit this class in memcached.py located in /usr/lib/python2.7/dist-packages/django/core/cache/backends to look like this:

class MemcachedCache(BaseMemcachedCache):
"An implementation of a cache binding using python-memcached"
def __init__(self, server, params):
    import memcache
    memcache.SERVER_MAX_VALUE_LENGTH = 1024*1024*10 #added limit to accept 10mb
    super(MemcachedCache, self).__init__(server, params,
                                         library=memcache,
                                         value_not_found_exception=ValueError)

Perhaps this could be added to the config of the logger?

For example, an ITEM_SIZE_MAX key could be created for the dictionary, so the following code could be used:

CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.memcached.MemcachedCache',
        'ITEM_SIZE_MAX': 10,
        'LOCATION': 'localhost:11211',
    }
}

Obviously, the user would also have to add the necessary -I parameter to their /etc/memcached.conf (system dependent). I suppose the best would be if we could automatically detect the item size from their current memcached setup. For example:

from subprocess import Popen, Pipe
memcached_call = Popen("echo stats settings | nc localhost 11211", stdout=PIPE, stderr=PIPE, shell=True)
out, err = memcached_call.communicate()
if err:
    log("Error when getting memcached stats to find item size max: "+err+"\n. Assuming max size is 1 MB")
    maxsize_int = 1024*1024
else:
    values = out.split("\n")
    maxsize_field = filter(lambda field: field.startswith("STAT item_size_max"), values)[0]
    maxsize_int = int(maxsize_field.strip()[maxsize_field.rindex(" "):])

Change History (21)

comment:1 Changed 3 years ago by anonymous

import memcache
memcache.SERVER_MAX_VALUE_LENGTH = 1024*1024*10 #added limit to accept 10mb

Apparently this solution does NOT actually work... the value from the module is not propagated to the client properly.
I've reported the issue to python-memcache: https://github.com/linsomniac/python-memcached/issues/13

comment:2 Changed 3 years ago by Aymeric Augustin

As far as I can tell, Django itself doesn't contain any code to limit the size of cached items. (If it does, sorry, I missed it.)

Could you clarify your assertion that "Django still uses a hard-wired 1 MB limit."?

If the limitation is entirely handled by the memcache client library and server, I'm inclined to close this ticket as invalid.

comment:3 Changed 3 years ago by Shai Berger

Cc: shai@… added

If I understand correctly, the ticket complains that there is no sane way to configure the max-item-size for use in Django; this leads to Django apps "always" using the default 1M limit. For the suggested change to work, the user also needs to take care of their server (at least if using a larger value), and if the anonymous user is right, also fix a bug in the client lib, but the Django part is missing either way.

I think that is valid.

comment:4 Changed 3 years ago by Tim Graham

It's possible to accomplish this using a custom cache backend, but it may be nice to make this a bit easier:

import pickle

from django.core.cache.backends.memcached import MemcachedCache

MIN_COMPRESS_LENGTH = 1048576 # 1MB

class CustomMemcachedCache(MemcachedCache):

    @property
    def _cache(self):
        """
        Override to add pass some custom parameters to the python_memcached
        backend.
        """
        if getattr(self, '_client', None) is None:
            options = {}
            if self._options and 'SERVER_MAX_VALUE_LENGTH' in self._options:
                options['server_max_value_length'] = self._options['SERVER_MAX_VALUE_LENGTH']
            self._client = self._lib.Client(self._servers,
                pickleProtocol = pickle.HIGHEST_PROTOCOL, **options
            )
        return self._client

    def set(self, key, value, timeout=0, version=None):
        """
        Override to pass MIN_COMPRESS_LENGTH so large values are compressed.
        """
        key = self.make_key(key, version=version)
        self._cache.set(key, value, self._get_memcache_timeout(timeout), MIN_COMPRESS_LENGTH)

comment:5 Changed 3 years ago by anonymous

""If I understand correctly, the ticket complains that there is no sane way to configure the max-item-size for use in Django""

Precisely.

Django does not deliberately limit the size, but it's dependent modules do, without having their configuration options passed through.

comment:6 Changed 3 years ago by anonymous

Sorry guys, I just realised I really ought to make an account and stop posting anonymously... I posted this issue, and the anonymous comments are by me.

comment:7 Changed 3 years ago by AlexF124

Ok, python-memcache has gone ahead and fixed the bug, so the

import memcache
memcache.SERVER_MAX_VALUE_LENGTH = 1024*1024*10 #added limit to accept 10mb

would work from python-memcache's next release and onwards.

Last edited 3 years ago by AlexF124 (previous) (diff)

comment:8 Changed 3 years ago by Aymeric Augustin

I'm still unconvinced there's much value in channeling this configuration through Django's CACHES setting, but I'm not opposing it if another core dev wants it.

Last edited 3 years ago by Aymeric Augustin (previous) (diff)

comment:9 Changed 3 years ago by Shai Berger

Summary: Django memcached binding artificially limits item sizeadd maximum item size configuration to memcached cache backend
Triage Stage: UnreviewedAccepted
Type: BugNew feature

I'm not sure if that was Aymeric's point, but in any case, there should be a "don't do that unless you know what you're doing" warning on the use of this feature; memcached recommends strongly against it. I think as part of this, the entry should be named "FORCE_ITEM_MAX_SIZE" or something like it.

BTW, I think the most common cause for wanting it -- the only case I've seen up close -- is pushing big objects into a cached session. Perhaps this is something we should specifically recommend against.

comment:10 Changed 3 years ago by Tim Graham

Component: UncategorizedCore (Cache system)

comment:11 Changed 21 months ago by Olav Morken

Cc: olavmrk@… added

Hi,

I ran into this issue now, and have a few comments:

  • This problem only affects the django.core.cache.backends.memcached.MemcachedCache backend. A simple workaround may be to use the django.core.cache.backends.memcached.PyLibMCCache backend.
  • To fix this, one doesn't actually need to write to memcache.SERVER_MAX_VALUE_LENGTH variable. It can also be passed as an option to the constructor: [...].Client(self._servers, pickleProtocol=pickle.HIGHEST_PROTOCOL, server_max_value_length=max_value_size)

As for reasons to want this: In my case, I am fetching a blob of data from a remote web service that is 1.2 MiB. Caching that data blob saves an expensive API call.

comment:12 Changed 9 months ago by Tim Graham

Summary: add maximum item size configuration to memcached cache backendAllow configuring the memcached Client with CACHES 'OPTIONS'

Lets make this ticket a bit more generic after a request in #26301 to set another option (socket_timeout) in Client.__init__(). I'm thinking a key in 'OPTIONS' like CLIENT_OPTIONS which would be a dictionary passed to Client.__init__() as kwargs might work.

comment:14 Changed 9 months ago by Tim Graham

Has patch: set

comment:15 Changed 9 months ago by Tim Graham

Patch needs improvement: set

Please uncheck "patch needs improvement" after you update the patch, thanks!

comment:16 Changed 9 months ago by Maciej Dziardziel

Patch needs improvement: unset

comment:17 Changed 9 months ago by Tim Graham

Patch needs improvement: set

comment:18 Changed 3 months ago by Tim Graham <timograham@…>

In 606a303:

Fixed #27124 -- Excluded cull-related cache configs from memcached tests.

Since the cull and zero_cull test cache configs set MAX_ENTRIES
and CULL_FREQUENCY in OPTIONS, which are only intended for use with
the locmem, filesystem, and database backends. This prevents test
failures once refs #20892 is fixed.

comment:19 Changed 3 months ago by Ed Morley

Cc: emorley@… added
Owner: changed from nobody to Ed Morley
Patch needs improvement: unset
Status: newassigned

I've opened a new PR, which takes a slightly different approach:

  • The contents of OPTIONS is now passed directly to the client constructor
  • For pylibmc, behaviors are now passed inside the behaviors key within OPTIONS, rather than as top-level properties of OPTIONS. The old behavior is still supported for now, but generates deprecation warnings.

Discussion of alternative approaches (on the old PR):
https://github.com/django/django/pull/6233#issuecomment-192459369

The new PR:
https://github.com/django/django/pull/7160

comment:20 Changed 3 months ago by Tim Graham <timograham@…>

In fb8eea56:

[1.10.x] Fixed #27124 -- Excluded cull-related cache configs from memcached tests.

Since the cull and zero_cull test cache configs set MAX_ENTRIES
and CULL_FREQUENCY in OPTIONS, which are only intended for use with
the locmem, filesystem, and database backends. This prevents test
failures once refs #20892 is fixed.

Backport of 606a303856afee684563f9349c2a55578854f1ba from master

comment:21 Changed 3 months ago by Tim Graham <timograham@…>

Resolution: fixed
Status: assignedclosed

In 65ec8fa:

Fixed #20892 -- Allowed configuring memcached client using OPTIONS.

Previously, the MemcachedCache backend ignored OPTIONS and
PyLibMCCache used them to set pylibmc behaviors. Both backends now
pass OPTIONS as keyword arguments to the client constructors.

Note: See TracTickets for help on using tickets.
Back to Top