Code

Opened 4 years ago

Closed 21 months ago

#14093 closed Bug (fixed)

Confusing error when failing to create a session key because the cache is unavailable

Reported by: szymon@… Owned by: aaugustin
Component: contrib.sessions Version: 1.2
Severity: Normal Keywords: session, cache, session key
Cc: alanjds Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: yes UI/UX: no

Description

Hi,

We are using Django 1.2.1, cache session backend and from time to time, on higher traffic (but not so much, ~600 online users in 5 minutes) a few requests ends with exception:

Traceback (most recent call last):

File "/usr/lib64/python2.6/site-packages/django/core/handlers/base.py", line 80, in get_response

response = middleware_method(request)

File "/home/wonderlife/app/request.py", line 78, in process_request

if request.user.is_authenticated():

File "/usr/lib64/python2.6/site-packages/django/contrib/auth/middleware.py", line 9, in get

request._cached_user = get_user(request)

File "/usr/lib64/python2.6/site-packages/django/contrib/auth/init.py", line 97, in get_user

user_id = request.session[SESSION_KEY]

File "/usr/lib64/python2.6/site-packages/django/contrib/sessions/backends/base.py", line 46, in getitem

return self._session[key]

File "/usr/lib64/python2.6/site-packages/django/contrib/sessions/backends/base.py", line 172, in _get_session

self._session_cache = self.load()

File "/usr/lib64/python2.6/site-packages/django/contrib/sessions/backends/cache.py", line 16, in load

self.create()

File "/usr/lib64/python2.6/site-packages/django/contrib/sessions/backends/cache.py", line 33, in create

raise RuntimeError("Unable to create a new session key.")

RuntimeError: Unable to create a new session key.

Attachments (2)

django-uuid-session-keys.diff (23.8 KB) - added by tomevans223 3 years ago.
Patch to replace md5 session key generation with uuid session key generation. Includes verbatim copy of uuid.py from python 2.6 to fix issue that uuid is not in std libs in python 2.4
trac#14093-cache-session.diff (631 bytes) - added by stumbles 23 months ago.
Extend exception to explain that cache may be unavailable.

Download all attachments as: .zip

Change History (23)

comment:1 Changed 4 years ago by dmoisset

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Triage Stage changed from Unreviewed to Accepted

This seems to be related to #8311, which was fixed in [8620]

        # Because a cache can fail silently (e.g. memcache), we don't know if
        # we are failing to create a new session because of a key collision or
        # because the cache is missing. So we try for a (large) number of times
        # and then raise an exception. That's the risk you shoulder if using
        # cache backing.
        for i in xrange(10000):

Anyway, that magic 10000 sounds like a code smell at least, probably the value should be adjustable for different server loads?

Flagging as accepted, at least a comment explaining the consecuences of that 10000 (what makes it different from 1000 or 100000?) should be added

comment:3 Changed 3 years ago by tomevans223

  • Has patch set

Replacing md5 session key with uuid session key would result in virtually zero chance of key collision.

See attached patch.

comment:4 Changed 3 years ago by timo

  • Patch needs improvement set

Tom, I applied your patch and ran the tests for a quick sanity check. Looks like one issue is that django.contrib.sessions.backends.file only allows the following characters in cache keys: VALID_KEY_CHARS = set("abcdef0123456789"). Not sure if we can modify that set or not.

comment:5 Changed 3 years ago by tomevans223

Ah, I didn't run a test run, thanks for that. I don't see a problem with extending that set to include '-', or (less preferably) we could process the generated uuid to remove the '-', but that would leave it less recognizable as a uuid.

I'll update the patch (and re-run tests!) when I'm home from work.

Changed 3 years ago by tomevans223

Patch to replace md5 session key generation with uuid session key generation. Includes verbatim copy of uuid.py from python 2.6 to fix issue that uuid is not in std libs in python 2.4

comment:6 Changed 3 years ago by tomevans223

  • Patch needs improvement unset

Updated patch does pass tests. I've had a good read of the docs, and cannot find anywhere where the format of the session key is discussed, so I don't think there are any doc patches required.

comment:7 Changed 3 years ago by PaulM

The VALID_KEY_CHARS setting was introduced to sanitize user input for security.

Your proposal to switch from MD5 to UUID4 isn't addressing the core of the problem. You'd be going from a 512 bit random(ish) number to a 113 bit random number. This really isn't going to fix the problem, which is that your cache is timing out. We need to address that directly.

See also my post on django-dev:

http://groups.google.com/group/django-developers/msg/61b51319646eb5ca

Version 0, edited 3 years ago by PaulM (next)

comment:8 Changed 3 years ago by tomevans223

There is no problem with changing VALID_KEY_CHARS to include '-', the purpose of it is to sanitize user input to prevent directory traversals.

Other comments replied to on thread.

comment:9 Changed 3 years ago by patchhammer

  • Easy pickings unset
  • Patch needs improvement set
  • Severity set to Normal
  • Type set to Uncategorized

django-uuid-session-keys.diff fails to apply cleanly on to trunk

comment:10 Changed 3 years ago by julien

  • Type changed from Uncategorized to Bug

comment:11 Changed 3 years ago by clay

  • UI/UX unset

I'm experiencing the "Unable to create a new session key" RuntimeError with a client's application. Since it's a new application with moderate use, I do not believe that it is load-related. We have applied the uuid4() patch mentioned here to no avail.

While using tcpdump to diagnose potential problems communicating with memcached, I observed the following flow:

app server -> memcached: GET <key from cookie>
memcached -> app server: END  (indicates nothing stored for this key)
app server -> memcached: SET <key from cookie> ... (the packet capture is truncated here so I don't know what value is being set)

What's interesting is that this flow seems impossible from reading the code. What I expect to happen (and have confirmed by manual testing) is that if the GET returns nothing, we then see another GET for a new key, followed by an ADD for that new key. That flow matches this section of code (from django.contrib.sessions.backends.cache.SessionStore):

    def create(self):
        for i in xrange(10000):
            self.session_key = self._get_new_session_key()  # GET
            try:
                self.save(must_create=True)  # ADD

I'm at a loss for how to explain the memcached communication I see given the RuntimeError traceback reported when this occurs on the live site. Nowhere in the code do I see where a SET should occur after a missed GET, or why this should result in a RuntimeError being raised. Any ideas?

comment:12 Changed 3 years ago by clay

  • Cc clay added
  • Has patch unset
  • Patch needs improvement unset

Unsetting 'Has patch' flag since the patch does not address the cause of the error.

comment:13 Changed 3 years ago by timo

What version of memcached are you using? I saw this problem on memcached 1.2.2. Since upgrading to 1.4.5 (which runs multithreaded by default I believe) I haven't seen the issue.

comment:14 Changed 3 years ago by clay

We're running 1.4.2.

I've looked at some more tcpdump traces, and have observed this exception occurring as a result of packet loss within Amazon EC2's network. If the SYN packet from app server to memcache server is lost, python-memcache times out before the kernel can retransmit it. All subsequent cache queries fail as python-memcache has marked the host dead, leading to the RuntimeError after 10,000 attempts to set a new session key.

I'm at a loss for how to fix this. I'd like to tune Linux's TCP retransmit timer lower than three seconds, but it's not tunable. Failing that, I'd like to have python-memcache retry a connection a couple times before giving up, but it won't. I see that pylibmc has configurable connect and retry behavior, but I haven't been able to find a combination of options that works around the packet loss.

comment:15 Changed 3 years ago by alanjds

  • Cc alanjds added

comment:16 Changed 2 years ago by aaugustin

  • Easy pickings set
  • Summary changed from Unable to create a new session key on higher traffic to Confusing error when failing to create a session key because the cache is unavailable

I'm updating the the summary to reflect the fact that this problem isn't related to load.

Django can't do much if the cache doesn't work, or if the network loses enough packets to make the cache unusable.

On the other hand, Django should provide a more informative error message when this situation occurs. This is easy.

comment:17 Changed 2 years ago by clay

  • Cc clay removed

comment:18 Changed 2 years ago by myusuf3

@aaugustin what needs to be done here? Just improve the error message?

Changed 23 months ago by stumbles

Extend exception to explain that cache may be unavailable.

comment:19 Changed 23 months ago by stumbles

  • Has patch set
  • Owner changed from nobody to stumbles
  • Status changed from new to assigned

I've updated the exception message in the attached patch.

comment:20 Changed 21 months ago by aaugustin

  • Owner changed from stumbles to aaugustin
  • Status changed from assigned to new

comment:21 Changed 21 months ago by Aymeric Augustin <aymeric.augustin@…>

  • Resolution set to fixed
  • Status changed from new to closed

In 98032f67c725e257bd3c53374ff0ee22e2c77d7c:

Fixed #14093 -- Improved error message in the cache session backend.

Thanks stumbles for the patch.

Add Comment

Modify Ticket

Change Properties
<Author field>
Action
as closed
as The resolution will be set. Next status will be 'closed'
The resolution will be deleted. Next status will be 'new'
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.