[[TOC]]

= CSRF Protection =

This page aims to document and discuss CSRF protection for Django.

== Background ==

For general discussion of the CSRF issue see:
 
 * [http://en.wikipedia.org/wiki/CSRF Wikipedia on CSRF]
 * [http://www.adambarth.com/papers/2008/barth-jackson-mitchell-b.pdf Robust Defenses for Cross-Site Request Forgery (Barth, Jackson, Mitchell)] - a very important paper; the terminology used here follows that paper.

== Discussions ==

For the record, the most relevant discussions on django-developers are listed here, most recent first. This page attempts to supersede them all by gathering the conclusions.

 * Discussion of #9977 patch csrf_template_tag_14.diff - http://groups.google.com/group/django-developers/browse_thread/thread/c23f556b88cedbc7?pli=1
 * Luke Plant's proposal - http://groups.google.com/group/django-developers/browse_thread/thread/ae525f270ed46933/c338b050f43741b2?lnk=gst
 * Simon Willison's proposal - SafeForm - http://groups.google.com/group/django-developers/browse_thread/thread/2c33621003992d07/61a9fefb50d662b0

== Django tickets ==

There is a lot of discussion in these tickets as well, which are also superseded by this page.

 * #9977 (most up to date)
 * #10816 (remove session dependency from Django 1.0 middleware)


== Types of attack ==

The following different attacks are in scope:

 1. 'CSRF': normal CSRF attacks, where an attacking site hosts a form, link, or piece of javascript that causes the user's browser to make a request on a (Django) site.  Normally this involves abuse of the user agent's authentication (e.g. a session/login cookie) to cause an action that the attacking site would not otherwise have permission to do.
 2. 'Login CSRF': this is when a browser is tricked into logging into a site under an account that is controlled by an attacker, so that the attacker can then track or otherwise abuse the actions performed by the victim.
 3. 'CSRF + MITM': this is a combination of CSRF and an active network (man-in-the-middle) attack, which is relevant in HTTPS situations that are otherwise thought to be invulnerable to MITM attacks.  This is because 'Set-Cookie' headers received over plain HTTP are accepted by clients that are talking to a site under HTTPS, allowing a MITM attacker to set cookies on the client. (MITM attacks under HTTP are considered out of scope, because in general MITM attacks under HTTP are impossible to protect against).

Further, attacks can be categorised as:

 1. Cross site  - attacker.com attacking victim.com
 2. Cross sub-domain - attacker.example.com attacking victim.example.com

There is also an attack related to 'CSRF + MITM' which is not necessarily 'CSRF', called 'session fixation'.  It is the opposite of session theft: a malicious agent sets the session cookie on a user's machine, giving the victim the attacker's session.  The attacker may or may not be 'authenticated' in that session - if not, the victim might authenticate, and the attacker may be able to take control of the victim's account or abuse the victim's authentication.  Alternatively, if the attacker is already authenticated, there are the same problems as those present in 'login CSRF'.  This type of attack can be achieved by an active network attacker (not in scope for HTTP connections), but it can also be achieved by someone with control of a sub-domain through the use of wild-card cookies.

(Question: if the sub-domain is unable to send cookies, is this attack foiled - or do you need to stop javascript as well?  The answer depends on browser policies about cookie setting, what happens when a page does {{{document.domain = 'example.com';}}} ?)
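
The wild-card cookie mechanism behind the cross sub-domain variant can be sketched as follows. This is a simplified illustration of cookie domain matching, not a full implementation of browser cookie policy; the helper name `cookie_sent_to` is hypothetical, and IP addresses and public-suffix rules are ignored.

```python
def cookie_sent_to(cookie_domain, request_host):
    """Return True if a cookie scoped to cookie_domain is sent to request_host.

    Simplified domain matching: the host must equal the cookie domain
    or be a sub-domain of it.
    """
    cookie_domain = cookie_domain.lstrip(".").lower()
    request_host = request_host.lower()
    return request_host == cookie_domain or request_host.endswith("." + cookie_domain)


# A page on attacker.example.com sets a cookie scoped to the parent domain...
wildcard_reaches_victim = cookie_sent_to(".example.com", "victim.example.com")
# ...whereas a cookie scoped to the attacker's own host stays there.
host_only_reaches_victim = cookie_sent_to("attacker.example.com", "victim.example.com")
```

A cookie scoped to the shared parent domain reaches the victim site, which is what lets a sub-domain plant a session cookie there.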

== Methods of defence ==

The main contenders are:

 * 'HMAC of session identifier'.  This was provided by Django 1.0 in !CsrfMiddleware and in 1.1 in !CsrfViewMiddleware, and is referred to as the 'CSRF token'.  All incoming POST requests that have an active session are required to have a CSRF token that is a hash of the session identifier and the site's SECRET_KEY.
 
 * 'session independent nonce'.  A random value is stored in a cookie, unique to every user, and POST forms must contain the same value as a token.

 * 'Strict Referer checking'. The HTTP 'Referer' header must be present and must match the site's domain for the request to be accepted (for POST requests).

== Methods of token insertion ==

 * Django 1.0 provided !CsrfMiddleware and 1.1 provided !CsrfResponseMiddleware, which automatically inserted the token into all POST forms in outgoing pages.  This has several problems:
   * The performance hit from doing the post processing.
   * It removes the ability to do streaming of responses (should that become a possibility in future Django versions)
   * It adds the CSRF token to all POST forms, including those targeted at external sites.  These sites would then gain access to the CSRF token and would be able to do CSRF attacks on that user. (This can be avoided by use of the `@csrf_response_exempt` decorator if the page has no internal forms, but that might be an unacceptable constraint, and the default behaviour opens up vulnerabilities easily).  To put it simply, control over token insertion is on a page by page basis, when it needs to be form by form.

 * !SafeForm - a Django Form subclass that adds the token.  Proposal abandoned in favour of template tag for various reasons.

 * Template tag - templates that need the CSRF token insert it manually, using `{% load csrf %}` at the top and `{% csrf_token %}` in every `<form>`.  This also requires use of a context processor (usually via !RequestContext).  This method is assumed for the rest of the document.
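
The page-by-page versus form-by-form problem with response post-processing can be illustrated with a minimal sketch. The regular expression and the `insert_token` helper are illustrative assumptions, not the actual middleware code, and the hidden field name is used only as an example.

```python
import re

# Matches the opening tag of any form whose method is POST (case-insensitive).
# Illustrative pattern only; the real middleware's pattern may differ.
_POST_FORM_RE = re.compile(
    r"(<form\b[^>]*\bmethod\s*=\s*['\"]?post['\"]?[^>]*>)", re.IGNORECASE
)


def insert_token(html, token):
    """Append a hidden token field after every POST <form> tag in html."""
    hidden = '<input type="hidden" name="csrfmiddlewaretoken" value="%s" />' % token
    return _POST_FORM_RE.sub(lambda m: m.group(1) + hidden, html)


page = (
    '<form method="post" action="/comment/"></form>'
    '<form method="post" action="http://other.example.org/pay/"></form>'
)
rewritten = insert_token(page, "abc123")
# Count how many forms received the token, regardless of their target.
leaked = rewritten.count('value="abc123"')
```

Because the post-processor sees only the finished page, both forms receive the token, including the one that posts to an external site. A template tag leaves that decision to the template author for each form.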

== Evaluation ==

Each of the methods of defence will be evaluated against the possible attacks.

=== Django 1.0 - HMAC of session identifier ===

 1. CSRF: 
   * if using Django's session framework as the basis for authorisation: '''protected'''
   * otherwise: '''not protected''' (the middleware provides no protection if there isn't an active session)
 2. Login CSRF:
   * if using Django's built-in session framework and login routines, you are '''protected''' (albeit accidentally).  This is because the login views in Django create a session when the login form is first accessed.  When the form is filled in and POSTed back, the view checks for the existence of a test value in the session (to check whether the session cookie has been accepted by the browser).  If not found, you receive the message: "Looks like your browser isn't configured to accept cookies. Please enable cookies, reload this page, and try again".  Hence, login CSRF doesn't work - the session is established in the step before logging in, and is required for login to succeed.
   * other login methods are '''not protected'''.
 3. CSRF + MITM, under HTTPS: '''protected'''
   * The CSRF token is tied to the session, so the MITM will not be able to fake the token and abuse the user's session.
   * but related session fixing vulnerabilities are not protected, essentially due to a flaw in cookies (see Barth et al).
 4. Cross sub-domain CSRF: same as normal CSRF (1) above
 5. Cross sub-domain login CSRF: same as normal login CSRF (2) above
 6. Cross sub-domain session fixing: '''not protected''' 
   * sub-domains will be able to send a wild-card session cookie to clients, giving them the attacker's session.

Additional issues:

 * it can cause false positives when the session identifier changes.  This can happen if the user has more than one tab open, and in one tab does something that causes the session identifier to change, and then submits a form that is open in another tab.

 * it is tied to the session framework, which means it's not re-usable for sites that don't use Django sessions, or have alternative login strategies.
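
The token derivation and the false positive described above can be sketched as follows. This is a minimal illustration, assuming HMAC-SHA256 as the hash; the helper names `make_token` and `check_post` and the placeholder secret are hypothetical, and the actual Django 1.0 construction may differ.

```python
import hashlib
import hmac

SECRET_KEY = "not-a-real-secret"  # stands in for the site's SECRET_KEY


def make_token(session_id):
    """Derive the CSRF token from the session identifier and the secret."""
    return hmac.new(
        SECRET_KEY.encode(), session_id.encode(), hashlib.sha256
    ).hexdigest()


def check_post(session_id, submitted_token):
    """Accept a POST only if its token matches the current session."""
    if session_id is None:
        # No active session: this scheme performs no check at all.
        return True
    return hmac.compare_digest(make_token(session_id), submitted_token or "")


# A form rendered in one tab under the old session identifier...
stale_token = make_token("session-old")
accepted_before = check_post("session-old", stale_token)
# ...is rejected once another tab has caused the identifier to change.
accepted_after = check_post("session-new", stale_token)
# Without a session, any POST is let through.
accepted_without_session = check_post(None, None)
```

The second check fails even though the user is legitimate, which is the false positive; the last line shows why sites without an active session get no protection.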

=== Session independent nonce ===

 1. CSRF: '''protected'''
 2. Login CSRF: '''protected''' (as long as both the CSRF token and the cookie are required for all POST requests)
 3. CSRF + MITM, under HTTPS: '''not protected'''
   * The attacker can set the CSRF cookie using Set-Cookie, and then supply a matching token in the POST form data.  Since the site does not tie the session cookies to the CSRF cookies, it has no way of determining that the CSRF token + cookie are genuine (doing hashing etc. of one of them will not work, as the attacker can just get a valid pair from the site directly, and use that pair in the attack).
 4. Cross sub-domain CSRF: '''not protected'''
   * The sub-domain can simply send a wild-card cookie to set the CSRF cookie, and include the corresponding token in the form.
 5. Cross sub-domain login CSRF: '''not protected'''
   * Same reason as (4)
 6. Cross sub-domain session fixing: '''not protected''' 
   * sub-domains will be able to send a wild-card session cookie to clients, giving them the attacker's session.
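
The difference between the cross-site case and the cookie-setting cases can be sketched as follows. This is a minimal illustration of the cookie/token comparison; the cookie name `csrftoken`, the field name `csrfmiddlewaretoken` and the helper names are assumptions made for the example.

```python
import hmac
import secrets


def new_nonce():
    """Generate the random value stored in the CSRF cookie."""
    return secrets.token_hex(16)


def check_post(cookies, post_data):
    """Accept a POST only if the form token equals the CSRF cookie."""
    cookie = cookies.get("csrftoken")
    token = post_data.get("csrfmiddlewaretoken")
    if not cookie or not token:
        return False
    return hmac.compare_digest(cookie, token)


# Cross-site attacker: cannot read the victim's cookie, so must guess the token.
victim_cookies = {"csrftoken": new_nonce()}
cross_site = check_post(victim_cookies, {"csrfmiddlewaretoken": "guess"})

# Sub-domain or MITM attacker: first overwrites the cookie with a known value
# (the effect of a wild-card or injected Set-Cookie), then submits that value.
attacker_value = new_nonce()
victim_cookies["csrftoken"] = attacker_value
cookie_setter = check_post(victim_cookies, {"csrfmiddlewaretoken": attacker_value})
```

The server only compares two values that both arrive from the client, so an attacker who can write the cookie controls both sides of the comparison.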

=== Strict Referer checking ===

 1. CSRF: '''protected'''
 2. Login CSRF: '''protected'''
 3. CSRF + MITM, under HTTPS: '''protected'''
   * but related session fixing vulnerabilities are not protected, essentially due to a flaw in cookies (see Barth et al).
 4. Cross sub-domain CSRF: '''protected'''
 5. Cross sub-domain login CSRF: '''protected'''
 6. Cross sub-domain session fixing: '''not protected''' (out of scope)

The big problem with strict Referer checking is that the Referer header is suppressed by some browsers and by some networks.  However, Barth et al have shown that for same-domain HTTPS requests the header is missing in as few as 0.05% - 0.22% of cases, and they recommend this method for HTTPS connections.  Since HTTPS connections cannot be tampered with (apart from in some rare internal-network-with-proxy situations), suppression of the Referer header can only be done by the browser, so if a user is having problems due to use of this method, they can simply be instructed to configure their browser differently or use a different browser.
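
A strict check of this kind might look like the following sketch. The helper name `referer_ok` is hypothetical, and requiring the `https` scheme in the Referer is an assumption of this example (a page fetched over plain HTTP could have been altered by a network attacker), not something taken from an existing implementation.

```python
from urllib.parse import urlsplit


def referer_ok(referer, host):
    """Return True if the Referer header names an https:// URL on host."""
    if not referer:
        # Strict checking: a missing header is a failure, not a pass.
        return False
    parts = urlsplit(referer)
    # Compare the whole network location, not a prefix or suffix of it.
    return parts.scheme == "https" and parts.netloc == host


same_site = referer_ok("https://victim.example.com/form/", "victim.example.com")
suppressed = referer_ok(None, "victim.example.com")
sub_domain = referer_ok("https://attacker.example.com/", "victim.example.com")
lookalike = referer_ok("https://victim.example.com.evil.test/", "victim.example.com")
```

Rejecting a missing header is what makes the check strict, and it is also the source of the false positives for users whose browser suppresses the header.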

== Proposal ==

(by Luke Plant)

CSRF protection should be done by the following method:

 * Session independent nonce 
   * with backwards compatibility for the Django 1.0 token to avoid upgrade bumps
 * Additionally, strict Referer header checking for HTTPS only
 * Template tag for inserting the CSRF token
   * with a backwards compatible !CsrfResponseMiddleware which can be used at the same time as the template tag, to allow people to upgrade without upgrading all their apps.

Compared to Django 1.1:
 * there is no dependence on the session framework, expanding the usefulness of this protection
 * the false positives caused by session cycling are eliminated
 * the vulnerability caused by automatically including the CSRF token in all POST forms (including external targets) can be avoided much more easily, and you are much more secure by default.
 * some cross sub-domain vulnerabilities are opened up - '''regression'''.  This is deemed an acceptable compromise, since there were already cross sub-domain vulnerabilities, and giving sub-domains to untrusted parties seems to be increasingly uncommon as most people buy their own domains.  It also isn't possible to fix cross sub-domain session fixing without changes to browsers, so it is safer simply to say that sub-domains should only be given to trusted parties.
 * for HTTPS connections, adding strict Referer checking closes the other vulnerabilities opened up by the change from 'HMAC of session identifier' to 'session independent nonce' (i.e.  CSRF + MITM under HTTPS)
 * there are required upgrade steps for contrib apps (including the admin) to continue working.  Given that ticket #510 is a security bug that should not be considered closed unless CSRF protection is enabled by default, and that the CSRF protection provided in Django 1.1 is considered inadequate (due to its performance and security problems), I think this is acceptable.

This proposal is implemented in the lp-csrf_rework branch in http://bitbucket.org/spookylukey/django-trunk-lukeplant/ (with patches regularly copied to #9977)

The docs for this branch, which contain upgrade information, are here: http://bitbucket.org/spookylukey/django-trunk-lukeplant/src/tip/docs/ref/contrib/csrf.txt

== Further work ==

 * Could examine use of Origin header for CSRF protection, in addition to this.  
   * Its usefulness will really depend on whether browsers implement it, and how quickly - we won't be able to rely on it for a long time.
   * Note that if we simply compare Origin and Host, we are still vulnerable to DNS rebinding attacks. 
 * General protection against DNS rebinding attacks.  This would require a setting that listed allowable values of Host.  This could be a separate middleware.
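
The two points about DNS rebinding can be sketched together. This is a minimal illustration, not a proposed implementation: the `PERMITTED_HOSTS` setting and both helper names are hypothetical, and port stripping is simplified (IPv6 literals are not handled).

```python
from urllib.parse import urlsplit

PERMITTED_HOSTS = {"www.example.com", "example.com"}  # hypothetical setting


def origin_matches_host(origin, host):
    """Naive check: the Origin header names the same host as the Host header."""
    return urlsplit(origin).netloc == host


def host_allowed(host_header):
    """Return True if the Host header (port stripped) is in the allow-list."""
    if not host_header:
        return False
    hostname = host_header.split(":", 1)[0]
    return hostname.lower() in PERMITTED_HOSTS


# Under DNS rebinding the attacker's own domain resolves to the victim server,
# so the browser sends the attacker's name in both Origin and Host.
rebinding_passes_origin_check = origin_matches_host(
    "http://attacker.example.net", "attacker.example.net"
)
rebinding_passes_allow_list = host_allowed("attacker.example.net")
```

Comparing Origin with Host only checks that the two headers agree with each other, whereas the allow-list checks them against names the site actually serves.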