See also: Autoescape alternative
AutoEscaping proposal
XSS vulnerabilities are the most common form of security hole in web applications by an order of magnitude. In Django, they are avoided using the escape template filter, but it is easy to forget to use this, and just one mistake makes an application vulnerable.
It is proposed that Django auto-escapes ALL output from template variable tags unless explicitly told not to. This is a controversial change: it breaks backwards compatibility (and hence MUST be decided before version 1.0) and appears to be at odds with the "explicit is better than implicit" rule from the Zen of Python. Nevertheless, the security benefits are enormous, and many of the cons can be mitigated with careful design.
Here is a proposed design, based on extensive discussion on the mailing lists.
Auto escaping
Consider a variable name passed in through the URL, as in the following request:
/path/?name=<script>alert('XSS');</script>
At the moment, {{ name }} outputs the following:
<script>alert('XSS');</script>
With auto escaping, this will be output as:
&lt;script&gt;alert(&#39;XSS&#39;);&lt;/script&gt;
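The escaping step itself can be sketched in a few lines. This is a minimal illustration in modern Python rather than Django's actual code: the helper name `escape_html` is hypothetical, and the set of five replaced characters is an assumption about what the escape filter covers.

```python
def escape_html(text):
    """Replace characters that are significant in HTML with entities.

    Hypothetical helper for illustration. '&' is replaced first so that the
    entities produced by the later replacements are not escaped again.
    """
    return (
        text.replace("&", "&amp;")
        .replace("<", "&lt;")
        .replace(">", "&gt;")
        .replace('"', "&quot;")
        .replace("'", "&#39;")
    )


name = "<script>alert('XSS');</script>"
print(escape_html(name))
# prints: &lt;script&gt;alert(&#39;XSS&#39;);&lt;/script&gt;
```

Applying the function twice turns `&lt;` into `&amp;lt;`, which is the double-escaping risk that any automatic scheme has to guard against.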
But what if you want to output the unescaped string, for example when generating a plain text email? A block-level tag is proposed to deal with this scenario:
{% autoescape off %} {{ body }} {% endautoescape %}
You will also be able to set a flag on the context, as explained below.
Escaped vs. non-escaped strings
A major risk with auto escaping is that things will end up being double escaped. What if the user were already using a filter somewhere along the line that causes HTML to be escaped? The solution is to introduce two types of string: escaped and non-escaped.
Consider the following:
    class escaped:
        pass

    class escapedstr(str, escaped):
        pass

    class escapedunicode(unicode, escaped):
        pass

    def markescaped(s):
        # Already flagged: return unchanged so the marker is never applied twice.
        if isinstance(s, escaped):
            return s
        if isinstance(s, str):
            return escapedstr(s)
        if isinstance(s, unicode):
            return escapedunicode(s)
        raise ValueError("'s' must be str or unicode")
(This is one of the few examples where multiple inheritance could be useful in Django).
escapedstr and escapedunicode are subclasses of Python's built-in str and unicode types that are marked as being escaped. Other than the fact that they pass the isinstance(s, escaped) test, they are indistinguishable from regular strings. They have no special methods of their own.
This allows us to use them to mark strings that have already been escaped. The auto escape mechanism can then use this marker to decide if something should be escaped or not. This has a number of uses. Firstly, filters that convert a value into HTML (such as urlize and markdown) can flag it as already being escaped (maybe escape is the wrong term; 'safe' might be better) so that the auto escape mechanism knows not to escape the output. Secondly, model fields that are known to contain safe HTML can likewise be marked. Thirdly, the existing 'escape' filter can use this, preserving backwards compatibility for templates written before the introduction of auto escaping.
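As a rough sketch of how the marker prevents double escaping, the following adapts the classes above to modern Python, where a single str type stands in for both str and unicode. The names `escape_html` and `conditional_escape` are hypothetical helpers introduced only for this illustration.

```python
class escaped:
    """Marker base class: instances are treated as already safe for HTML."""


class escapedstr(str, escaped):
    """A str that carries the marker and adds no behaviour of its own."""


def markescaped(s):
    """Return s flagged as already escaped; flagged values pass through."""
    if isinstance(s, escaped):
        return s
    if isinstance(s, str):
        return escapedstr(s)
    raise ValueError("'s' must be a str")


def escape_html(text):
    """Hypothetical escaping helper; '&' is handled first on purpose."""
    return (
        text.replace("&", "&amp;")
        .replace("<", "&lt;")
        .replace(">", "&gt;")
        .replace('"', "&quot;")
        .replace("'", "&#39;")
    )


def conditional_escape(value):
    """Escape value unless it is already flagged, then flag the result."""
    if isinstance(value, escaped):
        return value
    return markescaped(escape_html(value))


once = conditional_escape("<b>bold</b>")
twice = conditional_escape(once)  # second pass is a no-op
print(twice)

# Ordinary string operations return a plain str, so the marker is lost.
print(isinstance(once.upper(), escaped))  # False
print(isinstance(once + "!", escaped))    # False
```

The last two lines point at a design consequence: because the subclasses define no methods of their own, any filter that transforms a flagged string has to re-flag its result explicitly if the output is still safe.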
Implementation
I propose adding a new property to the Context class, called autoescape. This defaults to being set to True, but can be toggled either in view functions or by {% autoescape off %} blocks in templates. The VariableNode render() method then uses this context flag to decide if escaping should be performed or not:
    def render(self, context):
        output = self.filter_expression.resolve(context)
        encoded = self.encode_output(output)
        # Test the resolved value, since it is the object that carries the marker.
        if context.autoescape and not isinstance(output, escaped):
            return escape(encoded)
        else:
            return encoded
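The decision made in render() can be tried out in isolation with a few stand-ins. In this sketch the `Context` class, the `render_variable` function and the `escape_html` helper are all hypothetical simplifications written in modern Python, not Django's real classes.

```python
class escaped:
    """Marker base class for strings that are already safe."""


class escapedstr(str, escaped):
    pass


def escape_html(text):
    """Hypothetical escaping helper used only for this sketch."""
    return (
        text.replace("&", "&amp;")
        .replace("<", "&lt;")
        .replace(">", "&gt;")
        .replace('"', "&quot;")
        .replace("'", "&#39;")
    )


class Context(dict):
    """Stand-in for the template context, carrying the proposed flag."""

    def __init__(self, data=None, autoescape=True):
        super().__init__(data or {})
        self.autoescape = autoescape


def render_variable(name, context):
    """Mirror the proposed VariableNode.render() logic for one variable."""
    output = context[name]
    if context.autoescape and not isinstance(output, escaped):
        return escape_html(output)
    return output


html_ctx = Context({"body": "Tom & Jerry <3"})
text_ctx = Context({"body": "Tom & Jerry <3"}, autoescape=False)
safe_ctx = Context({"body": escapedstr("<em>hi</em>")})

print(render_variable("body", html_ctx))  # Tom &amp; Jerry &lt;3
print(render_variable("body", text_ctx))  # Tom & Jerry <3
print(render_variable("body", safe_ctx))  # <em>hi</em>
```

The second context corresponds to a view turning the flag off, for instance when rendering a plain text email.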
And here's the implementation of the {% autoescape on/off %} template tag:
    from django.template import Library, Node, TemplateSyntaxError

    register = Library()

    class AutoEscapeNode(Node):
        def __init__(self, setting, nodelist):
            self.setting, self.nodelist = setting, nodelist

        def render(self, context):
            # Remember the enclosing setting so nested blocks restore it.
            old_setting = context.autoescape
            context.autoescape = self.setting
            output = self.nodelist.render(context)
            context.autoescape = old_setting
            return output

    #@register.tag(name="autoescape")
    def do_autoescape(parser, token):
        """
        Set autoescape behaviour for this block. Possible values are 'on' and 'off'.
        """
        _, rest = token.contents.split(None, 1)
        if rest not in ('on', 'off'):
            raise TemplateSyntaxError("autoescape argument must be 'on' or 'off'.")
        setting = (rest == 'on')
        nodelist = parser.parse(('endautoescape',))
        parser.delete_first_token()
        return AutoEscapeNode(setting, nodelist)
    do_autoescape = register.tag("autoescape", do_autoescape)
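To see how nested blocks save and restore the flag, here is a self-contained simulation that runs without Django. Every class in it is a simplified, hypothetical stand-in for the real template machinery, and the try/finally is a variation on the code above (an assumption of this sketch) that restores the flag even if rendering raises.

```python
class Context(dict):
    """Stand-in context with the proposed autoescape flag."""

    def __init__(self, data=None):
        super().__init__(data or {})
        self.autoescape = True  # proposed default


def escape_html(text):
    """Hypothetical escaping helper used only for this sketch."""
    return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")


class VariableNode:
    """Renders one context variable, escaping it when the flag is set."""

    def __init__(self, name):
        self.name = name

    def render(self, context):
        output = context[self.name]
        return escape_html(output) if context.autoescape else output


class NodeList(list):
    """Renders its child nodes in order and joins the results."""

    def render(self, context):
        return "".join(node.render(context) for node in self)


class AutoEscapeNode:
    """Sets the flag for its children, then restores the previous value."""

    def __init__(self, setting, nodelist):
        self.setting, self.nodelist = setting, nodelist

    def render(self, context):
        old_setting = context.autoescape
        context.autoescape = self.setting
        try:
            return self.nodelist.render(context)
        finally:
            context.autoescape = old_setting


# Equivalent of:
# {{ v }}{% autoescape off %}{{ v }}{% autoescape on %}{{ v }}
# {% endautoescape %}{{ v }}{% endautoescape %}{{ v }}
v = VariableNode("v")
inner = AutoEscapeNode(True, NodeList([v]))
outer = AutoEscapeNode(False, NodeList([v, inner, v]))
template = NodeList([v, outer, v])

context = Context({"v": "<"})
print(template.render(context))  # &lt;<&lt;<&lt;
```

Each block only changes the flag for its own children, so the variable after the inner block is unescaped again and the one after the outer block is escaped again.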
Additional work
A bunch more work needs to be done to implement this change (probably in a branch), including the following:
- Modify existing Django filters to flag strings as escaped where necessary
- Modify built-in Django templates (including error pages) to take this into account
- Lots of tests!
- Extensive documentation
Prior discussion
- Proposal: default escaping on django-developers
- templates and html escaping on django-developers
- Global Escape on django-users