Changes between Initial Version and Version 1 of AutoEscaping


Timestamp: Jun 20, 2006, 1:43:51 AM
Author: Simon Willison
Comment: Initial proposal

== AutoEscaping proposal ==

[http://en.wikipedia.org/wiki/Cross-site_scripting XSS vulnerabilities] are the most common form of security hole in web applications by an order of magnitude. In Django, they are avoided using the `escape` template filter, but it is easy to forget to use it, and a single omission makes an application vulnerable.

It is proposed that Django auto-escapes ALL output from template variable tags, unless explicitly told not to. This is a controversial change: it breaks backwards compatibility (and hence MUST be decided before version 1.0) and appears to be at odds with the "explicit is better than implicit" rule from the Zen of Python. Nevertheless, the security benefits are enormous, and many of the cons can be mitigated with careful design.

Here is a proposed design, based on extensive discussion on the mailing lists.

=== Auto escaping ===

Consider a variable `name` taken from the URL query string, as in the following request:

{{{
/path/?name=<script>alert('XSS');</script>
}}}

At the moment, `{{ name }}` outputs the following:

{{{
<script>alert('XSS');</script>
}}}

With auto escaping, this will be output as:

{{{
&lt;script&gt;alert(&#39;XSS&#39;);&lt;/script&gt;
}}}
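
The transformation above can be sketched in a few lines. This is only an illustration: `escape_html` is a hypothetical helper, not Django's own `escape` filter, and it emits `&#39;` for the single quote (an assumption; `&apos;` is not defined in HTML 4).

```python
def escape_html(text):
    """Replace the five characters that are significant in HTML."""
    # The ampersand must be handled first; otherwise the entities
    # produced by the later replacements would be escaped again.
    return (
        text.replace("&", "&amp;")
        .replace("<", "&lt;")
        .replace(">", "&gt;")
        .replace('"', "&quot;")
        .replace("'", "&#39;")
    )


payload = "<script>alert('XSS');</script>"
print(escape_html(payload))
```

Note that escaping text that is already escaped changes it again (`&lt;` becomes `&amp;lt;`), which is the double escaping problem discussed later in this proposal.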

But what if you ''want'' to output the unescaped string, for example when generating a plain text email? A block-level tag is proposed to deal with this scenario.

{{{
{% autoescape off %}
{{ body }}
{% endautoescape %}
}}}

You will also be able to set a flag on the context, as explained below.

=== Escaped vs. non-escaped strings ===

A major risk with auto escaping is that things will end up being double escaped. What if the user is already using a filter somewhere along the line that causes HTML to be escaped? The solution is to introduce two types of string: escaped and non-escaped.

Consider the following:

{{{
# Marker class: it adds no behaviour, only a type to test against.
class escaped:
    pass

class escapedstr(str, escaped):
    pass

class escapedunicode(unicode, escaped):
    pass

def markescaped(s):
    # Already marked: return it unchanged, so marking is idempotent.
    if isinstance(s, escaped):
        return s
    if isinstance(s, str):
        return escapedstr(s)
    if isinstance(s, unicode):
        return escapedunicode(s)
    raise ValueError("'s' must be str or unicode")
}}}

(This is one of the few examples where multiple inheritance could be useful in Django.)

`escapedstr` and `escapedunicode` are subclasses of Python's built-in `str` and `unicode` types that are marked as being escaped. Other than the fact that they pass the `isinstance(s, escaped)` test, they are indistinguishable from regular strings. They have no special methods of their own.
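
The behaviour of the marker types can be checked with a short sketch. It is a Python 3 adaptation of the classes above (an assumption made so the example runs on a current interpreter, where `str` is already Unicode and `escapedunicode` is not needed); the names otherwise follow the proposal.

```python
class escaped:
    """Marker base class: carries no behaviour, only a type to test."""


class escapedstr(str, escaped):
    """A str that is flagged as already escaped."""


def markescaped(s):
    if isinstance(s, escaped):
        return s  # already marked, return the same object
    if isinstance(s, str):
        return escapedstr(s)
    raise ValueError("'s' must be str")


safe = markescaped("&lt;b&gt;")

print(isinstance(safe, str))             # still an ordinary string
print(safe == "&lt;b&gt;")               # compares equal to the plain value
print(markescaped(safe) is safe)         # marking twice is a no-op
print(isinstance(safe + "!", escaped))   # derived strings lose the marker
```

One consequence worth noting: because the subclasses define no methods of their own, string operations such as concatenation or `upper()` return plain strings. A value built from a marked string is therefore treated as unescaped again, which errs on the side of safety.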

This allows us to use them to mark strings that have already been escaped. The auto escape mechanism can then use this marker to decide whether something should be escaped. This has a number of uses. Firstly, filters that convert a value into HTML (such as `urlize` and `markdown`) can flag their output as already escaped (maybe 'escaped' is the wrong term; 'safe' might be better) so that the auto escape mechanism knows not to escape it again. Secondly, model fields that are known to contain safe HTML can likewise be marked. Thirdly, the existing `escape` filter can mark its output in the same way, which avoids double escaping and preserves backwards compatibility for templates written before the introduction of auto escaping.
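
A minimal sketch of how filters and the auto escape step could cooperate through the marker. Everything here is illustrative: `escape_filter`, `bold_filter` and `auto_escape` are hypothetical names, the code is Python 3, and the standard library's `html.escape` stands in for Django's `escape` function.

```python
from html import escape as escape_html


class escaped:
    """Marker base class for strings that need no further escaping."""


class escapedstr(str, escaped):
    pass


def escape_filter(value):
    """Stand-in for the existing `escape` filter: escape, then mark."""
    return escapedstr(escape_html(value))


def bold_filter(value):
    """Hypothetical HTML-producing filter.

    It escapes its input, wraps it in markup, and marks the result so
    that the tags it added are not escaped afterwards.
    """
    return escapedstr("<b>%s</b>" % escape_html(value))


def auto_escape(value):
    """What the template engine would apply to a variable's final value."""
    if isinstance(value, escaped):
        return str(value)  # already escaped or known-safe: leave alone
    return escape_html(value)


print(auto_escape("<i>"))                  # unmarked value is escaped
print(auto_escape(escape_filter("<i>")))   # escaped once, not twice
print(auto_escape(bold_filter("a < b")))   # markup kept, content escaped
```

The second call shows the backwards compatibility argument: a template that already pipes a value through `escape` produces the same output with auto escaping turned on.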

=== Implementation ===

I propose adding a new property to the `Context` class, called `autoescape`. It defaults to `True`, but can be toggled either in view functions or by `{% autoescape off %}` blocks in templates. The `VariableNode` `render()` method then uses this context flag to decide whether escaping should be performed:

{{{
    def render(self, context):
        output = self.filter_expression.resolve(context)
        encoded = self.encode_output(output)
        # Test the resolved value rather than the encoded copy: encoding
        # may return a plain string that no longer carries the marker.
        if context.autoescape and not isinstance(output, escaped):
            return escape(encoded)
        else:
            return encoded
}}}

And here's the implementation of the `{% autoescape on/off %}` template tag:

{{{
from django.template import Library, Node, TemplateSyntaxError

register = Library()

class AutoEscapeNode(Node):
    def __init__(self, setting, nodelist):
        self.setting, self.nodelist = setting, nodelist

    def render(self, context):
        old_setting = context.autoescape
        context.autoescape = self.setting
        try:
            output = self.nodelist.render(context)
        finally:
            # Restore the previous setting even if rendering fails.
            context.autoescape = old_setting
        return output

#@register.tag(name="autoescape")
def do_autoescape(parser, token):
    """
    Set autoescape behaviour for this block. Possible values are 'on' and 'off'.
    """
    bits = token.contents.split()
    # Exactly one argument is required, and it must be 'on' or 'off'.
    if len(bits) != 2 or bits[1] not in ('on', 'off'):
        raise TemplateSyntaxError("autoescape argument must be 'on' or 'off'.")
    setting = (bits[1] == 'on')
    nodelist = parser.parse(('endautoescape',))
    parser.delete_first_token()
    return AutoEscapeNode(setting, nodelist)
do_autoescape = register.tag("autoescape", do_autoescape)
}}}
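
The save-and-restore behaviour of the tag, including nested blocks, can be tried without Django using small stand-in classes. This is a sketch under stated assumptions: `Context`, `VariableNode` and `AutoEscapeBlock` below are simplified, hypothetical stand-ins written in Python 3, not Django's classes, and `html.escape` replaces Django's `escape`.

```python
from html import escape as escape_html


class Context:
    """Stand-in context: variable values plus the proposed flag."""

    def __init__(self, values, autoescape=True):
        self.values = values
        self.autoescape = autoescape


class VariableNode:
    """Renders one variable, escaping it only while the flag is set."""

    def __init__(self, name):
        self.name = name

    def render(self, context):
        value = context.values[self.name]
        return escape_html(value) if context.autoescape else value


class AutoEscapeBlock:
    """Sets the flag for its children and restores it afterwards."""

    def __init__(self, setting, children):
        self.setting, self.children = setting, children

    def render(self, context):
        old_setting = context.autoescape
        context.autoescape = self.setting
        try:
            return "".join(child.render(context) for child in self.children)
        finally:
            context.autoescape = old_setting


# on (default) | off, with a nested "on" block in the middle | on again
tree = [
    VariableNode("x"),
    AutoEscapeBlock(False, [
        VariableNode("x"),
        AutoEscapeBlock(True, [VariableNode("x")]),
        VariableNode("x"),
    ]),
    VariableNode("x"),
]
ctx = Context({"x": "<b>"})
out = "|".join(node.render(ctx) for node in tree)
print(out)
```

Because each block remembers the setting it found and puts it back on exit, blocks can be nested in either direction, and the flag returns to its original value once the outermost block has rendered.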

=== Additional work ===

More work is needed to implement this change (probably in a branch), including the following:

 * Modify existing Django filters to flag strings as escaped where necessary
 * Modify built-in Django templates (including error pages) to take this into account
 * Lots of tests!
 * Extensive documentation

=== Prior discussion ===

 * [http://groups.google.com/group/django-developers/browse_thread/thread/17d1dfecd67864ab/2d177ac262232b73 Proposal: default escaping] on django-developers
 * [http://groups.google.com/group/django-developers/browse_thread/thread/e448bbdd40426915/70c34ce7cc96e283 templates and html escaping] on django-developers
 * [http://groups.google.com/group/django-developers/browse_thread/thread/e448bbdd40426915/70c34ce7cc96e283 Global Escape] on django-users