Opened 5 years ago

Closed 2 years ago

#14057 closed New feature (wontfix)

Expose an interface for custom-escaping template content

Reported by: steveire Owned by: nobody
Component: Template system Version:
Severity: Normal Keywords:
Cc: net147@… Triage Stage: Design decision needed
Has patch: yes Needs documentation: yes
Needs tests: yes Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Django by default escapes html entities in Context content injected into a Template. For non-html content it is possible to implement a filter to implement custom escaping.

Example:

http://groups.google.com/group/django-developers/browse_thread/thread/2d3f095ed31ffc6c

However, in that case, escaping must be explicit. There is no auto-escaping, because auto-escaping assumes html content.

It is possible to create a separate object to perform escaping, which can be overridden for custom escaping. The specific escaping needs to be accessible from _render_value_in_context, currently a free function associated with the Node class, and from filters, some of which need to escape or conditionalEscape their input (linenumbers and linebreaks and linebreaksbr filters).

The solution I used in a C++ implementation was a OutputStream class with a virtual method for escaping:

http://gitorious.org/grantlee/grantlee/blobs/master/corelib/outputstream.h

It allows creation of a stream which does no escaping of content, or one that does javascript or latex (auto)escaping

The disadvantage is that the stream must be passed through all methods in the API, sometimes cloned. In python/django though it may make more sense to register a function in the SETTINGS which should be used for escaping, so that custom autoescaping can be possible.

Attachments (3)

14057-context-parameter.diff (611 bytes) - added by willhardy 5 years ago.
Allowing a callable to be passed to Context, providing custom autoescape
tests.py (7.1 KB) - added by willhardy 5 years ago.
14057-context-parameter-2.diff (11.5 KB) - added by willhardy 5 years ago.
Custom autoescape (all default filters should now be safe)

Download all attachments as: .zip

Change History (17)

comment:1 Changed 5 years ago by thejaswi_puthraya

  • Component changed from Uncategorized to Template system
  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset

comment:2 Changed 5 years ago by lukeplant

  • Resolution set to wontfix
  • Status changed from new to closed

I have had the same need myself, but a setting for this would not be appropriate because you might want HTML escaping for most templates, and some other kind of autoescaping for other templates. A better interface would be to specify the autoescape function/class using a set of template tags, or using a keyword argument to Template.

However, I think implementation is going to be extremely problematic. To get autoescaping right, we have things like SafeString and custom attributes on template filters like 'is_safe' - see http://docs.djangoproject.com/en/1.2/howto/custom-template-tags/#filters-and-auto-escaping . And we would have to pass the escaping object/function through all this lot without breaking backwards compatibility somehow — using a global is not a possibility for the reasons described above.

At the moment, I cannot see my way to extending this approach to be able to cope with other types of escaping elegantly. Since HTML autoescaping serves a very important security function — preventing XSS — I think it is OK to leave it as being an HTML only feature. Of course, there might be other types of output that would benefit in the same way, but they are not as important for a web framework.

If, however, you can come up with a convincing implementation that really works, we'd definitely be ready to re-consider. So, I'm closing WONTIFX, meaning that we don't have any intention to fix this, solely on the basis that I think it is Too Hard to do nicely, and adds too little value for a web framework, though I'm willing to be proved wrong by a beautiful patch :-)

comment:3 Changed 5 years ago by willhardy

Could a callable be passed to Context() in the autoescape parameter?

The following would then define a custom autoescape function:

my_context = Context(autoescape=my_escape_function)

Changed 5 years ago by willhardy

Allowing a callable to be passed to Context, providing custom autoescape

comment:4 Changed 5 years ago by SmileyChris

  • Has patch set
  • Needs documentation set
  • Needs tests set
  • Resolution wontfix deleted
  • Status changed from closed to reopened
  • Version 1.2 deleted

I remember having a discussion with the same solution (Context autoescape callable) and I think that's elegant. The limitation when rendering in regards to the tags/filters which can be used is acceptable, since this the escape function is changed very explicitly (and obviously is being changed for a specific purpose).

comment:5 Changed 5 years ago by willhardy

Here's a quick survey of the relationship between default filters and autoescape.

DIFFICULT FILTERS (4+8)

These would be difficult to make work with custom escape, because the
escaping needs to be handled by the filter itself:

  • cut (marks safe when change won't create new HTML tags)
  • force_escape (unlike escape, it won't use the custom escape function)
  • join (escapes each joined item individually)
  • linenumbers (but autoescape argument, allowing it to be turned off)

These have the same problem, but they would probably only be used in
HTML documents.

  • unordered_list
  • linebreaks
  • linebreaksbr
  • urlize (but takes autoescape argument)
  • urlizetrunc (but takes autoescape argument)
  • markup.textile
  • markup.markdown
  • markup.restructuredtext

FILTERS THAT DON'T TAINT/UNSAFE DATA (24)

These could be potentially dangerous, when they alter the "safe" data
in a way that would make it unsafe for a given context (like Luke's
example). Many however have a clear functionality, but the problem
might be so subtle that the developer wouldn't think of it. I imagine
that they don't taint the output because someone has carefully
analysed their behaviour in the HTML context.

  • addslashes
  • capfirst
  • iri_to_uri
  • lower
  • stringformat
  • title
  • truncatewords
  • truncatewords_html
  • wordwrap
  • ljust
  • rjust
  • center
  • removetags
  • striptags
  • last (oddly, the "first" filter does taint/unsafe the data, but this doesn't)
  • random (ditto, the "first" filter does taint/unsafe the data, but this doesn't)
  • slice
  • filesizeformat
  • phone2numeric
  • pprint
  • fix_ampersands
  • humanize: ordinal
  • humanize: intcomma
  • humanize: apnumber

FILTERS UNNECESSARILY MARKED SAFE (2)

These filters are marked as safe although there is apparently no need
to do so. It does however save a bit of processing. Removing
mark_safe() shouldn't have an effect on their operation, other than
being less efficient.

  • slugify (output can't contain &, <, >, " or ' anyway)
  • floatformat (marks format output as safe, none of the default formats would be affected by escaping (I checked). This would however prevent developers from writing formats with html markup, ie DECIMAL_SEPARATOR = '<span class="decimal">.</span>', which might be very useful for some)

SAFE FILTERS (23)

The following filters should be ok, as they don't prevent the custom escape from being run

  • escape
  • urlencode
  • wordcount
  • escapejs
  • safe (well it does prevent custom escape, but that's the idea)
  • safeseq
  • dictsort
  • dictsortreversed
  • first
  • length_is
  • add
  • get_digit
  • date
  • time
  • timesince
  • timeuntil
  • default
  • default_if_none
  • divisibleby
  • yesno
  • pluralize
  • humanize: intword
  • humanize: naturalday

comment:6 Changed 5 years ago by lukeplant

  • Triage Stage changed from Unreviewed to Design decision needed

For completeness, here is the example Will referred to.

Suppose I am outputting RTF text. I have an escape function like this:

def rtfescape(value):
    return value.replace('\\', '\\\\')
                .replace('{','\\{')
                .replace('}', '\\}')

i.e. backslashes and curly braces need escaping.

Suppose I have an input string "Joe's string". I should be able to mark this as safe - it doesn't need escaping either for HTML or RTF. Or leave it unmarked - it should make no difference.

Suppose also I have a template that uses 'addslashes' (for the sake of argument, because I'm writing a RTF document that shows what the 'addslashes' filter does). That filter is marked 'is_safe', because it only introduces slashes, which are not dangerous characters for HTML. But they are dangerous for RTF.

Logically I would expect the following 3 to produce the same output:

1) I use mark_safe on my safe input string and use addslashes to add the slashes

Template("{{ val|addslashes }}").render(
   Context({'val': mark_safe("Joe's string")}), 
   autoescape=rtfescape)
)

2) I don't use mark_safe on my safe input string and use addslashes to add the slashes

Template("{{ val|addslashes }}").render(
   Context({'val': "Joe's string"}), 
   autoescape=rtfescape)
)

3) I manually 'apply' addslashes.

Template("{{ val }}").render(
   Context({'val': "Joe\\'s string"}), 
   autoescape=rtfescape)
)

But these do not produce the same output - 1 is different from 2 and 3, and is not what I intend.

comment:7 Changed 5 years ago by willhardy

As mentioned before, I can think of three classes of filters:

1. Filters that don't prevent escaping

The first patch seems to cover this group.

2. Filters that explicitly mark the output as safe

These filters often take an autoescape parameter (.needs_autoescape), which means they can use the custom function that is passed in. The filters that don't currently do this, can simply add it.

3. Filters that don't taint output (myfilter.is_safe = True)

I've written a second patch that would allow these filters to operate safely.

It does this by defining an "autoescape context" (another name can be chosen to avoid confusion with template contexts). By default, the autoescape context is "html". Custom escape functions can define a different context (eg "latex") and custom filters can do the same. Custom escape functions can also define a whitelist of filters that are ok for the given context.

For example:

def rtfescape(value):
    return value.replace('\\', '\\\\')
                .replace('{','\\{')
                .replace('}', '\\}')
rtfescape.autoescape_context = "rtf"
rtfescape.safe_filters = set(['lower'])

All new filters that are designed to be safe in the RTF context can be written as follows:

@stringfilter
def myupper(value):
    return value.upper()
upper.autoescape_context = "rtf"

Using a filter from a different context on safe data will cause it to be escaped, even though the filter may operate safely. There are three possible solutions, the choice of which would depend on what you have access to change (the escape function, the filter or the template code). If you can change the escape function, you can add the filter to safe_filters. If you have access to the filter, you can define the appropriate context using .autoescape_context. If you (only) have access to the template, you can add the |safe filter to your variables.

Two imperfections I can see with this approach are:

  • that the filter's name doesn't actually correspond to a specific known filter: default filters can be overridden by custom filters.
  • the name "autoescape context" could be better, as "context" is already used in templates

Changed 5 years ago by willhardy

Changed 5 years ago by willhardy

Custom autoescape (all default filters should now be safe)

comment:8 Changed 5 years ago by net147

  • Cc net147@… added

comment:9 Changed 4 years ago by julien

  • Severity set to Normal
  • Type set to New feature

comment:10 Changed 3 years ago by aaugustin

  • UI/UX unset

Change UI/UX from NULL to False.

comment:11 Changed 3 years ago by aaugustin

  • Easy pickings unset

Change Easy pickings from NULL to False.

comment:12 Changed 2 years ago by aaugustin

  • Status changed from reopened to new

comment:13 Changed 2 years ago by aaugustin

I feel that this is going to add a significant amount of code to Django that won't be used very much in practice.

Since escaping is critical to prevent XSS attacks, I'd really prefer to keep in as simple as possible.

I appreciate the amount of effort that went into the patches, but as I read this ticket for the first time today, I think it should be closed as wontfix.

comment:14 Changed 2 years ago by apollo13

  • Resolution set to wontfix
  • Status changed from new to closed

I agree with Aymeric. I also think that the escaping system should stay as easy as possible, especially since we want it to be secure.

Note: See TracTickets for help on using tickets.
Back to Top