Opened 14 years ago
Closed 12 years ago
#14057 closed New feature (wontfix)
Expose an interface for custom-escaping template content
Reported by: | Stephen Kelly | Owned by: | nobody |
---|---|---|---|
Component: | Template system | Version: | |
Severity: | Normal | Keywords: | |
Cc: | net147@… | Triage Stage: | Design decision needed |
Has patch: | yes | Needs documentation: | yes |
Needs tests: | yes | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
Django by default escapes html entities in Context content injected into a Template. For non-html content it is possible to implement a filter to implement custom escaping.
Example:
http://groups.google.com/group/django-developers/browse_thread/thread/2d3f095ed31ffc6c
However, in that case, escaping must be explicit. There is no auto-escaping, because auto-escaping assumes html content.
It is possible to create a separate object to perform escaping, which can be overridden for custom escaping. The specific escaping needs to be accessible from _render_value_in_context, currently a free function associated with the Node class, and from filters, some of which need to escape or conditionalEscape their input (linenumbers and linebreaks and linebreaksbr filters).
The solution I used in a C++ implementation was a OutputStream class with a virtual method for escaping:
http://gitorious.org/grantlee/grantlee/blobs/master/corelib/outputstream.h
It allows creation of a stream which does no escaping of content, or one that does javascript or latex (auto)escaping
The disadvantage is that the stream must be passed through all methods in the API, sometimes cloned. In python/django though it may make more sense to register a function in the SETTINGS which should be used for escaping, so that custom autoescaping can be possible.
Attachments (3)
Change History (17)
comment:1 by , 14 years ago
Component: | Uncategorized → Template system |
---|
comment:2 by , 14 years ago
Resolution: | → wontfix |
---|---|
Status: | new → closed |
comment:3 by , 14 years ago
Could a callable be passed to Context() in the autoescape parameter?
The following would then define a custom autoescape function:
my_context = Context(autoescape=my_escape_function)
by , 14 years ago
Attachment: | 14057-context-parameter.diff added |
---|
Allowing a callable to be passed to Context, providing custom autoescape
comment:4 by , 14 years ago
Has patch: | set |
---|---|
Needs documentation: | set |
Needs tests: | set |
Resolution: | wontfix |
Status: | closed → reopened |
Version: | 1.2 |
I remember having a discussion with the same solution (Context autoescape callable) and I think that's elegant. The limitation when rendering in regards to the tags/filters which can be used is acceptable, since this the escape function is changed very explicitly (and obviously is being changed for a specific purpose).
comment:5 by , 14 years ago
Here's a quick survey of the relationship between default filters and autoescape.
DIFFICULT FILTERS (4+8)
These would be difficult to make work with custom escape, because the
escaping needs to be handled by the filter itself:
- cut (marks safe when change won't create new HTML tags)
- force_escape (unlike escape, it won't use the custom escape function)
- join (escapes each joined item individually)
- linenumbers (but autoescape argument, allowing it to be turned off)
These have the same problem, but they would probably only be used in
HTML documents.
- unordered_list
- linebreaks
- linebreaksbr
- urlize (but takes autoescape argument)
- urlizetrunc (but takes autoescape argument)
- markup.textile
- markup.markdown
- markup.restructuredtext
FILTERS THAT DON'T TAINT/UNSAFE DATA (24)
These could be potentially dangerous, when they alter the "safe" data
in a way that would make it unsafe for a given context (like Luke's
example). Many however have a clear functionality, but the problem
might be so subtle that the developer wouldn't think of it. I imagine
that they don't taint the output because someone has carefully
analysed their behaviour in the HTML context.
- addslashes
- capfirst
- iri_to_uri
- lower
- stringformat
- title
- truncatewords
- truncatewords_html
- wordwrap
- ljust
- rjust
- center
- removetags
- striptags
- last (oddly, the "first" filter does taint/unsafe the data, but this doesn't)
- random (ditto, the "first" filter does taint/unsafe the data, but this doesn't)
- slice
- filesizeformat
- phone2numeric
- pprint
- fix_ampersands
- humanize: ordinal
- humanize: intcomma
- humanize: apnumber
FILTERS UNNECESSARILY MARKED SAFE (2)
These filters are marked as safe although there is apparently no need
to do so. It does however save a bit of processing. Removing
mark_safe() shouldn't have an effect on their operation, other than
being less efficient.
- slugify (output can't contain &, <, >, " or ' anyway)
- floatformat (marks format output as safe, none of the default formats would be affected by escaping (I checked). This would however prevent developers from writing formats with html markup, ie
DECIMAL_SEPARATOR = '<span class="decimal">.</span>'
, which might be very useful for some)
SAFE FILTERS (23)
The following filters should be ok, as they don't prevent the custom escape from being run
- escape
- urlencode
- wordcount
- escapejs
- safe (well it does prevent custom escape, but that's the idea)
- safeseq
- dictsort
- dictsortreversed
- first
- length_is
- add
- get_digit
- date
- time
- timesince
- timeuntil
- default
- default_if_none
- divisibleby
- yesno
- pluralize
- humanize: intword
- humanize: naturalday
comment:6 by , 14 years ago
Triage Stage: | Unreviewed → Design decision needed |
---|
For completeness, here is the example Will referred to.
Suppose I am outputting RTF text. I have an escape function like this:
def rtfescape(value): return value.replace('\\', '\\\\') .replace('{','\\{') .replace('}', '\\}')
i.e. backslashes and curly braces need escaping.
Suppose I have an input string "Joe's string". I should be able to mark this as safe - it doesn't need escaping either for HTML or RTF. Or leave it unmarked - it should make no difference.
Suppose also I have a template that uses 'addslashes' (for the sake of argument, because I'm writing a RTF document that shows what the 'addslashes' filter does). That filter is marked 'is_safe', because it only introduces slashes, which are not dangerous characters for HTML. But they are dangerous for RTF.
Logically I would expect the following 3 to produce the same output:
1) I use mark_safe on my safe input string and use addslashes to add the slashes
Template("{{ val|addslashes }}").render( Context({'val': mark_safe("Joe's string")}), autoescape=rtfescape) )
2) I don't use mark_safe on my safe input string and use addslashes to add the slashes
Template("{{ val|addslashes }}").render( Context({'val': "Joe's string"}), autoescape=rtfescape) )
3) I manually 'apply' addslashes.
Template("{{ val }}").render( Context({'val': "Joe\\'s string"}), autoescape=rtfescape) )
But these do not produce the same output - 1 is different from 2 and 3, and is not what I intend.
comment:7 by , 14 years ago
As mentioned before, I can think of three classes of filters:
1. Filters that don't prevent escaping
The first patch seems to cover this group.
2. Filters that explicitly mark the output as safe
These filters often take an autoescape
parameter (.needs_autoescape
), which means they can use the custom function that is passed in. The filters that don't currently do this, can simply add it.
3. Filters that don't taint output (myfilter.is_safe = True
)
I've written a second patch that would allow these filters to operate safely.
It does this by defining an "autoescape context" (another name can be chosen to avoid confusion with template contexts). By default, the autoescape context is "html". Custom escape functions can define a different context (eg "latex") and custom filters can do the same. Custom escape functions can also define a whitelist of filters that are ok for the given context.
For example:
def rtfescape(value): return value.replace('\\', '\\\\') .replace('{','\\{') .replace('}', '\\}') rtfescape.autoescape_context = "rtf" rtfescape.safe_filters = set(['lower'])
All new filters that are designed to be safe in the RTF context can be written as follows:
@stringfilter def myupper(value): return value.upper() upper.autoescape_context = "rtf"
Using a filter from a different context on safe data will cause it to be escaped, even though the filter may operate safely. There are three possible solutions, the choice of which would depend on what you have access to change (the escape function, the filter or the template code). If you can change the escape function, you can add the filter to
safe_filters
. If you have access to the filter, you can define the appropriate context using.autoescape_context
. If you (only) have access to the template, you can add the|safe
filter to your variables.
Two imperfections I can see with this approach are:
- that the filter's name doesn't actually correspond to a specific known filter: default filters can be overridden by custom filters.
- the name "autoescape context" could be better, as "context" is already used in templates
by , 14 years ago
by , 14 years ago
Attachment: | 14057-context-parameter-2.diff added |
---|
Custom autoescape (all default filters should now be safe)
comment:8 by , 14 years ago
Cc: | added |
---|
comment:9 by , 13 years ago
Severity: | → Normal |
---|---|
Type: | → New feature |
comment:12 by , 12 years ago
Status: | reopened → new |
---|
comment:13 by , 12 years ago
I feel that this is going to add a significant amount of code to Django that won't be used very much in practice.
Since escaping is critical to prevent XSS attacks, I'd really prefer to keep in as simple as possible.
I appreciate the amount of effort that went into the patches, but as I read this ticket for the first time today, I think it should be closed as wontfix.
comment:14 by , 12 years ago
Resolution: | → wontfix |
---|---|
Status: | new → closed |
I agree with Aymeric. I also think that the escaping system should stay as easy as possible, especially since we want it to be secure.
I have had the same need myself, but a setting for this would not be appropriate because you might want HTML escaping for most templates, and some other kind of autoescaping for other templates. A better interface would be to specify the autoescape function/class using a set of template tags, or using a keyword argument to Template.
However, I think implementation is going to be extremely problematic. To get autoescaping right, we have things like SafeString and custom attributes on template filters like 'is_safe' - see http://docs.djangoproject.com/en/1.2/howto/custom-template-tags/#filters-and-auto-escaping . And we would have to pass the escaping object/function through all this lot without breaking backwards compatibility somehow — using a global is not a possibility for the reasons described above.
At the moment, I cannot see my way to extending this approach to be able to cope with other types of escaping elegantly. Since HTML autoescaping serves a very important security function — preventing XSS — I think it is OK to leave it as being an HTML only feature. Of course, there might be other types of output that would benefit in the same way, but they are not as important for a web framework.
If, however, you can come up with a convincing implementation that really works, we'd definitely be ready to re-consider. So, I'm closing WONTIFX, meaning that we don't have any intention to fix this, solely on the basis that I think it is Too Hard to do nicely, and adds too little value for a web framework, though I'm willing to be proved wrong by a beautiful patch :-)