Opened 5 years ago

Last modified 7 months ago

#17419 new New feature

Add a JSON template filter

Reported by: Lau Bech Lauritzen Owned by: Aymeric Augustin
Component: Template system Version: master
Severity: Normal Keywords: json template tag
Cc: mike@…, Florian Apolloner, adrian.macneil@…, sergeimaertens@… Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

It's nice to be able to quickly make Python lists and dictionaries into JSON. The normal way of doing this is to dump the data structure as JSON in the view and then outputting the JSON within a script tag in the template, remembering to pipe it through safe. This little addition to Django would streamline that process.

It lets you do this:

<script>
  var data = {{ data|json }};
</script>

Instead of:

views.py:

from django.utils import simplejson 

def home(request):
  data = {'hello: ['world', 'universe']}
  return render_to_response("home.html", {
    'data': simplejson.dumps(data),
    }, context_instance=get_context(request))    


home.html:

<script>
  var data = {{ data|safe }};
</script>

Attachments (7)

json.py (222 bytes) - added by Lau Bech Lauritzen 5 years ago.
JSON template tag
json-filter.diff (3.8 KB) - added by Mike Fogel 5 years ago.
json template filter with tests and docs
json-filter.v2.diff (3.7 KB) - added by Mike Fogel 5 years ago.
json template filter with tests and docs - output no longer marked safe
json-filter.v3.diff (3.8 KB) - added by neaf 5 years ago.
Add a warning about naively displaying strings from JSON marked as safe.
json-filter.v4.diff (2.2 KB) - added by neaf 5 years ago.
Add test for invalid argument (not JSON serializable)
17419-insecure-do-not-use-this.patch (6.5 KB) - added by Aymeric Augustin 5 years ago.
jsonfilter.py (1.0 KB) - added by Jacob Rief 2 years ago.
Implement a safe JSON filter templatetag

Download all attachments as: .zip

Change History (36)

Changed 5 years ago by Lau Bech Lauritzen

Attachment: json.py added

JSON template tag

comment:1 Changed 5 years ago by Aymeric Augustin

Triage Stage: UnreviewedAccepted

comment:2 Changed 5 years ago by Aymeric Augustin

Easy pickings: set
Needs documentation: set
Needs tests: set

comment:3 Changed 5 years ago by Mike Fogel

Cc: mike@… added

comment:4 Changed 5 years ago by Mike Fogel

Needs documentation: unset
Needs tests: unset
Version: 1.31.4-alpha-1

I've uploaded a patch that applies cleanly to trunk that contains the new filter, tests, and documentation.

Would be great to get this into 1.4 - let me know if you need any additions/changes in order to accept it. Thanks.

Changed 5 years ago by Mike Fogel

Attachment: json-filter.diff added

json template filter with tests and docs

comment:5 Changed 5 years ago by Aymeric Augustin

Patch needs improvement: set

It isn't obvious to me that marking the output as safe by default is the right thing to do. Not everyone adds CDATA markers to its <script> tags. Actually, most frontend devs I've worked with don't.

Wouldn't the current implementation break HTML parsing when the filter is used naively?

If so, I'd prefer {{ data|json|safe }} within CDATA sections and {{ data|json }} everywhere else -- security should be on be default.

Last edited 5 years ago by Aymeric Augustin (previous) (diff)

Changed 5 years ago by Mike Fogel

Attachment: json-filter.v2.diff added

json template filter with tests and docs - output no longer marked safe

comment:6 Changed 5 years ago by Mike Fogel

Patch needs improvement: unset

Not everyone adds CDATA markers to its <script> tags.

Only required if you're using an XHTML doctype (rather than HTML 5 or 4.01). But no matter - I'm on board with a safety-by-default approach.

I've attached another patch, this one does not mark the output safe. Docs and tests are updated as well. Thanks for reviewing.

comment:7 Changed 5 years ago by Aymeric Augustin

Owner: changed from nobody to Aymeric Augustin

Thanks for updating the patch.

This is a new feature, so I need to check if I can slip it in 1.4 now or if it'll have to wait for 1.5. Anyway, I intend to commit it eventually, as I've felt the need for such a tag quite often.

comment:8 Changed 5 years ago by Ole Laursen

Sounds nice that it's going in, but playing it safe isn't necessarily that helpful. escape turns " into &quot;, which means you can't output a string in that JSON without safe - including a simple object like { "a": 123 }.

I've tested a simple

<script>
var x = &quot;foo&quot;;
</script>

with both an HTML5 doctype an XHTML 1 strict doctype, and it doesn't work in any of them. So it seems to me this filter is never useful in script tags without safe? It would perhaps be better to add a warning to the documentation?

Changed 5 years ago by neaf

Attachment: json-filter.v3.diff added

Add a warning about naively displaying strings from JSON marked as safe.

comment:9 Changed 5 years ago by neaf

Triage Stage: AcceptedReady for checkin

comment:10 Changed 5 years ago by neaf

Triage Stage: Ready for checkinAccepted

Changed 5 years ago by neaf

Attachment: json-filter.v4.diff added

Add test for invalid argument (not JSON serializable)

comment:11 Changed 5 years ago by cypreess

Triage Stage: AcceptedReady for checkin

comment:12 Changed 5 years ago by Aymeric Augustin

Triage Stage: Ready for checkinAccepted

This ticket is assigned to me because I'm currently working on it... It's recommended to ping the owner of a ticket before taking it over, to avoid duplicate work...

This is a new feature and we're about to release 1.4 beta, so I can't commit it now

Unless the other core devs have more objections than expected, I'll commit it for 1.5.

comment:13 Changed 5 years ago by neaf

Version: 1.4-alpha-1SVN

comment:14 Changed 5 years ago by Aymeric Augustin

Easy pickings: unset
Triage Stage: AcceptedDesign decision needed

When the output isn't marked safe, indeed, {{ foobar|json }} will fail whenever foobar contains strings. But I don't think Django can or should do something about this fact. If the output were arbitrarily (and wrongly) marked safe, that would defeat Django's anti-XSS protection.

Let's just try this with some interesting values of a:

<script>var a = {{ a|json }};</script>

For instance:

<script>var a = "foo</script><script>alert('pwnd!');</script><script>bar;"</script>

Wrapping with CDATA doesn't help:

<script><![CDATA[var a = "foo]]></script><script>alert('pwnd!');</script><script><![CDATA[bar;";]]></script>

Per the "Don't Give Users Guns Aimed At Feet", this isn't possible.


So I hesitate between two alternatives at this point:

  • add the filter with safety-by-default, and explain in the docs that if you have a problem with quotes being escaped, you should use another technique — like loading the data via AJAX — or carefully escape all the strings that end up in you data structure and add |safe in your template.
  • not add the filter at all, because the only implementation we can afford (the safe one) isn't useful enough.

What do you think?

comment:15 Changed 5 years ago by Mike Fogel

It's a bit hacky, but we might be able to use JSONEncoder.iterencode to escape only the string data in the json object. Proof of concept:

from django.utils import simplejson
from django.utils.html import escape

def encode_as_escaped_json(obj):
    result = ''
    for part in simplejson.JSONEncoder().iterencode(obj):
        if part[0:3] == ', "':
            result += ', "' + escape(part[3:-1]) + '"'
        elif part[0] == '"':
            result += '"' + escape(part[1:-1]) + '"'
        else:
            result += part
    return result

if __name__ == '__main__':
    my_obj = [
        {'k1': '</script><script>Attack!</script><script>', 'k2': 42},
        'e"eer',
        '</script><script>More attack!</script><script>',
    ]
    escaped_json = encode_as_escaped_json(my_obj)
    print escaped_json

Running this yields:

1558]$ python ./tmp.py 
[{"k2": 42, "k1": "&lt;/script&gt;&lt;script&gt;Attack!&lt;/script&gt;&lt;script&gt;"}, "e\&quot;eer", "&lt;/script&gt;&lt;script&gt;More attack!&lt;/script&gt;&lt;script&gt;"]

This works because iterencode() returns string-based dictionary keys and values as "<string>", and strings in lists as , "<string>" .

comment:16 Changed 5 years ago by Aymeric Augustin

Triage Stage: Design decision neededAccepted

It's hard to prove that this technique is secure...

We could achieve a similar effect with a custom JSON encode that only escapes strings. I'm going to write a patch.

comment:17 Changed 5 years ago by Aymeric Augustin

Custom JSON encoders will only be called if the default JSON encoder can't process some data. Since the default encoder can process strings, custom encoders don't resolve our problem.

I've also tried recursively escaping the value before passing it to the JSON encoder (see attached patch), but it's still insecure :(

>>> from django.template import *
>>> Template('{{ data|json }}').render(Context({'data': '<>'}))
u'"&lt;&gt;"'
>>> class NastyInt(int):
...     def __str__(self):
...         return '</script><script>alert("%d");' % self
... 
>>> Template('{{ data|json }}').render(Context({'data': NastyInt(42)}))
u'</script><script>alert("42");'

Of course, this is an edge case, but we can't compromise on security.

Changed 5 years ago by Aymeric Augustin

comment:18 Changed 5 years ago by Aymeric Augustin

Resolution: wontfix
Status: newclosed

I've finally changed my mind on this ticket. There's a good reason why it isn't recommended to just include raw JSON data in the HTML: it's extremely hard to prevent XSS unless you enforce other rules about the context, and this isn't under the control of Django.

It's obviously possible to solve the problem for particular use cases with controlled data. In such cases, the implementation only takes a few lines (see the first patch on this ticket). However, it appears very hard to provide a general |json filter that will be safe in all contexts.

If you think you have a working implementation, please submit it on the mailing list. I'll expect strong evidence that it's secure :)

Changed 2 years ago by Jacob Rief

Attachment: jsonfilter.py added

Implement a safe JSON filter templatetag

comment:19 Changed 2 years ago by Jacob Rief

Sorry for adding my two cents to this ticket, 3 years after it has been set to wontfix, but for real projects such a filter still is an issue and often required. And since there is no solution out-of-the-box, programmers start to implement their own stuff, which then is vulnerable to exactly the XSS attacks you're referring to. For instance here: https://github.com/divio/django-cms/blob/develop/cms/templatetags/cms_js_tags.py#L14.

Therefore I started to rethink about this problem and came to a solution which seems to be secure; at least I was unable to inject malicious code into this JSON filter. Please have a look at my attached implementation.

One thing to note here: If someone pushes data through this filter marked as safe (using mark_safe), then of course XSS attacks are possible, but this is intentional misbehavior if applied to non-validated content.
All other Python lists, dicts and strings (in my opinion) are safe when pushed through this filter.

comment:20 in reply to:  18 Changed 23 months ago by Aryeh Hillman

Replying to aaugustin:

I've finally changed my mind on this ticket. There's a good reason why it isn't recommended to just include raw JSON data in the HTML: it's extremely hard to prevent XSS unless you enforce other rules about the context, and this isn't under the control of Django.

It's obviously possible to solve the problem for particular use cases with controlled data. In such cases, the implementation only takes a few lines (see the first patch on this ticket). However, it appears very hard to provide a general |json filter that will be safe in all contexts.

If you think you have a working implementation, please submit it on the mailing list. I'll expect strong evidence that it's secure :)

This is a pretty old ticket. But out of curiosity, how would you deal with this kind of situation, Aymeric? That is, binding some data to a JavaScript variable. Just create an API endpoint to grab data via ajax and bind it to a variable there? Create a template tag for your own usage given that you are aware of its potential security issues?

comment:21 Changed 23 months ago by Aymeric Augustin

Yes, I would do the former if user-controlled data is involved and the latter if the data is sufficiently simple and controlled that security isn't an issue.

comment:22 Changed 11 months ago by Tim Graham

Cc: Florian Apolloner added
Has patch: unset
Resolution: wontfix
Status: closednew
Summary: JSON template tagAdd a JSON template filter

Florian says we should include this because, for one reason, there are many insecure third-party implementations. He "got an okay from fusionbox to pull their json filter into core."

comment:23 Changed 11 months ago by Aymeric Augustin

I agree. (I changed my mind on this.)

We should document clearly the safety restrictions.

If I understand correctly, that usage is unsafe:

<div data-json-stuff="{{ stuff | json }}">

comment:24 Changed 11 months ago by Florian Apolloner

Yes, this usage is unsafe since the filter first and foremost just prevents "</script>" from getting executed.

As for mentioning the origial author(s), we should not forgot to clearly attribute "Fusionbox, Inc. <programmers at fusionbox dot com>"

Last edited 11 months ago by Florian Apolloner (previous) (diff)

comment:25 Changed 11 months ago by Nathan Vander Wilt

The suggested patch has rather surprising behavior to me, with its custom escaping. Normally an HTML attribute is the *best* place to put JSON-formatted data:

  • plays nicely most template languages' default HTML escaping (i.e. the usual &quot; behaves correctly)
  • provides a safe place for data to live _as data_, rather than as code
  • just as easy to get to from an external script file as an inline one
  • assuming it is placed in a data- attribute (as it should be), access can still be very convenient:
<script type="x-config/x-data" id="variables" data-my-info="{{ info|to_json }}"></script>
<script>
  // with jQuery
  var myInfo = $('#variables').data('myInfo');
  
  // without, one simple option
  var myInfo = JSON.parse(document.getElementById('variables').getAttribute('data-my-info'));
</script>

Obviously this is very different usage than intended by django-argonauts, as evidenced by the examples their current README, which are using the output as JavaScript literals directly within an inline script.

The argonauts approach isn't necessarily wrong, but feels more like its intended to be used as "generate some JavaScript" tag rather than one for "output JSON content [safely in an HTML context]". Granted, the former (inline JavaScript) could very well be what many developers want, and the latter (HTML-safe JSON) ends up being a bit of a pain anywhere *but* in an attribute.

Hopefully my 2¢ on an alternative consideration is helpful feedback. There's certainly some irony in proposing a solution that only works in one slightly awkward place, up against a solution that is convenient everywhere *except* that same place (albeit where it is then unsafe). For me it was a tradeoff: fully escaped output is _safe_ anywhere it ends up. Unfortunately it's only _useful_ when extracted from an attribute value, but that's a habit I was willing to settle on; I can use the same pattern across pretty much any platform's template engine. YMMV.

Last edited 11 months ago by Nathan Vander Wilt (previous) (diff)

comment:26 in reply to:  25 Changed 11 months ago by Florian Apolloner

Replying to natevw:

The suggested patch has rather surprising behavior to me, with its custom escaping. Normally an HTML attribute is the *best* place to put JSON-formatted data

I would not necessarily agree on *best*. <script type="application/json"> is just as valid as a data attribute and requires way less escaping. Strictly speaking, if you are actually putting a lot of small JSON snippets into data attributes, escaping takes up quite a bit of space (And we've all learned by now that compressing the output inside TLS can be dangerous). That aside, your argument is not actually correct; depending on the HTML attribute you are using, different escaping rules apply (think of all the js event attributes [onclick etc…]). So in that sense: "Fully" (for some definition of fully) escaped output is not safe anywhere it ends up.

So in the end, we at least need documentation on how to safely use JSON in HTML (though one might argue that docs exist already outside of Django and this is not Django specific) and/or make usages outside of data-* attributes also safe (ie the json filter for the most "common" situations).

Regarding:

Obviously this is very different usage than intended by django-argonauts, as evidenced by the examples ​their current README, which are using the output as JavaScript literals directly within an inline script.

Yes, this is imo an important example as loads and loads of tutorials on the web suggest such a usage (for better or worse).

Last edited 11 months ago by Florian Apolloner (previous) (diff)

comment:27 Changed 10 months ago by Adrian Macneil

Cc: adrian.macneil@… added

I think it's very important to add some form of json filter. Right now there is a situation where developers are putting the output of json.dumps() directly into templates and marking it as safe, which causes an instant XSS vulnerability (I have seen this in the wild, hence why I came across this ticket). This is a fairly complicated security issue, so Django should make it easy for users to do the right thing.

To summarize previous comments, there are two different contexts where you may wish to use JSON in templates:

  1. Inside HTML (e.g. data-* attributes)
  2. Inside CDATA (e.g. var foo = {}; inside a <script> tag)

The problem here is that each context requires different escaping rules, and there is no easy way for Django to tell which context you are in.

To escape for context 1 (inside html attributes) is actually fairly easy - run json.dumps() on the object, then apply regular html escaping:

foo = {'foo': 'bar'}
json_str = json.dumps(foo)
<div data-foo="{{ json_str }}">
<div data-foo="{&quot;foo&quot;:&quot;bar&quot;}">

To escape for context 2 (inside script tags), the problem is that HTML escaping will produce invalid javascript:

<script>
var foo = {{ json_str }};
</script>
<script>
var foo = {&quot;foo&quot;:&quot;bar&quot;}; // causes JS parse error
</script>

Currently people are solving this by marking the JSON string as safe, incorrectly assuming that the output is safe inside a script tag. However, as has been mentioned above, some valid JSON has special meaning inside script tags:

foo = {'foo': 'bar </script> attack'}
json_str = json.dumps(foo)
<script>
var foo = {{ json_str|safe }};
</script>
<script>
var foo = {"foo":"bar </script> attack"};
</script>

The problem here is that HTML has no knowledge of parsing JS, so treats the first </script> as the end of the script tag, and allows the attacker to insert arbitrary HTML or JS afterwards.

This can be solved simply and effectively by replacing unsafe HTML characters with unicode escape sequences. The characters which should be replaced are &, <, >, \u2028, and \u2029 (the last two are treated as newlines by some javascript engines, which may allow an attacker to begin a new javascript instruction - I haven't seen them mentioned in this thread yet). These 5 characters are also replaced by the json_escape filter in Rails, which has a similar purpose. Safe output will look like this:

<script>
var foo = {"foo":"bar \u003c/script\u003e attack"};
</script>

Replacing these characters is easy to do, and valid in both contexts (html and script tags), so they should always be replaced by any Django json filter. I have created a simple implementation of a json filter which replaces these characters here:

https://gist.github.com/amacneil/5af7cd0e934f5465b695

The last remaining issue is whether the output of this filter should be marked as safe. The alternatives are:

  • Automatically mark json filter output as safe. It will work as expected inside script tags, but any use inside html attributes will result in invalid HTML and potential XSS attacks.
  • Do not mark output as safe. JSON will be correctly escaped inside html attributes, and use inside script tags will result in malformed JS. Safe use inside script tags can be achieved by using {{ data|json|safe }}. Developers must be made aware that the safe filter should ONLY be used when in the context of a script tag, and not for general HTML.

Neither of these is a great option, because it relies on the developer knowing/remembering that output is safe in one context, but unsafe in another. However, of the two I would probably prefer the second option (JSON output is unsafe by default, and must be manually marked as safe for use inside script tags), only because it is more safe by default, and requires explicit thought to force the output as safe, which should ring alarm bells and cause extra caution.

A final alternative may be to create another filter, named something like scriptjson, which has the same effect as json, but which marks the output as safe. Documentation could state that this filter should ONLY be used inside <script> tags, and that the regular json filter should be used everywhere else. Functionally this would be identical to using json|safe, but it is more semantically obvious what the author is trying to do, and it would be easier to document correct usage.

comment:28 Changed 10 months ago by Gavin Wahl

I want to point out that django-argonauts can be used both to generate javascript code and to populate a <script type="application/json" element. <script> tags have the same escaping rules for their content no matter what type they are, so it doesn't matter to the json filter. I would prefer that the django docs endorse <script type="application/javascript"> rather than data attributes.

comment:29 Changed 7 months ago by Sergei Maertens

Cc: sergeimaertens@… added
Note: See TracTickets for help on using tickets.
Back to Top