Opened 13 years ago

Closed 7 years ago

Last modified 6 years ago

#17419 closed New feature (fixed)

Add a JSON template filter

Reported by: Lau Bech Lauritzen Owned by: Aymeric Augustin
Component: Template system Version: dev
Severity: Normal Keywords: json template tag
Cc: mike@…, Florian Apolloner, adrian.macneil@…, sergeimaertens@…, nate-djangoproject@… Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description

It's nice to be able to quickly make Python lists and dictionaries into JSON. The normal way of doing this is to dump the data structure as JSON in the view and then outputting the JSON within a script tag in the template, remembering to pipe it through safe. This little addition to Django would streamline that process.

It lets you do this:

<script>
  var data = {{ data|json }};
</script>

Instead of:

views.py:

from django.utils import simplejson 

def home(request):
  data = {'hello: ['world', 'universe']}
  return render_to_response("home.html", {
    'data': simplejson.dumps(data),
    }, context_instance=get_context(request))    


home.html:

<script>
  var data = {{ data|safe }};
</script>

Attachments (8)

json.py (222 bytes ) - added by Lau Bech Lauritzen 13 years ago.
JSON template tag
json-filter.diff (3.8 KB ) - added by Mike Fogel 13 years ago.
json template filter with tests and docs
json-filter.v2.diff (3.7 KB ) - added by Mike Fogel 13 years ago.
json template filter with tests and docs - output no longer marked safe
json-filter.v3.diff (3.8 KB ) - added by neaf 13 years ago.
Add a warning about naively displaying strings from JSON marked as safe.
json-filter.v4.diff (2.2 KB ) - added by neaf 13 years ago.
Add test for invalid argument (not JSON serializable)
17419-insecure-do-not-use-this.patch (6.5 KB ) - added by Aymeric Augustin 13 years ago.
jsonfilter.py (1.0 KB ) - added by Jacob Rief 10 years ago.
Implement a safe JSON filter templatetag
json_tags.py (1.4 KB ) - added by blighj 8 years ago.

Download all attachments as: .zip

Change History (51)

by Lau Bech Lauritzen, 13 years ago

Attachment: json.py added

JSON template tag

comment:1 by Aymeric Augustin, 13 years ago

Triage Stage: UnreviewedAccepted

comment:2 by Aymeric Augustin, 13 years ago

Easy pickings: set
Needs documentation: set
Needs tests: set

comment:3 by Mike Fogel, 13 years ago

Cc: mike@… added

comment:4 by Mike Fogel, 13 years ago

Needs documentation: unset
Needs tests: unset
Version: 1.31.4-alpha-1

I've uploaded a patch that applies cleanly to trunk that contains the new filter, tests, and documentation.

Would be great to get this into 1.4 - let me know if you need any additions/changes in order to accept it. Thanks.

by Mike Fogel, 13 years ago

Attachment: json-filter.diff added

json template filter with tests and docs

comment:5 by Aymeric Augustin, 13 years ago

Patch needs improvement: set

It isn't obvious to me that marking the output as safe by default is the right thing to do. Not everyone adds CDATA markers to its <script> tags. Actually, most frontend devs I've worked with don't.

Wouldn't the current implementation break HTML parsing when the filter is used naively?

If so, I'd prefer {{ data|json|safe }} within CDATA sections and {{ data|json }} everywhere else -- security should be on be default.

Last edited 13 years ago by Aymeric Augustin (previous) (diff)

by Mike Fogel, 13 years ago

Attachment: json-filter.v2.diff added

json template filter with tests and docs - output no longer marked safe

comment:6 by Mike Fogel, 13 years ago

Patch needs improvement: unset

Not everyone adds CDATA markers to its <script> tags.

Only required if you're using an XHTML doctype (rather than HTML 5 or 4.01). But no matter - I'm on board with a safety-by-default approach.

I've attached another patch, this one does not mark the output safe. Docs and tests are updated as well. Thanks for reviewing.

comment:7 by Aymeric Augustin, 13 years ago

Owner: changed from nobody to Aymeric Augustin

Thanks for updating the patch.

This is a new feature, so I need to check if I can slip it in 1.4 now or if it'll have to wait for 1.5. Anyway, I intend to commit it eventually, as I've felt the need for such a tag quite often.

comment:8 by Ole Laursen, 13 years ago

Sounds nice that it's going in, but playing it safe isn't necessarily that helpful. escape turns " into &quot;, which means you can't output a string in that JSON without safe - including a simple object like { "a": 123 }.

I've tested a simple

<script>
var x = &quot;foo&quot;;
</script>

with both an HTML5 doctype an XHTML 1 strict doctype, and it doesn't work in any of them. So it seems to me this filter is never useful in script tags without safe? It would perhaps be better to add a warning to the documentation?

by neaf, 13 years ago

Attachment: json-filter.v3.diff added

Add a warning about naively displaying strings from JSON marked as safe.

comment:9 by neaf, 13 years ago

Triage Stage: AcceptedReady for checkin

comment:10 by neaf, 13 years ago

Triage Stage: Ready for checkinAccepted

by neaf, 13 years ago

Attachment: json-filter.v4.diff added

Add test for invalid argument (not JSON serializable)

comment:11 by cypreess, 13 years ago

Triage Stage: AcceptedReady for checkin

comment:12 by Aymeric Augustin, 13 years ago

Triage Stage: Ready for checkinAccepted

This ticket is assigned to me because I'm currently working on it... It's recommended to ping the owner of a ticket before taking it over, to avoid duplicate work...

This is a new feature and we're about to release 1.4 beta, so I can't commit it now

Unless the other core devs have more objections than expected, I'll commit it for 1.5.

comment:13 by neaf, 13 years ago

Version: 1.4-alpha-1SVN

comment:14 by Aymeric Augustin, 13 years ago

Easy pickings: unset
Triage Stage: AcceptedDesign decision needed

When the output isn't marked safe, indeed, {{ foobar|json }} will fail whenever foobar contains strings. But I don't think Django can or should do something about this fact. If the output were arbitrarily (and wrongly) marked safe, that would defeat Django's anti-XSS protection.

Let's just try this with some interesting values of a:

<script>var a = {{ a|json }};</script>

For instance:

<script>var a = "foo</script><script>alert('pwnd!');</script><script>bar;"</script>

Wrapping with CDATA doesn't help:

<script><![CDATA[var a = "foo]]></script><script>alert('pwnd!');</script><script><![CDATA[bar;";]]></script>

Per the "Don't Give Users Guns Aimed At Feet", this isn't possible.


So I hesitate between two alternatives at this point:

  • add the filter with safety-by-default, and explain in the docs that if you have a problem with quotes being escaped, you should use another technique — like loading the data via AJAX — or carefully escape all the strings that end up in you data structure and add |safe in your template.
  • not add the filter at all, because the only implementation we can afford (the safe one) isn't useful enough.

What do you think?

comment:15 by Mike Fogel, 13 years ago

It's a bit hacky, but we might be able to use JSONEncoder.iterencode to escape only the string data in the json object. Proof of concept:

from django.utils import simplejson
from django.utils.html import escape

def encode_as_escaped_json(obj):
    result = ''
    for part in simplejson.JSONEncoder().iterencode(obj):
        if part[0:3] == ', "':
            result += ', "' + escape(part[3:-1]) + '"'
        elif part[0] == '"':
            result += '"' + escape(part[1:-1]) + '"'
        else:
            result += part
    return result

if __name__ == '__main__':
    my_obj = [
        {'k1': '</script><script>Attack!</script><script>', 'k2': 42},
        'e"eer',
        '</script><script>More attack!</script><script>',
    ]
    escaped_json = encode_as_escaped_json(my_obj)
    print escaped_json

Running this yields:

1558]$ python ./tmp.py 
[{"k2": 42, "k1": "&lt;/script&gt;&lt;script&gt;Attack!&lt;/script&gt;&lt;script&gt;"}, "e\&quot;eer", "&lt;/script&gt;&lt;script&gt;More attack!&lt;/script&gt;&lt;script&gt;"]

This works because iterencode() returns string-based dictionary keys and values as "<string>", and strings in lists as , "<string>" .

comment:16 by Aymeric Augustin, 13 years ago

Triage Stage: Design decision neededAccepted

It's hard to prove that this technique is secure...

We could achieve a similar effect with a custom JSON encode that only escapes strings. I'm going to write a patch.

comment:17 by Aymeric Augustin, 13 years ago

Custom JSON encoders will only be called if the default JSON encoder can't process some data. Since the default encoder can process strings, custom encoders don't resolve our problem.

I've also tried recursively escaping the value before passing it to the JSON encoder (see attached patch), but it's still insecure :(

>>> from django.template import *
>>> Template('{{ data|json }}').render(Context({'data': '<>'}))
u'"&lt;&gt;"'
>>> class NastyInt(int):
...     def __str__(self):
...         return '</script><script>alert("%d");' % self
... 
>>> Template('{{ data|json }}').render(Context({'data': NastyInt(42)}))
u'</script><script>alert("42");'

Of course, this is an edge case, but we can't compromise on security.

by Aymeric Augustin, 13 years ago

comment:18 by Aymeric Augustin, 13 years ago

Resolution: wontfix
Status: newclosed

I've finally changed my mind on this ticket. There's a good reason why it isn't recommended to just include raw JSON data in the HTML: it's extremely hard to prevent XSS unless you enforce other rules about the context, and this isn't under the control of Django.

It's obviously possible to solve the problem for particular use cases with controlled data. In such cases, the implementation only takes a few lines (see the first patch on this ticket). However, it appears very hard to provide a general |json filter that will be safe in all contexts.

If you think you have a working implementation, please submit it on the mailing list. I'll expect strong evidence that it's secure :)

by Jacob Rief, 10 years ago

Attachment: jsonfilter.py added

Implement a safe JSON filter templatetag

comment:19 by Jacob Rief, 10 years ago

Sorry for adding my two cents to this ticket, 3 years after it has been set to wontfix, but for real projects such a filter still is an issue and often required. And since there is no solution out-of-the-box, programmers start to implement their own stuff, which then is vulnerable to exactly the XSS attacks you're referring to. For instance here: https://github.com/divio/django-cms/blob/develop/cms/templatetags/cms_js_tags.py#L14.

Therefore I started to rethink about this problem and came to a solution which seems to be secure; at least I was unable to inject malicious code into this JSON filter. Please have a look at my attached implementation.

One thing to note here: If someone pushes data through this filter marked as safe (using mark_safe), then of course XSS attacks are possible, but this is intentional misbehavior if applied to non-validated content.
All other Python lists, dicts and strings (in my opinion) are safe when pushed through this filter.

in reply to:  18 comment:20 by Aryeh Hillman, 10 years ago

Replying to aaugustin:

I've finally changed my mind on this ticket. There's a good reason why it isn't recommended to just include raw JSON data in the HTML: it's extremely hard to prevent XSS unless you enforce other rules about the context, and this isn't under the control of Django.

It's obviously possible to solve the problem for particular use cases with controlled data. In such cases, the implementation only takes a few lines (see the first patch on this ticket). However, it appears very hard to provide a general |json filter that will be safe in all contexts.

If you think you have a working implementation, please submit it on the mailing list. I'll expect strong evidence that it's secure :)

This is a pretty old ticket. But out of curiosity, how would you deal with this kind of situation, Aymeric? That is, binding some data to a JavaScript variable. Just create an API endpoint to grab data via ajax and bind it to a variable there? Create a template tag for your own usage given that you are aware of its potential security issues?

comment:21 by Aymeric Augustin, 10 years ago

Yes, I would do the former if user-controlled data is involved and the latter if the data is sufficiently simple and controlled that security isn't an issue.

comment:22 by Tim Graham, 9 years ago

Cc: Florian Apolloner added
Has patch: unset
Resolution: wontfix
Status: closednew
Summary: JSON template tagAdd a JSON template filter

Florian says we should include this because, for one reason, there are many insecure third-party implementations. He "got an okay from fusionbox to pull their json filter into core."

comment:23 by Aymeric Augustin, 9 years ago

I agree. (I changed my mind on this.)

We should document clearly the safety restrictions.

If I understand correctly, that usage is unsafe:

<div data-json-stuff="{{ stuff | json }}">

comment:24 by Florian Apolloner, 9 years ago

Yes, this usage is unsafe since the filter first and foremost just prevents "</script>" from getting executed.

As for mentioning the origial author(s), we should not forgot to clearly attribute "Fusionbox, Inc. <programmers at fusionbox dot com>"

Last edited 9 years ago by Florian Apolloner (previous) (diff)

comment:25 by Nathan Vander Wilt, 9 years ago

The suggested patch has rather surprising behavior to me, with its custom escaping. Normally an HTML attribute is the *best* place to put JSON-formatted data:

  • plays nicely most template language's default HTML escaping (i.e. the usual &quot; behaves correctly)
  • provides a safe place for data to live _as data_, rather than as code
  • just as easy to get to from an external script file as an inline one
  • assuming it is placed in a data- attribute (as it should be), access can still be very convenient:
<script type="x-data/x-config" id="variables" data-my-info="{{ info|to_json }}">
<script>
  // with jQuery
  var myInfo = $('#variables').data('myInfo');
  
  // without, one simple option
  var myInfo = JSON.parse(document.getElementById('variables').getAttribute('data-my-info'));
</script>

Obviously this is very different usage than intended by django-argonauts, as evidenced by the examples their current README, which are using the output as JavaScript literals directly within an inline script.

The argonauts approach isn't necessarily wrong, but feels more like its intended to be used as "generate some JavaScript" tag rather than one for "output JSON content [safely in an HTML context]". Granted, the former (inline JavaScript) could very well be what many developers want, and the latter (HTML-safe JSON) ends up being a bit of a pain anywhere *but* in an attribute.

Hopefully my 2¢ on an alternative consideration is helpful feedback. There's certainly some irony in proposing a solution that only works one place, up against a solution that is convenient everywhere *except* that same place where it happens to be unsafe. For me it was a tradeoff: fully escaped output is _safe_ anywhere it ends up. Unfortunately it's only _useful_ when extracted from an attribute value, but that's a habit I was willing to settle on; I can use the same pattern across pretty much any platform's template engine. YMMV.

Version 0, edited 9 years ago by Nathan Vander Wilt (next)

in reply to:  25 comment:26 by Florian Apolloner, 9 years ago

Replying to natevw:

The suggested patch has rather surprising behavior to me, with its custom escaping. Normally an HTML attribute is the *best* place to put JSON-formatted data

I would not necessarily agree on *best*. <script type="application/json"> is just as valid as a data attribute and requires way less escaping. Strictly speaking, if you are actually putting a lot of small JSON snippets into data attributes, escaping takes up quite a bit of space (And we've all learned by now that compressing the output inside TLS can be dangerous). That aside, your argument is not actually correct; depending on the HTML attribute you are using, different escaping rules apply (think of all the js event attributes [onclick etc…]). So in that sense: "Fully" (for some definition of fully) escaped output is not safe anywhere it ends up.

So in the end, we at least need documentation on how to safely use JSON in HTML (though one might argue that docs exist already outside of Django and this is not Django specific) and/or make usages outside of data-* attributes also safe (ie the json filter for the most "common" situations).

Regarding:

Obviously this is very different usage than intended by django-argonauts, as evidenced by the examples ​their current README, which are using the output as JavaScript literals directly within an inline script.

Yes, this is imo an important example as loads and loads of tutorials on the web suggest such a usage (for better or worse).

Last edited 9 years ago by Florian Apolloner (previous) (diff)

comment:27 by Adrian Macneil, 9 years ago

Cc: adrian.macneil@… added

I think it's very important to add some form of json filter. Right now there is a situation where developers are putting the output of json.dumps() directly into templates and marking it as safe, which causes an instant XSS vulnerability (I have seen this in the wild, hence why I came across this ticket). This is a fairly complicated security issue, so Django should make it easy for users to do the right thing.

To summarize previous comments, there are two different contexts where you may wish to use JSON in templates:

  1. Inside HTML (e.g. data-* attributes)
  2. Inside CDATA (e.g. var foo = {}; inside a <script> tag)

The problem here is that each context requires different escaping rules, and there is no easy way for Django to tell which context you are in.

To escape for context 1 (inside html attributes) is actually fairly easy - run json.dumps() on the object, then apply regular html escaping:

foo = {'foo': 'bar'}
json_str = json.dumps(foo)
<div data-foo="{{ json_str }}">
<div data-foo="{&quot;foo&quot;:&quot;bar&quot;}">

To escape for context 2 (inside script tags), the problem is that HTML escaping will produce invalid javascript:

<script>
var foo = {{ json_str }};
</script>
<script>
var foo = {&quot;foo&quot;:&quot;bar&quot;}; // causes JS parse error
</script>

Currently people are solving this by marking the JSON string as safe, incorrectly assuming that the output is safe inside a script tag. However, as has been mentioned above, some valid JSON has special meaning inside script tags:

foo = {'foo': 'bar </script> attack'}
json_str = json.dumps(foo)
<script>
var foo = {{ json_str|safe }};
</script>
<script>
var foo = {"foo":"bar </script> attack"};
</script>

The problem here is that HTML has no knowledge of parsing JS, so treats the first </script> as the end of the script tag, and allows the attacker to insert arbitrary HTML or JS afterwards.

This can be solved simply and effectively by replacing unsafe HTML characters with unicode escape sequences. The characters which should be replaced are &, <, >, \u2028, and \u2029 (the last two are treated as newlines by some javascript engines, which may allow an attacker to begin a new javascript instruction - I haven't seen them mentioned in this thread yet). These 5 characters are also replaced by the json_escape filter in Rails, which has a similar purpose. Safe output will look like this:

<script>
var foo = {"foo":"bar \u003c/script\u003e attack"};
</script>

Replacing these characters is easy to do, and valid in both contexts (html and script tags), so they should always be replaced by any Django json filter. I have created a simple implementation of a json filter which replaces these characters here:

https://gist.github.com/amacneil/5af7cd0e934f5465b695

The last remaining issue is whether the output of this filter should be marked as safe. The alternatives are:

  • Automatically mark json filter output as safe. It will work as expected inside script tags, but any use inside html attributes will result in invalid HTML and potential XSS attacks.
  • Do not mark output as safe. JSON will be correctly escaped inside html attributes, and use inside script tags will result in malformed JS. Safe use inside script tags can be achieved by using {{ data|json|safe }}. Developers must be made aware that the safe filter should ONLY be used when in the context of a script tag, and not for general HTML.

Neither of these is a great option, because it relies on the developer knowing/remembering that output is safe in one context, but unsafe in another. However, of the two I would probably prefer the second option (JSON output is unsafe by default, and must be manually marked as safe for use inside script tags), only because it is more safe by default, and requires explicit thought to force the output as safe, which should ring alarm bells and cause extra caution.

A final alternative may be to create another filter, named something like scriptjson, which has the same effect as json, but which marks the output as safe. Documentation could state that this filter should ONLY be used inside <script> tags, and that the regular json filter should be used everywhere else. Functionally this would be identical to using json|safe, but it is more semantically obvious what the author is trying to do, and it would be easier to document correct usage.

comment:28 by Gavin Wahl, 9 years ago

I want to point out that django-argonauts can be used both to generate javascript code and to populate a <script type="application/json" element. <script> tags have the same escaping rules for their content no matter what type they are, so it doesn't matter to the json filter. I would prefer that the django docs endorse <script type="application/javascript"> rather than data attributes.

comment:29 by Sergei Maertens, 9 years ago

Cc: sergeimaertens@… added

comment:30 by Nathan Shafer, 8 years ago

Cc: nate-djangoproject@… added

by blighj, 8 years ago

Attachment: json_tags.py added

comment:31 by blighj, 8 years ago

My Two cents on this.
Escaping json is tricky for all the reasons outlined above, is it in a <script> is it in an attribute and will people know what to do with a filter that seems to output as safe json, but turns out it is only in certain circumstances.

I would propose that we have a simple filter that outputs the json in a <script type="application/json"></script> block

{{ data|as_json_script:"id_data" }} 

which would output

<script id="id_data" type="application/json">{"hello": ["world", "universe"]}</script>

and which can then be read in by other javascript

var data = JSON.parse(document.getElementById("id_data").textContent)

or similar with jquery etc

This doesn't leave developers under any illusion that it could be used in html attribute or within a script tag and has the added advantage that it facilitates moving away from inline javascript which is better for supporting good CSP policies.

Would be happy to put together a pull request if there was support for this approach.

comment:32 by Florian Apolloner, 8 years ago

I do not like the name of the filter, but I do like the idea. Data attributes don't require any special considerations if done properly and we can suggest to not use json in normal <script> tags (which one should not have either way due to CSP).

comment:33 by blighj, 8 years ago

The name I am in no way tied to, the little thinking I did on it was trying to make it similar to form.as_p, etc
I was looking for a name that would be obvious that you should not use this tag in the other scenarios in the thread, attributes and inline javascript.

in reply to:  31 ; comment:34 by Mateusz Jankowski, 8 years ago

Replying to blighj:

Would be happy to put together a pull request if there was support for this approach.

Hi! Are you still planning to work on it? I could contribute with extending your patch with tests, if you like.

in reply to:  34 comment:35 by blighj, 8 years ago

Replying to Mateusz Jankowski:

A review would be great, here is the current commit for the patch
https://github.com/blighj/django/commit/af32bab8f9575cb968cd92931cec6f442bf31723

A general review would be good, plus specific things that should be looked at would be

  • naming the filter
  • location of functions, I thought it useful to have the main function available outside the template filter, so it could be reused and repurposed by other projects, so I included it in utils/html.py. This caused me issues with including DjangoJSONEncoder at the top of the file, so I included it in the function. Which is bad right?

comment:36 by Gavin Wahl, 8 years ago

Why are you trying to magically support both already-encoded and to-be-encoded arguments? This leads to bugs. js_json('"asdf"') -- is that an json-encoded asdf, or a string whose value really has quotes in it?

comment:37 by blighj, 8 years ago

Seemed like a good idea at the time. If you had a variable that was already json it would give you a simple way to wrap it in a script block. If for what ever reason you had the following string passed to your template

'{"variable":"user contributed string</script>"}'

which you wanted to output to javascript in a similar style, the filter would be useful to output in a script safely but keeping it as json

What I hadn't considered was your simple string example or equally, if someone did to_json("4"). They will expect that to be output as a string not a number.

So yeah let's get rid of that idea, if someone has a json string and want to use this filter, then call json.loads first.

Any other comments or advice?

comment:38 by Jonas Haag, 7 years ago

Removed the "magic string type check" and updated to latest master: https://github.com/django/django/pull/9235

comment:39 by Jonas Haag, 7 years ago

Has patch: set

comment:40 by Tim Graham, 7 years ago

Patch needs improvement: set

comment:41 by Tim Graham <timograham@…>, 7 years ago

Resolution: fixed
Status: newclosed

In 8c709d79:

Fixed #17419 -- Added json_tag template filter.

comment:42 by Tim Graham <timograham@…>, 6 years ago

In 02cd16a:

Refs #17419 -- Removed IE8 support in json_script example.

comment:43 by Tim Graham <timograham@…>, 6 years ago

In d5482df:

[2.1.x] Refs #17419 -- Removed IE8 support in json_script example.

Backport of 02cd16a7a04529c726e5bb5a13d5979119f25c7d from master

Note: See TracTickets for help on using tickets.
Back to Top