#17419 closed New feature (fixed)
Add a JSON template filter
Reported by: | Lau Bech Lauritzen | Owned by: | Aymeric Augustin |
---|---|---|---|
Component: | Template system | Version: | dev |
Severity: | Normal | Keywords: | json template tag |
Cc: | mike@…, Florian Apolloner, adrian.macneil@…, sergeimaertens@…, nate-djangoproject@… | Triage Stage: | Accepted |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | yes |
Easy pickings: | no | UI/UX: | no |
Description
It's nice to be able to quickly make Python lists and dictionaries into JSON. The normal way of doing this is to dump the data structure as JSON in the view and then outputting the JSON within a script tag in the template, remembering to pipe it through safe. This little addition to Django would streamline that process.
It lets you do this:
<script> var data = {{ data|json }}; </script>
Instead of:
views.py: from django.utils import simplejson def home(request): data = {'hello: ['world', 'universe']} return render_to_response("home.html", { 'data': simplejson.dumps(data), }, context_instance=get_context(request)) home.html: <script> var data = {{ data|safe }}; </script>
Attachments (8)
Change History (51)
by , 13 years ago
comment:1 by , 13 years ago
Triage Stage: | Unreviewed → Accepted |
---|
comment:2 by , 13 years ago
Easy pickings: | set |
---|---|
Needs documentation: | set |
Needs tests: | set |
comment:3 by , 13 years ago
Cc: | added |
---|
comment:4 by , 13 years ago
Needs documentation: | unset |
---|---|
Needs tests: | unset |
Version: | 1.3 → 1.4-alpha-1 |
I've uploaded a patch that applies cleanly to trunk that contains the new filter, tests, and documentation.
Would be great to get this into 1.4 - let me know if you need any additions/changes in order to accept it. Thanks.
comment:5 by , 13 years ago
Patch needs improvement: | set |
---|
It isn't obvious to me that marking the output as safe by default is the right thing to do. Not everyone adds CDATA markers to its <script> tags. Actually, most frontend devs I've worked with don't.
Wouldn't the current implementation break HTML parsing when the filter is used naively?
If so, I'd prefer {{ data|json|safe }}
within CDATA sections and {{ data|json }}
everywhere else -- security should be on be default.
by , 13 years ago
Attachment: | json-filter.v2.diff added |
---|
json template filter with tests and docs - output no longer marked safe
comment:6 by , 13 years ago
Patch needs improvement: | unset |
---|
Not everyone adds CDATA markers to its <script> tags.
Only required if you're using an XHTML doctype (rather than HTML 5 or 4.01). But no matter - I'm on board with a safety-by-default approach.
I've attached another patch, this one does not mark the output safe. Docs and tests are updated as well. Thanks for reviewing.
comment:7 by , 13 years ago
Owner: | changed from | to
---|
Thanks for updating the patch.
This is a new feature, so I need to check if I can slip it in 1.4 now or if it'll have to wait for 1.5. Anyway, I intend to commit it eventually, as I've felt the need for such a tag quite often.
comment:8 by , 13 years ago
Sounds nice that it's going in, but playing it safe isn't necessarily that helpful. escape
turns " into "
, which means you can't output a string in that JSON without safe
- including a simple object like { "a": 123
}.
I've tested a simple
<script> var x = "foo"; </script>
with both an HTML5 doctype an XHTML 1 strict doctype, and it doesn't work in any of them. So it seems to me this filter is never useful in script tags without safe
? It would perhaps be better to add a warning to the documentation?
by , 13 years ago
Attachment: | json-filter.v3.diff added |
---|
Add a warning about naively displaying strings from JSON marked as safe.
comment:9 by , 13 years ago
Triage Stage: | Accepted → Ready for checkin |
---|
comment:10 by , 13 years ago
Triage Stage: | Ready for checkin → Accepted |
---|
by , 13 years ago
Attachment: | json-filter.v4.diff added |
---|
Add test for invalid argument (not JSON serializable)
comment:11 by , 13 years ago
Triage Stage: | Accepted → Ready for checkin |
---|
comment:12 by , 13 years ago
Triage Stage: | Ready for checkin → Accepted |
---|
This ticket is assigned to me because I'm currently working on it... It's recommended to ping the owner of a ticket before taking it over, to avoid duplicate work...
This is a new feature and we're about to release 1.4 beta, so I can't commit it now
Unless the other core devs have more objections than expected, I'll commit it for 1.5.
comment:13 by , 13 years ago
Version: | 1.4-alpha-1 → SVN |
---|
comment:14 by , 13 years ago
Easy pickings: | unset |
---|---|
Triage Stage: | Accepted → Design decision needed |
When the output isn't marked safe, indeed, {{ foobar|json }}
will fail whenever foobar
contains strings. But I don't think Django can or should do something about this fact. If the output were arbitrarily (and wrongly) marked safe, that would defeat Django's anti-XSS protection.
Let's just try this with some interesting values of a
:
<script>var a = {{ a|json }};</script>
For instance:
<script>var a = "foo</script><script>alert('pwnd!');</script><script>bar;"</script>
Wrapping with CDATA doesn't help:
<script><![CDATA[var a = "foo]]></script><script>alert('pwnd!');</script><script><![CDATA[bar;";]]></script>
Per the "Don't Give Users Guns Aimed At Feet", this isn't possible.
So I hesitate between two alternatives at this point:
- add the filter with safety-by-default, and explain in the docs that if you have a problem with quotes being escaped, you should use another technique — like loading the data via AJAX — or carefully escape all the strings that end up in you data structure and add
|safe
in your template. - not add the filter at all, because the only implementation we can afford (the safe one) isn't useful enough.
What do you think?
comment:15 by , 13 years ago
It's a bit hacky, but we might be able to use JSONEncoder.iterencode to escape only the string data in the json object. Proof of concept:
from django.utils import simplejson from django.utils.html import escape def encode_as_escaped_json(obj): result = '' for part in simplejson.JSONEncoder().iterencode(obj): if part[0:3] == ', "': result += ', "' + escape(part[3:-1]) + '"' elif part[0] == '"': result += '"' + escape(part[1:-1]) + '"' else: result += part return result if __name__ == '__main__': my_obj = [ {'k1': '</script><script>Attack!</script><script>', 'k2': 42}, 'e"eer', '</script><script>More attack!</script><script>', ] escaped_json = encode_as_escaped_json(my_obj) print escaped_json
Running this yields:
1558]$ python ./tmp.py [{"k2": 42, "k1": "</script><script>Attack!</script><script>"}, "e\"eer", "</script><script>More attack!</script><script>"]
This works because iterencode() returns string-based dictionary keys and values as "<string>", and strings in lists as , "<string>" .
comment:16 by , 13 years ago
Triage Stage: | Design decision needed → Accepted |
---|
It's hard to prove that this technique is secure...
We could achieve a similar effect with a custom JSON encode that only escapes strings. I'm going to write a patch.
comment:17 by , 13 years ago
Custom JSON encoders will only be called if the default JSON encoder can't process some data. Since the default encoder can process strings, custom encoders don't resolve our problem.
I've also tried recursively escaping the value before passing it to the JSON encoder (see attached patch), but it's still insecure :(
>>> from django.template import * >>> Template('{{ data|json }}').render(Context({'data': '<>'})) u'"<>"'
>>> class NastyInt(int): ... def __str__(self): ... return '</script><script>alert("%d");' % self ... >>> Template('{{ data|json }}').render(Context({'data': NastyInt(42)})) u'</script><script>alert("42");'
Of course, this is an edge case, but we can't compromise on security.
by , 13 years ago
Attachment: | 17419-insecure-do-not-use-this.patch added |
---|
follow-up: 20 comment:18 by , 13 years ago
Resolution: | → wontfix |
---|---|
Status: | new → closed |
I've finally changed my mind on this ticket. There's a good reason why it isn't recommended to just include raw JSON data in the HTML: it's extremely hard to prevent XSS unless you enforce other rules about the context, and this isn't under the control of Django.
It's obviously possible to solve the problem for particular use cases with controlled data. In such cases, the implementation only takes a few lines (see the first patch on this ticket). However, it appears very hard to provide a general |json
filter that will be safe in all contexts.
If you think you have a working implementation, please submit it on the mailing list. I'll expect strong evidence that it's secure :)
comment:19 by , 10 years ago
Sorry for adding my two cents to this ticket, 3 years after it has been set to wontfix, but for real projects such a filter still is an issue and often required. And since there is no solution out-of-the-box, programmers start to implement their own stuff, which then is vulnerable to exactly the XSS attacks you're referring to. For instance here: https://github.com/divio/django-cms/blob/develop/cms/templatetags/cms_js_tags.py#L14.
Therefore I started to rethink about this problem and came to a solution which seems to be secure; at least I was unable to inject malicious code into this JSON filter. Please have a look at my attached implementation.
One thing to note here: If someone pushes data through this filter marked as safe (using mark_safe
), then of course XSS attacks are possible, but this is intentional misbehavior if applied to non-validated content.
All other Python lists, dicts and strings (in my opinion) are safe when pushed through this filter.
comment:20 by , 10 years ago
Replying to aaugustin:
I've finally changed my mind on this ticket. There's a good reason why it isn't recommended to just include raw JSON data in the HTML: it's extremely hard to prevent XSS unless you enforce other rules about the context, and this isn't under the control of Django.
It's obviously possible to solve the problem for particular use cases with controlled data. In such cases, the implementation only takes a few lines (see the first patch on this ticket). However, it appears very hard to provide a general
|json
filter that will be safe in all contexts.
If you think you have a working implementation, please submit it on the mailing list. I'll expect strong evidence that it's secure :)
This is a pretty old ticket. But out of curiosity, how would you deal with this kind of situation, Aymeric? That is, binding some data to a JavaScript variable. Just create an API endpoint to grab data via ajax and bind it to a variable there? Create a template tag for your own usage given that you are aware of its potential security issues?
comment:21 by , 10 years ago
Yes, I would do the former if user-controlled data is involved and the latter if the data is sufficiently simple and controlled that security isn't an issue.
comment:22 by , 9 years ago
Cc: | added |
---|---|
Has patch: | unset |
Resolution: | wontfix |
Status: | closed → new |
Summary: | JSON template tag → Add a JSON template filter |
Florian says we should include this because, for one reason, there are many insecure third-party implementations. He "got an okay from fusionbox to pull their json filter into core."
comment:23 by , 9 years ago
I agree. (I changed my mind on this.)
We should document clearly the safety restrictions.
If I understand correctly, that usage is unsafe:
<div data-json-stuff="{{ stuff | json }}">
comment:24 by , 9 years ago
Yes, this usage is unsafe since the filter first and foremost just prevents "</script>" from getting executed.
As for mentioning the origial author(s), we should not forgot to clearly attribute "Fusionbox, Inc. <rogrammers@…>".
follow-up: 26 comment:25 by , 9 years ago
The suggested patch has rather surprising behavior to me, with its custom escaping. Normally an HTML attribute is the *best* place to put JSON-formatted data:
- plays nicely most template languages' default HTML escaping (i.e. the usual
"
behaves correctly) - provides a safe place for data to live _as data_, rather than as code
- just as easy to get to from an external script file as an inline one
- assuming it is placed in a
data-
attribute (as it should be), access can still be very convenient:
<script type="x-config/x-data" id="variables" data-my-info="{{ info|to_json }}"></script> <script> // with jQuery var myInfo = $('#variables').data('myInfo'); // without, one simple option var myInfo = JSON.parse(document.getElementById('variables').getAttribute('data-my-info')); </script>
Obviously this is very different usage than intended by django-argonauts, as evidenced by the examples their current README, which are using the output as JavaScript literals directly within an inline script.
The argonauts approach isn't necessarily wrong, but feels more like its intended to be used as "generate some JavaScript" tag rather than one for "output JSON content [safely in an HTML context]". Granted, the former (inline JavaScript) could very well be what many developers want, and the latter (HTML-safe JSON) ends up being a bit of a pain anywhere *but* in an attribute.
Hopefully my 2¢ on an alternative consideration is helpful feedback. There's certainly some irony in proposing a solution that only works in one slightly awkward place, up against a solution that is convenient everywhere *except* that same place (albeit where it is then unsafe). For me it was a tradeoff: fully escaped output is _safe_ anywhere it ends up. Unfortunately it's only _useful_ when extracted from an attribute value, but that's a habit I was willing to settle on; I can use the same pattern across pretty much any platform's template engine. YMMV.
comment:26 by , 9 years ago
Replying to natevw:
The suggested patch has rather surprising behavior to me, with its custom escaping. Normally an HTML attribute is the *best* place to put JSON-formatted data
I would not necessarily agree on *best*. <script type="application/json">
is just as valid as a data attribute and requires way less escaping. Strictly speaking, if you are actually putting a lot of small JSON snippets into data attributes, escaping takes up quite a bit of space (And we've all learned by now that compressing the output inside TLS can be dangerous). That aside, your argument is not actually correct; depending on the HTML attribute you are using, different escaping rules apply (think of all the js event attributes [onclick etc…]). So in that sense: "Fully" (for some definition of fully) escaped output is not safe anywhere it ends up.
So in the end, we at least need documentation on how to safely use JSON in HTML (though one might argue that docs exist already outside of Django and this is not Django specific) and/or make usages outside of data-*
attributes also safe (ie the json filter for the most "common" situations).
Regarding:
Obviously this is very different usage than intended by django-argonauts, as evidenced by the examples their current README, which are using the output as JavaScript literals directly within an inline script.
Yes, this is imo an important example as loads and loads of tutorials on the web suggest such a usage (for better or worse).
comment:27 by , 9 years ago
Cc: | added |
---|
I think it's very important to add some form of json
filter. Right now there is a situation where developers are putting the output of json.dumps()
directly into templates and marking it as safe, which causes an instant XSS vulnerability (I have seen this in the wild, hence why I came across this ticket). This is a fairly complicated security issue, so Django should make it easy for users to do the right thing.
To summarize previous comments, there are two different contexts where you may wish to use JSON in templates:
- Inside HTML (e.g.
data-*
attributes) - Inside CDATA (e.g.
var foo = {};
inside a<script>
tag)
The problem here is that each context requires different escaping rules, and there is no easy way for Django to tell which context you are in.
To escape for context 1 (inside html attributes) is actually fairly easy - run json.dumps()
on the object, then apply regular html escaping:
foo = {'foo': 'bar'} json_str = json.dumps(foo)
<div data-foo="{{ json_str }}">
<div data-foo="{"foo":"bar"}">
To escape for context 2 (inside script tags), the problem is that HTML escaping will produce invalid javascript:
<script> var foo = {{ json_str }}; </script>
<script> var foo = {"foo":"bar"}; // causes JS parse error </script>
Currently people are solving this by marking the JSON string as safe, incorrectly assuming that the output is safe inside a script tag. However, as has been mentioned above, some valid JSON has special meaning inside script tags:
foo = {'foo': 'bar </script> attack'} json_str = json.dumps(foo)
<script> var foo = {{ json_str|safe }}; </script>
<script> var foo = {"foo":"bar </script> attack"}; </script>
The problem here is that HTML has no knowledge of parsing JS, so treats the first </script>
as the end of the script tag, and allows the attacker to insert arbitrary HTML or JS afterwards.
This can be solved simply and effectively by replacing unsafe HTML characters with unicode escape sequences. The characters which should be replaced are &
, <
, >
, \u2028
, and \u2029
(the last two are treated as newlines by some javascript engines, which may allow an attacker to begin a new javascript instruction - I haven't seen them mentioned in this thread yet). These 5 characters are also replaced by the json_escape filter in Rails, which has a similar purpose. Safe output will look like this:
<script> var foo = {"foo":"bar \u003c/script\u003e attack"}; </script>
Replacing these characters is easy to do, and valid in both contexts (html and script tags), so they should always be replaced by any Django json
filter. I have created a simple implementation of a json filter which replaces these characters here:
https://gist.github.com/amacneil/5af7cd0e934f5465b695
The last remaining issue is whether the output of this filter should be marked as safe. The alternatives are:
- Automatically mark
json
filter output as safe. It will work as expected inside script tags, but any use inside html attributes will result in invalid HTML and potential XSS attacks. - Do not mark output as safe. JSON will be correctly escaped inside html attributes, and use inside script tags will result in malformed JS. Safe use inside script tags can be achieved by using
{{ data|json|safe }}
. Developers must be made aware that thesafe
filter should ONLY be used when in the context of a script tag, and not for general HTML.
Neither of these is a great option, because it relies on the developer knowing/remembering that output is safe in one context, but unsafe in another. However, of the two I would probably prefer the second option (JSON output is unsafe by default, and must be manually marked as safe for use inside script tags), only because it is more safe by default, and requires explicit thought to force the output as safe, which should ring alarm bells and cause extra caution.
A final alternative may be to create another filter, named something like scriptjson
, which has the same effect as json
, but which marks the output as safe. Documentation could state that this filter should ONLY be used inside <script>
tags, and that the regular json
filter should be used everywhere else. Functionally this would be identical to using json|safe
, but it is more semantically obvious what the author is trying to do, and it would be easier to document correct usage.
comment:28 by , 9 years ago
I want to point out that django-argonauts can be used both to generate javascript code and to populate a <script type="application/json"
element. <script>
tags have the same escaping rules for their content no matter what type they are, so it doesn't matter to the json filter. I would prefer that the django docs endorse <script type="application/javascript">
rather than data attributes.
comment:29 by , 9 years ago
Cc: | added |
---|
comment:30 by , 8 years ago
Cc: | added |
---|
by , 8 years ago
Attachment: | json_tags.py added |
---|
follow-up: 34 comment:31 by , 8 years ago
My Two cents on this.
Escaping json is tricky for all the reasons outlined above, is it in a <script> is it in an attribute and will people know what to do with a filter that seems to output as safe json, but turns out it is only in certain circumstances.
I would propose that we have a simple filter that outputs the json in a <script type="application/json"></script> block
{{ data|as_json_script:"id_data" }}
which would output
<script id="id_data" type="application/json">{"hello": ["world", "universe"]}</script>
and which can then be read in by other javascript
var data = JSON.parse(document.getElementById("id_data").textContent)
or similar with jquery etc
This doesn't leave developers under any illusion that it could be used in html attribute or within a script tag and has the added advantage that it facilitates moving away from inline javascript which is better for supporting good CSP policies.
Would be happy to put together a pull request if there was support for this approach.
comment:32 by , 8 years ago
I do not like the name of the filter, but I do like the idea. Data attributes don't require any special considerations if done properly and we can suggest to not use json in normal <script> tags (which one should not have either way due to CSP).
comment:33 by , 8 years ago
The name I am in no way tied to, the little thinking I did on it was trying to make it similar to form.as_p, etc
I was looking for a name that would be obvious that you should not use this tag in the other scenarios in the thread, attributes and inline javascript.
follow-up: 35 comment:34 by , 8 years ago
Replying to blighj:
Would be happy to put together a pull request if there was support for this approach.
Hi! Are you still planning to work on it? I could contribute with extending your patch with tests, if you like.
comment:35 by , 8 years ago
Replying to Mateusz Jankowski:
A review would be great, here is the current commit for the patch
https://github.com/blighj/django/commit/af32bab8f9575cb968cd92931cec6f442bf31723
A general review would be good, plus specific things that should be looked at would be
- naming the filter
- location of functions, I thought it useful to have the main function available outside the template filter, so it could be reused and repurposed by other projects, so I included it in utils/html.py. This caused me issues with including DjangoJSONEncoder at the top of the file, so I included it in the function. Which is bad right?
comment:36 by , 8 years ago
Why are you trying to magically support both already-encoded and to-be-encoded arguments? This leads to bugs. js_json('"asdf"')
-- is that an json-encoded asdf
, or a string whose value really has quotes in it?
comment:37 by , 8 years ago
Seemed like a good idea at the time. If you had a variable that was already json it would give you a simple way to wrap it in a script block. If for what ever reason you had the following string passed to your template
'{"variable":"user contributed string</script>"}'
which you wanted to output to javascript in a similar style, the filter would be useful to output in a script safely but keeping it as json
What I hadn't considered was your simple string example or equally, if someone did to_json("4"). They will expect that to be output as a string not a number.
So yeah let's get rid of that idea, if someone has a json string and want to use this filter, then call json.loads first.
Any other comments or advice?
comment:38 by , 7 years ago
Removed the "magic string type check" and updated to latest master: https://github.com/django/django/pull/9235
comment:39 by , 7 years ago
Has patch: | set |
---|
comment:40 by , 7 years ago
Patch needs improvement: | set |
---|
JSON template tag