id,summary,reporter,owner,description,type,status,component,version,severity,resolution,keywords,cc,stage,has_patch,needs_docs,needs_tests,needs_better_patch,easy,ui_ux 22130,Accelerated deprecation fix_ampersands and clean_html,Sasha Romijn,Aymeric Augustin,"I would like to propose the accelerated deprecation of the `fix_ampersands` built-in template filter, because in all use cases, it either simply does not work, or stimulates the user to create security vulnerabilities. '''fix_ampersands''' Fix_ampersands aims to fix unescaped ampersands, by replacing them with `&`. It attempts smart detection of HTML entities, so in `©`, no change is made. The detection of this is, understandably, limited, for example `R&D; HRM; ...` would cause it not to 'fix' the ampersand, even though it is not an entity. The template tag is marked with `is_safe=True`, which means that the default escaping will occur of the input string was unsafe, and will not occur if the string was safe. However, this means that with an unsafe string, the fixing is ineffective, because the ampersand in the just generated entity is again escaped by the standard HTML escaping: {{{ >>> print Template(""{{ text|fix_ampersands }}"").render(Context({'text': 'foo & bar'})) foo & bar }}} If the string is marked as safe, this issue does not occur: {{{ >>> print Template(""{{ text|fix_ampersands|safe }}"").render(Context({'text': 'foo & bar'})) foo & bar }}} To the unsuspecting user, this may seem like the solution. However, this means only the ampersand is escaped, introducing an XSS vulnerability: {{{ >>> print Template(""{{ text|fix_ampersands|safe }}"").render(Context({'text': 'foo & bar '})) foo & bar }}} Therefore, the fix_ampersands tag is only useful when an otherwise safe string is encountered, but which has unescaped ampersands, and does not have any unescaped ampersands that could be misread as an HTML entity. However, safe strings should never have unescaped ampersands. Therefore, the only use cases I see for this tag are broken or insecure. '''clean_html''' Fix_ampersands is only used in `django.utils.clean_html`. This function is not documented, and appears to have been added with the initial publication of django. The use of this function is unclear, because it does numerous manipulations on HTML tags, but does not actually make it safe to include. Also, the things it cleans look like they are arbitrary selected, like 'Trim stupid HTML such as
'. '''Proposal''' Searching on GitHub, I have not actually managed to locate any usage of either `clean_html` or `fix_ampersands`. I did not do a very thorough search, but it seems it is quite rarely used. Considering that `clean_html`'s behaviour is rather arbitrary and incomplete, and any use of `fix_ampersands` is broken or insecure, I suggest to start an accelerated deprecation cycle as of 1.7. I suggest this to be accelerated because the current tag stimulates users to create potentially serious vulnerabilities in their projects. If accepted, I will be happy to write the patch.",Cleanup/optimization,closed,Utilities,dev,Normal,fixed,nlsprint14,eromijn@…,Accepted,1,0,0,0,0,0