Opened 19 months ago

Closed 18 months ago

Last modified 18 months ago

#22130 closed Cleanup/optimization (fixed)

Accelerated deprecation fix_ampersands and clean_html

Reported by: erikr Owned by: aaugustin
Component: Utilities Version: master
Severity: Normal Keywords: nlsprint14
Cc: eromijn@… Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

I would like to propose the accelerated deprecation of the fix_ampersands built-in template filter, because in all use cases, it either simply does not work, or stimulates the user to create security vulnerabilities.

fix_ampersands
Fix_ampersands aims to fix unescaped ampersands, by replacing them with &. It attempts smart detection of HTML entities, so in ©, no change is made. The detection of this is, understandably, limited, for example R&D; HRM; ... would cause it not to 'fix' the ampersand, even though it is not an entity.

The template tag is marked with is_safe=True, which means that the default escaping will occur of the input string was unsafe, and will not occur if the string was safe. However, this means that with an unsafe string, the fixing is ineffective, because the ampersand in the just generated entity is again escaped by the standard HTML escaping:

>>> print Template("{{ text|fix_ampersands }}").render(Context({'text': 'foo & bar'}))
foo & bar

If the string is marked as safe, this issue does not occur:

>>> print Template("{{ text|fix_ampersands|safe }}").render(Context({'text': 'foo & bar'}))
foo & bar

To the unsuspecting user, this may seem like the solution. However, this means only the ampersand is escaped, introducing an XSS vulnerability:

>>> print Template("{{ text|fix_ampersands|safe }}").render(Context({'text': 'foo & bar <script>alert("xss")</script>'}))
foo &amp; bar <script>alert("xss")</script>

Therefore, the fix_ampersands tag is only useful when an otherwise safe string is encountered, but which has unescaped ampersands, and does not have any unescaped ampersands that could be misread as an HTML entity. However, safe strings should never have unescaped ampersands. Therefore, the only use cases I see for this tag are broken or insecure.

clean_html
Fix_ampersands is only used in django.utils.clean_html. This function is not documented, and appears to have been added with the initial publication of django. The use of this function is unclear, because it does numerous manipulations on HTML tags, but does not actually make it safe to include. Also, the things it cleans look like they are arbitrary selected, like 'Trim stupid HTML such as <br clear="all">'.

Proposal
Searching on GitHub, I have not actually managed to locate any usage of either clean_html or fix_ampersands. I did not do a very thorough search, but it seems it is quite rarely used.

Considering that clean_html's behaviour is rather arbitrary and incomplete, and any use of fix_ampersands is broken or insecure, I suggest to start an accelerated deprecation cycle as of 1.7. I suggest this to be accelerated because the current tag stimulates users to create potentially serious vulnerabilities in their projects. If accepted, I will be happy to write the patch.

Change History (13)

comment:1 Changed 19 months ago by erikr

  • Cc eromijn@… added
  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset

comment:2 Changed 19 months ago by erikr

Django-developers thread started on https://groups.google.com/forum/#!topic/django-developers/UehmGnEjiSE

Also, it should be django.utils.html.clean_html, not django.utils.clean_html.

comment:3 Changed 19 months ago by bmispelon

  • Component changed from Template system to Utilities
  • Triage Stage changed from Unreviewed to Accepted
  • Type changed from Bug to Cleanup/optimization
  • Version changed from 1.6 to master

I'm +1 on the idea.

fix_ampersands and clean_html have been in Django since the beginning and I'm guessing that they were made for a very specific use-case at World Online but I don't believe they have their place inside a generic web framework.

As an argument, here's clean_html's docstring, explaining what it does exactly:

Clean the given HTML.  Specifically, do the following:
    * Convert <b> and <i> to <strong> and <em>.
    * Encode all ampersands correctly.
    * Remove all "target" attributes from <a> tags.
    * Remove extraneous HTML, such as presentational tags that open and
      immediately close and <br clear="all">.
    * Convert hard-coded bullets into HTML unordered lists.
    * Remove stuff like "<p>&nbsp;&nbsp;</p>", but only if it's at the
      bottom of the text.

comment:4 Changed 18 months ago by erikr

  • Owner changed from nobody to erikr
  • Status changed from new to assigned

As there were no objections on the mailing list, I think we can move forward on this.

As Alex clarified on the list, there does not appear to be any code on GitHub that uses clean_html, and given the functionality, I don't see any use for it at all. There was also no documentation for it. Therefore, I will simply remove clean_html. fix_ampersands will be set up for accelerated deprecation: DeprecationWarning in 1.7, removal in 1.8.

comment:6 Changed 18 months ago by Erik Romijn <eromijn@…>

  • Resolution set to fixed
  • Status changed from assigned to closed

In 775975f15d7d461c154e558cba5fb0592539126f:

Fixed #22130 -- Deprecated fix_ampersands, removed utils.clean_html()

comment:7 Changed 18 months ago by Aymeric Augustin <aymeric.augustin@…>

In 3273bd7b254680a5b241e2fdbc3196956b2b44e8:

Merge pull request #2382 from erikr/deprecate-fix-ampersands

Fixed #22130 -- Deprecated fix_ampersands and clean_html.

comment:8 Changed 18 months ago by timo

  • Has patch unset
  • Resolution fixed deleted
  • Status changed from closed to new

The test of the new warning fails when run after template_tests: ./runtests.py template_tests defaultfilters

See Jenkins for details.

comment:9 Changed 18 months ago by erikr

  • Has patch set

Sorry about that - this is the same issue I asked about on the mailing list: it turns out the warnings filters across multiple tests interact. template_tests sets an ignore for these DeprecationWarnings, which means defaultfilters does not see them anymore, even though it sets filters back to default. Even an explicit call to warnings.resetwarnings() did not change this. I fixed this for utils_tests by removing that specific check, hadn't realised it was an issue here too as the order for my locally run tests was different.

As Aymeric told me it was not really necessary to test for the warning actually being issued, the quick fix is: https://github.com/django/django/pull/2383

comment:10 Changed 18 months ago by aaugustin

  • Owner changed from erikr to aaugustin
  • Status changed from new to assigned

I'll try to figure out if I can stop template_tests from leaking warning state first.

comment:11 Changed 18 months ago by aaugustin

I spend some time working on this and I cannot figure out what's going on. It doesn't appear to be related to warning filters. If you insert the following code in test_fix_ampersands:

            print("\n\nWARNING STATE\n")
            for f in warnings.filters:
                print(f)
                if f[1] and f[1].pattern: print('    ' + f[1].pattern)
                if f[3] and f[3].pattern: print('    ' + f[3].pattern)

you can see that the warning filters are exactly identical with or without running the template_tests first, even though captured warnings are different.

While attempting to nail down the issue, I hit #22184, and that's more work than I'm willing to do just to test a deprecation warning that very obviously works. I don't want to waste more time on this, so I'll just merge Erik's pull request.

comment:12 Changed 18 months ago by Erik Romijn <eromijn@…>

  • Resolution set to fixed
  • Status changed from assigned to closed

In 6268792e19ab37446e438f4777c0bb8e786c5c3f:

Fixed #22130 -- fixed template_tests/defaultfilters order dependent test failure

comment:13 Changed 18 months ago by Aymeric Augustin <aymeric.augustin@…>

In a08f906556c14357852bbc9c4c7d10c6558dafcf:

Merge pull request #2383 from erikr/fix-tests-fix-ampersands-deprecation

Fixed #22130 -- fixed template_tests/defaultfilters order dependent test failure

Note: See TracTickets for help on using tickets.
Back to Top