#36499 closed Cleanup/optimization (fixed)
strip_tags() and test_parsing_errors() fails with patched Python versions due to HTMLParser EOF behavior change
| Reported by: | MeggyCal | Owned by: | Natalia Bidart |
|---|---|---|---|
| Component: | Utilities | Version: | 5.2 |
| Severity: | Normal | Keywords: | |
| Cc: | Clifford Gama | Triage Stage: | Ready for checkin |
| Has patch: | yes | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description
Hi, I am a packager in (open)SUSE. My colleague patched our Python interpreters with their respective fixes for https://github.com/python/cpython/issues/135462 and test_strip_tags started failing with these (see below). As per https://github.com/python/cpython/pull/135464#discussion_r2145171001, they introduced a change in behaviour with the fix and documented it. My understanding is that tags are now left alone if they are invalid.
There is no new CPython release yet, so nothing is set in stone, and I understand you might have difficulties reproducing and addressing this issue preemptively, but I just wanted to let you know.
Failure:
[ 661s] ====================================================================== [ 661s] FAIL: test_strip_tags (utils_tests.test_html.TestUtilsHtml.test_strip_tags) [<object object at 0xed890348>] (value='><!&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&[CUT MANY &] [CUT MANY &] &&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&D', output='><!&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&& [CUT MANY &] &&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&D') [ 661s] ---------------------------------------------------------------------- [ 661s] Traceback (most recent call last): [ 661s] File "/usr/lib/python3.13/unittest/case.py", line 58, in testPartExecutor [ 661s] yield [ 661s] File "/usr/lib/python3.13/unittest/case.py", line 556, in subTest [ 661s] yield [ 661s] File "/home/abuild/rpmbuild/BUILD/python-Django-5.2.2-build/django-5.2.2/tests/utils_tests/test_html.py", line 156, in test_strip_tags [ 661s] self.check_output(strip_tags, value, output) [ 661s] ^^^^^^^ [ 661s] File 
"/home/abuild/rpmbuild/BUILD/python-Django-5.2.2-build/django-5.2.2/tests/utils_tests/test_html.py", line 34, in check_output [ 661s] self.assertEqual(function(value), output) [ 661s] ^^^^^^^^^^^^^^^ [ 661s] File "/usr/lib/python3.13/unittest/case.py", line 907, in assertEqual [ 661s] assertion_func(first, second, msg=msg) [ 661s] ^^^^^^^^^^^^^^^ [ 661s] File "/usr/lib/python3.13/unittest/case.py", line 1273, in assertMultiLineEqual [ 661s] self.fail(self._formatMessage(msg, standardMsg)) [ 661s] ^^^^^^^^^^^ [ 661s] File "/usr/lib/python3.13/unittest/case.py", line 732, in fail [ 661s] raise self.failureException(msg) [ 661s] ^^^^^^^^^^^^^^^ [ 661s] AssertionError: '>' != '><!&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&[15958 chars]&&&D' [ 661s] Diff is 16012 characters long. Set self.maxDiff to None to see it. [ 661s] [ 661s] ====================================================================== [ 661s] FAIL: test_strip_tags (utils_tests.test_html.TestUtilsHtml.test_strip_tags) [<object object at 0xed890348>] 
(value='><a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<aa', 
output='><a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<aa') [ 661s] ---------------------------------------------------------------------- [ 661s] Traceback (most recent call last): [ 661s] File "/usr/lib/python3.13/unittest/case.py", line 58, in testPartExecutor [ 661s] yield [ 661s] File "/usr/lib/python3.13/unittest/case.py", line 556, in subTest [ 661s] yield [ 661s] File "/home/abuild/rpmbuild/BUILD/python-Django-5.2.2-build/django-5.2.2/tests/utils_tests/test_html.py", line 156, in test_strip_tags [ 661s] self.check_output(strip_tags, value, output) [ 661s] ^^^^^^^ [ 661s] File "/home/abuild/rpmbuild/BUILD/python-Django-5.2.2-build/django-5.2.2/tests/utils_tests/test_html.py", line 34, in check_output [ 661s] self.assertEqual(function(value), output) [ 661s] ^^^^^^^^^^^^^^^ [ 661s] File "/usr/lib/python3.13/unittest/case.py", line 907, in assertEqual [ 661s] assertion_func(first, second, msg=msg) [ 661s] ^^^^^^^^^^^^^^^ [ 661s] File "/usr/lib/python3.13/unittest/case.py", line 1273, in assertMultiLineEqual [ 661s] 
self.fail(self._formatMessage(msg, standardMsg)) [ 661s] ^^^^^^^^^^^ [ 661s] File "/usr/lib/python3.13/unittest/case.py", line 732, in fail [ 661s] raise self.failureException(msg) [ 661s] ^^^^^^^^^^^^^^^ [ 661s] AssertionError: '>' != '><a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<[956 chars]a<aa' [ 661s] Diff is 1010 characters long. Set self.maxDiff to None to see it. [ 661s] [ 661s] ---------------------------------------------------------------------- [ 661s] Ran 17447 tests in 178.560s
Change History (22)
comment:2 by , 6 months ago
| Cc: | added |
|---|---|
| Component: | Uncategorized → Utilities |
| Summary: | CPython might have introduced a change of behaviour in their fix for https://github.com/python/cpython/issues/135462 → strip_tags() fails with patched Python versions due to HTMLParser EOF behavior change |
| Triage Stage: | Unreviewed → Accepted |
Thanks for the report! I managed to reproduce this against the main Python branch (e18829a8). Since the commit (gh-135462) was backported to Python versions currently supported by Django, I think we can accept this on the basis that Django needs to make a decision.
The issue is that an unterminated tag is now being discarded. In the case of the failing tests these are "<a<a..." and "<&&&...&D" and the first "<sc" in "<sc<!-- -->ript>test<<!-- -->/script>".
I see two ways we may handle this:
- Adjust strip_tags() to preserve pre-3.13 behavior, ensuring consistency, or
- Update tests, and possibly note the behavioral shift in docs, although the latter may not be necessary as the changed behaviour was not documented.
(FWIW, the associated issue that introduced the commit in Python was marked as a security issue.)
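For anyone who wants to check how their own interpreter treats unterminated tags at EOF, here is a minimal sketch. It is not Django's strip_tags() implementation; TextCollector and collected_text are hypothetical names used only for illustration, and the sample inputs are shortened versions of the strings quoted above. The printed results depend on whether the local html.parser carries the gh-135462 change, so no expected output is given for the unterminated samples.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Hypothetical helper: collect every chunk the parser reports as text."""

    def __init__(self):
        # Keep character references out of the data stream, so only
        # plain text chunks are collected.
        super().__init__(convert_charrefs=False)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def collected_text(markup):
    """Return the text data HTMLParser emits for the given markup."""
    parser = TextCollector()
    parser.feed(markup)
    # close() signals EOF; per the ticket, this is where patched and
    # unpatched interpreters treat an unterminated tag differently.
    parser.close()
    return "".join(parser.chunks)


if __name__ == "__main__":
    # The first sample is well formed; the other two end inside a tag.
    for sample in ("<b>ok</b>", "<a<a<a", "<sc"):
        print(repr(sample), "->", repr(collected_text(sample)))
```

Running this on two interpreters, one with and one without the fix, should make the difference in the unterminated samples visible.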
comment:3 by , 6 months ago
| Owner: | set to |
|---|---|
| Severity: | Normal → Release blocker |
| Status: | new → assigned |
We are also seeing the failures in our scheduled tests CI but only when using Python 3.14 (example). I have also reproduced locally with Python 3.14 beta 4.
The changes in Python were driven by a security report started by the Django Security Team, following up on some private reports we got. I think we need to update the tests and stick as closely as possible to Python's HTMLParser behavior. Also, we need to backport this to the supported stable branches, so I'll mark it as a release blocker.
comment:4 by , 6 months ago
| Has patch: | set |
|---|---|
| Needs documentation: | set |
comment:5 by , 6 months ago
| Needs documentation: | unset |
|---|---|
| Patch needs improvement: | set |
| Severity: | Release blocker → Normal |
| Type: | Bug → Cleanup/optimization |
I've discussed this issue with Sarah and she made the valid point that since this affects tests only, it shouldn't require release notes nor the "Release Blocker" status. Updating!
Setting as "patch needs improvement" to block the PR until the Python versions are released.
comment:6 by , 6 months ago
| Summary: | strip_tags() fails with patched Python versions due to HTMLParser EOF behavior change → strip_tags() and test_parsing_errors() fails with patched Python versions due to HTMLParser EOF behavior change |
|---|
follow-up: 8 comment:7 by , 5 months ago
This is now released in CPython 3.13.6, and it has been backported as far back as 3.9 (not released upstream yet, but at least some distributions have already backported it).
comment:8 by , 5 months ago
Replying to Michał Górny:
This is now released in CPython 3.13.6, and it has been backported as far back as 3.9 (not released upstream yet, but at least some distributions have already backported it).
Thank you Michał! We are tracking Python releases and as soon as every version is released upstream (3.13.6, 3.12.12, 3.11.14, 3.10.19 and 3.9.24), we'll update our CI workers and land my PR.
comment:9 by , 5 months ago
| Patch needs improvement: | unset |
|---|---|
| Triage Stage: | Accepted → Ready for checkin |
Code has been adjusted to work with versions of Python with and without the fix. I'll set a reminder to clean the code up once all the Pythons are released and available in our CI/CD.
comment:21 by , 12 days ago
Just a heads up. I ran the Django test suite locally in my Ubuntu 24.04 Python 3.12.3 environment (sys.version showing as '3.12.3 (main, Nov 6 2025, 13:44:16) [GCC 13.3.0]') against the main Django branch (d6ae2ed868e43671afc4d433c3d8f4d27f7eb555).
I am seeing two subtest failures in the test_strip_tags test.
FAIL: test_strip_tags (utils_tests.test_html.TestUtilsHtml.test_strip_tags) [<object object at 0x75d20021d660>] (value='><!&&&&&&&&&&&&&&&&&&&&&& ... ====================================================================== FAIL: test_strip_tags (utils_tests.test_html.TestUtilsHtml.test_strip_tags) [<object object at 0x75d20021d660>] (value='><a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<aa', 
output='><a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<a<aa') ----------------------------------------------------------------------
When I inspect my local /usr/lib/python3.12/html/parser.py it appears to contain the patch from https://github.com/python/cpython/pull/135483/files.
The test_strip_tags changes from above seem to only consider 3.12.12+ as having been patched, based on the logic in https://github.com/django/django/commit/7b80b2186300620931009fd62c2969f108fe7a62#diff-8d44648c5c0191dee21c4d5034021573a885c878a9c50af658259c1747209f19R177
Based on https://launchpad.net/ubuntu/noble/+source/python3.12/+changelog it seems that the libpython3.12-stdlib apt package was patched, but Python still reports as 3.12.3, so the Django tests can't account for the patched HTMLParser.
So others may experience this "issue", but not sure Django tests can/should do anything different to account for it?
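To illustrate why a version-based switch misses a distribution-patched interpreter, here is a rough sketch of such a check. It is not the code from the linked Django commit; FIXED_IN and looks_patched are hypothetical names, the thresholds are the upstream releases listed in comment:8, and treating every 3.14+ interpreter as patched is an assumption (early pre-releases may not carry the fix).

```python
import sys

# Minimum upstream releases carrying the fix, as listed in comment:8.
FIXED_IN = {
    (3, 9): (3, 9, 24),
    (3, 10): (3, 10, 19),
    (3, 11): (3, 11, 14),
    (3, 12): (3, 12, 12),
    (3, 13): (3, 13, 6),
}


def looks_patched(version_info):
    """Guess from the version number alone whether HTMLParser is patched."""
    version = tuple(version_info[:3])
    if version >= (3, 14):
        # Assumption: all 3.14+ interpreters include the change.
        return True
    minimum = FIXED_IN.get(version[:2])
    # Series not listed above are treated as unpatched.
    return minimum is not None and version >= minimum


if __name__ == "__main__":
    print("running interpreter:", looks_patched(sys.version_info))
    # Ubuntu 24.04 reports 3.12.3 even though its stdlib was patched,
    # so a check like this one classifies it as unpatched.
    print("3.12.3:", looks_patched((3, 12, 3)))
```

A check of this kind can only see the reported version, which is why a backported stdlib patch under an unchanged version number goes unnoticed.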
comment:22 by , 11 days ago
Thanks for the info.
The version switches are left in for convenience for now, but we may remove all of them as soon as all CI runners are running fully patched Python versions, since Django doesn't officially support anything other than the latest point releases of Python. You will encounter other test failures (I think there is a recent mail-related ticket) if running earlier point releases.
Sorry, as I look at the test data alone, something ate almost all the >s, which doesn't look intentional. I have to check the patches... Edit: at a glance our patches do not differ from the upstream ones.