Opened 3 weeks ago
Closed 2 weeks ago
#36769 closed Cleanup/optimization (fixed)
Limit recursive extraction of field values in XML deserializer
| Reported by: | Jacob Walls | Owned by: | Pravin |
|---|---|---|---|
| Component: | Core (Serialization) | Version: | dev |
| Severity: | Normal | Keywords: | xml deserializer |
| Cc: | Pravin, Shai Berger | Triage Stage: | Ready for checkin |
| Has patch: | no | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description
While investigating CVE-2025-64460 (mitigated in 50efb718b31333051bc2dcb06911b8fa1358c98c), we noticed that the private helper getInnerText supports extracting arbitrarily nested text, however its only use in Django is to extract text at most one level deep, under a <natural> tag, like this fixture example.
We opted not to change this semantic in a patch release, but it occurred to me that we could only extract text at the exact expected depth (e.g. 0 if under <field> and 1 if under <field><natural>) and completely sidestep potential performance issues from invalid input making use of nested elements, see recent python CVE-2025-12084 we also mitigated yesterday.
I would appreciate any arguments I might be missing, for example, if there are legitimate use cases for ingesting text from nested tags e.g. from unescaped markup that this proposal would break.
Change History (10)
comment:1 by , 3 weeks ago
| Cc: | added |
|---|
comment:2 by , 3 weeks ago
| Cc: | added |
|---|---|
| Keywords: | xml deserializer added |
| Triage Stage: | Unreviewed → Accepted |
comment:3 by , 3 weeks ago
Either we could change the helper function to only accept the types that Django actually uses: regular text and the one-level-deep <natural> tag. This would stop the recursion immediately for anything else, or we could explicitly tell the function how deep it should go.
comment:4 by , 3 weeks ago
| Owner: | set to |
|---|---|
| Status: | new → assigned |
comment:5 by , 2 weeks ago
| Needs tests: | set |
|---|
comment:6 by , 2 weeks ago
| Patch needs improvement: | set |
|---|
comment:7 by , 2 weeks ago
| Needs tests: | unset |
|---|
comment:8 by , 2 weeks ago
| Patch needs improvement: | unset |
|---|---|
| Triage Stage: | Accepted → Ready for checkin |
comment:9 by , 2 weeks ago
One advantage of merging this cleanup is that it hastens the day we can remove the workarounds and the test from 1dbd07a608e495a0c229edaaf84d58d8976313b5, which we merged knowing might be flaky (and indeed has been).
I'm subscribed to the Python 3.12 backport PR with the mitigation for the minidom issue. Waiting for that to land and for a 3.12 patch release after.
Thank you!