Opened 3 weeks ago

Closed 2 weeks ago

#36737 closed Bug (fixed)

Escape C1 control sequence in `escapejs`

Reported by: Thibaut Decombe Owned by: farthestmage
Component: Template system Version: 5.2
Severity: Normal Keywords:
Cc: Thibaut Decombe Triage Stage: Ready for checkin
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: yes UI/UX: no

Description

The current implementation of the escapejshttps://github.com/django/django/blob/5c60763561c67924eff1069e1516b60a59d068d5/django/utils/html.py#L79-L80 escapes only C0 control characters (unicode values ranging from 0 to 31)

However, there are other control characters in the 127-159 range, the C1 control characters.
See https://en.wikipedia.org/wiki/C0_and_C1_control_codes#C1_controls

Should we escape these too ?

The rust helper char.is_control https://doc.rust-lang.org/src/core/char/methods.rs.html#952 consider both these ranges and we were considering using it in django_rusty_templates

I'll be happy to provide a PR if it make sense

Change History (8)

comment:1 by Rudraksha Dwivedi, 3 weeks ago

While I am new to contributing to Django, it looks like a straightforward job
the proposal does seem legit to me
escapejs already escapes C0 control characters, extending it to support C1 range would be better for consistency across control chars. in diff. languages
if there is no backward compatibility issue
and the ticket gets accepted I'd love to chip in

comment:2 by farthestmage, 3 weeks ago

Hey I am also new to contributing the thought process it seems that a simple loop can save the job

comment:3 by farthestmage, 3 weeks ago

Hey,
I was not assigned this task but majorly I believe I have solved the issue and issued a pull request on github could pls review the code
thanks

comment:4 by farthestmage, 3 weeks ago

Hello,
A doubt were you able to produce or implement a way to test it out

comment:5 by Jacob Walls, 3 weeks ago

Needs tests: set
Owner: set to farthestmage
Patch needs improvement: set
Status: newassigned
Triage Stage: UnreviewedAccepted
Type: Cleanup/optimizationBug

Okay, I think this is right.

The HTML parsing standard describes the parsing error control-character-in-input-stream like this:

This error occurs if the input stream contains a control code point that is not ASCII whitespace or U+0000 NULL. Such code points are parsed as-is and usually, where parsing rules don't apply any additional restrictions, make their way into the DOM.

control is defined here:

A control is a C0 control or a code point in the range U+007F DELETE to U+009F APPLICATION PROGRAM COMMAND, inclusive.

The C1 control range is U+0080 – U+009F, so the additional characters that need escaping are C1 control characters plus U+007F DELETE (which is apparently sometimes grouped with C0 controls).

See also HTML spec

PR doesn't escape U+007F DELETE and needs tests.

comment:6 by farthestmage, 3 weeks ago

Hey,
While creating Test cases I run runtests.py it works perfectly but while review fails returns this error AssertionError: '\\u009F' != '\x9f'.I have manipulated \u009f with taking it in a literal sense different error arise while running runtests.py while the current commit creates error only while reviewing not while executing runtests.py
help would be appreciated
thanks

PS: Just a new contributor learning

Version 0, edited 3 weeks ago by farthestmage (next)

comment:7 by Jacob Walls, 2 weeks ago

Needs tests: unset
Patch needs improvement: unset
Triage Stage: AcceptedReady for checkin

comment:8 by Jacob Walls <jacobtylerwalls@…>, 2 weeks ago

Resolution: fixed
Status: assignedclosed

In 07419875:

Fixed #36737 -- Escaped further control characters in escapejs.

Note: See TracTickets for help on using tickets.
Back to Top