Opened 7 years ago

Closed 4 years ago

#11778 closed Cleanup/optimization (fixed)

Patch to increase performance of escapejs

Reported by: josephdrose Owned by: josephdrose
Component: Template system Version: master
Severity: Normal Keywords:
Cc: jrose@…, Dmitry Jemerov Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

The current implementation of escapejs iterates through a list of values and does a string.replace(...). The patch here creates a regular expression with the values and uses a dictionary to do the replace upon the match. Local testing shows the new implementation to be faster; the script used to perform this test is also attached.

Attachments (6)

test_escapejs.py (1.8 KB) - added by josephdrose 7 years ago.
patch.diff (885 bytes) - added by josephdrose 7 years ago.
escapejs.patch (828 bytes) - added by gisle 7 years ago.
Patch update to make it pass the 'defaultfilters' test (added match on \u2028\u2029)
0001-Faster-escapejs-11778.patch (2.0 KB) - added by gisle 7 years ago.
Even faster version without double maintenance of keys
escapejs_benchmark.py (2.2 KB) - added by gisle 7 years ago.
Updated benchmark script
11778.diff (6.5 KB) - added by Dmitry Jemerov 5 years ago.
Patch against r16344 with updated expected test results

Download all attachments as: .zip

Change History (17)

Changed 7 years ago by josephdrose

Attachment: test_escapejs.py added

Changed 7 years ago by josephdrose

Attachment: patch.diff added

comment:1 Changed 7 years ago by Cliff Dyer

Triage Stage: UnreviewedDesign decision needed

Changed 7 years ago by gisle

Attachment: escapejs.patch added

Patch update to make it pass the 'defaultfilters' test (added match on \u2028\u2029)

comment:2 Changed 7 years ago by gisle

On my machine using josephdrose's benchmark script this approach doubles the speed of the escapejs() function.

A problem with the patch is that the set of chars to match now has to be maintained in 2 places: both where _base_js_escapes and _js_escapes_re are set. It would be an improvement to generate the re from _base_js_escapes.

Changed 7 years ago by gisle

Even faster version without double maintenance of keys

Changed 7 years ago by gisle

Attachment: escapejs_benchmark.py added

Updated benchmark script

comment:3 Changed 7 years ago by gisle

Triage Stage: Design decision neededUnreviewed

I've updated the patch with a new version that is better because you don't have to maintain the keys in 2 places and is faster because all chars to be replaces is matched by a single regex-charclass. I've also attached a new benchmark script.

Unfortunately there are cases where the new approach actually is slower. This happens when we try to escape long strings with large fraction of chars that need to be escaped. It still think this patch is worth to apply because it should make this function 2-5 times faster on most "real" strings.

When I run the escapejs_benchmark.py script here I see:

11.5x speedup 2.1s vs 0.2s 'xxxxx'

5.4x speedup 3.9s vs 0.7s 'xxxxxxxxxx...'
2.9x speedup 2.2s vs 0.8s ""
5.0x slowdown 10.0s vs 49.6s "
..."
2.4x speedup 2.5s vs 1.0s "Hello, I'm..."

comment:4 Changed 7 years ago by gisle

With proper wikiformatting it looks like :-(

 11.5x speedup  2.1s vs 0.2s 'xxxxx'
  5.4x speedup  3.9s vs 0.7s 'xxxxxxxxxx...'
  2.9x speedup  2.2s vs 0.8s "'''''"
  5.0x slowdown 10.0s vs 49.6s "''''''''''..."
  2.4x speedup  2.5s vs 1.0s "Hello, I'm..."

comment:5 Changed 7 years ago by gisle

Triage Stage: UnreviewedAccepted

comment:6 Changed 7 years ago by jkocherhans

milestone: 1.2

This isn't critical for the first 1.2 release. Bumping from the milestone.

comment:7 Changed 6 years ago by patchhammer

Easy pickings: unset
Patch needs improvement: set
Severity: Normal
Type: Uncategorized

0001-Faster-escapejs-11778.patch fails to apply cleanly on to trunk

comment:8 Changed 6 years ago by Julien Phalip

Type: UncategorizedCleanup/optimization

comment:9 Changed 5 years ago by Dmitry Jemerov

Cc: Dmitry Jemerov added
UI/UX: unset

Updated patch to latest trunk (r16344), updated expected test results due to the changes in escapejs output (now it uses \x## notation instead of \u#### for characters with low ASCII codes). The output change looks like a backward compatibility concern to me, although the new output is more compact than the old one (which saves a tiny bit of traffic).

Changed 5 years ago by Dmitry Jemerov

Attachment: 11778.diff added

Patch against r16344 with updated expected test results

comment:10 Changed 5 years ago by Dmitry Jemerov

Patch needs improvement: unset

comment:11 Changed 4 years ago by Claude Paroz

Resolution: fixed
Status: newclosed

This has probably already been solved by [44767f2caf028d8].

Note: See TracTickets for help on using tickets.
Back to Top