Opened 97 minutes ago
#36896 new Cleanup/optimization
Optimize TruncateCharsHTMLParser.process() to avoid redundant sum() calculation
| Reported by: | Tarek Nakkouch | Owned by: | |
|---|---|---|---|
| Component: | Utilities | Version: | 6.0 |
| Severity: | Normal | Keywords: | |
| Cc: | | Triage Stage: | Unreviewed |
| Has patch: | no | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description
The TruncateCharsHTMLParser.process() method in django/utils/text.py recalculates sum(len(p) for p in self.output) every time it processes a text chunk. For HTML with multiple text nodes, this repeatedly iterates over the growing output list unnecessarily.
    def process(self, data):
        self.processed_chars += len(data)
        if (self.processed_chars == self.length) and (
            sum(len(p) for p in self.output) + len(data) == len(self.rawdata)
        ):
            self.output.append(data)
            raise self.TruncationCompleted
        output = escape("".join(data[: self.remaining]))
        return data, output
Suggested optimization
Cache the output length as self.output_len and increment it when appending to self.output:
- Initialize self.output_len = 0 in TruncateHTMLParser.__init__()
- Increment in handle_starttag(), handle_endtag(), handle_data(), feed(), and process()
- Replace sum(len(p) for p in self.output) with self.output_len
This eliminates redundant iteration over already-processed output.
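The caching idea can be sketched in isolation as follows. This is a minimal illustration, not the Django implementation: the OutputBuffer class and its append() helper are hypothetical names used only to show a running counter staying in step with the list it tracks.

```python
class OutputBuffer:
    """Hypothetical stand-in for the parser's output list; not Django code."""

    def __init__(self):
        self.output = []
        self.output_len = 0  # cached total length of all pieces in self.output

    def append(self, piece):
        # Update the cached length on every append so it never drifts.
        self.output.append(piece)
        self.output_len += len(piece)


buf = OutputBuffer()
for piece in ["<p>", "Hello", "</p>"]:
    buf.append(piece)

# The cached counter equals what the sum() expression would compute.
assert buf.output_len == sum(len(p) for p in buf.output)
```

Routing every append through one helper is one way to keep the counter accurate. In the parser itself, each place that appends to self.output would need the matching increment, which is why the proposal lists several methods.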