#24001 closed Cleanup/optimization (needsinfo)
Add a regression test for strip_tags, html encoding and unicode MemoryError
Reported by: | twig | Owned by: | mhall1 |
---|---|---|---|
Component: | Template system | Version: | dev |
Severity: | Normal | Keywords: | |
Cc: | mhall1 | Triage Stage: | Accepted |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
We noticed some processes were using up to 22 GB of memory and raising MemoryError exceptions.
Here's some sample code to run in a Django/Python shell:
from django.template.defaultfilters import striptags

value = """<p class="storybody"><h2>Images and Text Do Not Mix</h2><br><br>This PowerPoint <a href="http://www.slideshare.net/anilkr123/car-and-technology" target="_blank">presentation on cars</a> (we know it\u2019s about cars because an introductory slide consists of the word "CARS" in huge, garish orange-and-blue letters) puts all of its images in the background (after applying a little tasteful fading), with <a href="http://www.pcworld.com/article/7774/make_a_bold_statement_with_text_in_powerpoint.html" target="_blank">paragraphs of text</a> overlaid on them. This accomplishes the difficult feat of making the images hard to look at <i>and</i> the text hard to read. Perfect—a lose-lose situation! <br><br>The presenter could have consolidated the text in one part of the image, using the image\u2019s horizontal guiding lines; but that didn\u2019t happen, so the slide manages to look sloppy as well as unreadable. Bonus points for misspelling \u201ccarburetor.\u201d</p>"""

striptags(value)
Removing the "—" after "Perfect" fixes the problem. The character is an em dash, most likely copy-pasted from Microsoft Word.
Tested with v1.6.8 and v1.7.1
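For anyone trying to narrow this down without a full Django install, here is a minimal standard-library sketch of a tag stripper that cannot grow its input. It is not Django's implementation: `TagStripper`, `strip_once`, `strip_tags_bounded` and the `max_passes` limit are hypothetical names chosen for illustration, and the code targets Python 3's `html.parser`. The idea is simply that a strip pass is only accepted if it makes the string shorter, so a misbehaving parser cannot lead to unbounded memory use.

```python
from html.parser import HTMLParser


class TagStripper(HTMLParser):
    """Hypothetical stand-in for a tag stripper (not Django's code)."""

    def __init__(self):
        # Keep entity and character references as written in the input.
        super().__init__(convert_charrefs=False)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)

    def handle_entityref(self, name):
        self.chunks.append("&%s;" % name)

    def handle_charref(self, name):
        self.chunks.append("&#%s;" % name)

    def get_data(self):
        return "".join(self.chunks)


def strip_once(value):
    """Run a single parsing pass and return the text with tags dropped."""
    parser = TagStripper()
    parser.feed(value)
    parser.close()
    return parser.get_data()


def strip_tags_bounded(value, max_passes=50):
    """Strip tags repeatedly, accepting a pass only if the text got shorter."""
    for _ in range(max_passes):
        if "<" not in value or ">" not in value:
            break
        new_value = strip_once(value)
        if len(new_value) >= len(value):
            # No progress (or growth): stop instead of looping further.
            break
        value = new_value
    return value


sample = "<p>Perfect\u2014a lose-lose situation! <br><br>The presenter</p>"
print(strip_tags_bounded(sample))
```

Because the result is never longer than the input, a test built on this shape can assert on output length instead of waiting for a MemoryError.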
Change History (7)
comment:2 by , 10 years ago
I have a regression test in the works to make sure this doesn't come up again. Assigning to myself for now.
comment:3 by , 10 years ago
Cc: | added |
---|---|
Owner: | changed from | to
Status: | new → assigned |
Triage Stage: | Unreviewed → Accepted |
comment:4 by , 10 years ago
Component: | Uncategorized → Template system |
---|---|
Summary: | strip_tags, html encoding and unicode usage causes MemoryError on short string → Add a regression test for strip_tags, html encoding and unicode MemoryError |
Type: | Bug → Cleanup/optimization |
Version: | 1.6 → master |
follow-up: 7 comment:5 by , 10 years ago
I tried to reproduce this again on 1.6.8, 1.6.9 alpha, 1.7.1, and 1.7.2 alpha just to be sure, and I haven't had any success. The unit test needed to check for this is a bit resource-intensive so I'd like to pin down the issue first.
@twig, if you could provide any other info such as python version, database backend, etc. I'd really appreciate it. I'm moving this to "needsinfo" for now.
comment:6 by , 10 years ago
Resolution: | → needsinfo |
---|---|
Status: | assigned → closed |
comment:7 by , 10 years ago
Replying to mhall1:
> I tried to reproduce this again on 1.6.8, 1.6.9 alpha, 1.7.1, and 1.7.2 alpha just to be sure, and I haven't had any success. The unit test needed to check for this is a bit resource-intensive so I'd like to pin down the issue first.
> @twig, if you could provide any other info such as python version, database backend, etc. I'd really appreciate it. I'm moving this to "needsinfo" for now.
This is a bug in Python <= 2.7.8 (http://bugs.python.org/issue20288), fixed in 2.7.9 and later.
I've verified that the problem is fixed on 1.6.9 alpha and 1.7.2 alpha.
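To check whether a given interpreter falls in the affected range, something like the sketch below could be used. It encodes only the boundary stated in this comment (2.7.8 and earlier affected, 2.7.9 fixed) and makes no claim about Python 3 releases. `python2_affected` and `FIRST_FIXED_2X` are hypothetical names.

```python
import sys

# Assumption taken from this ticket: the first fixed 2.x release is 2.7.9.
FIRST_FIXED_2X = (2, 7, 9)


def python2_affected(version_info=None):
    """Return True if the version is older than the first fixed 2.x release.

    Only the 2.x boundary from the ticket is modelled; any 3.x version
    compares as newer and is therefore reported as not affected.
    """
    version = tuple((version_info or sys.version_info)[:3])
    return version < FIRST_FIXED_2X


print(python2_affected())
```

Passing an explicit tuple such as `(2, 7, 8)` allows checking a deployment's version without running on that interpreter.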