Opened 13 years ago

Closed 13 years ago

#16066 closed Uncategorized (wontfix)

fix_ampersands does not convert abbreviations followed by a semi-colon

Reported by: Jerry
Owned by: nobody
Component: Uncategorized
Version: 1.3
Severity: Normal
Keywords: ampersands fix_ampersands html.py
Cc:
Triage Stage: Unreviewed
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description

In django/utils/html.py, unencoded_ampersands_re will not convert ampersands if they are followed by at least one alphabetical character and a semicolon. There are no named entities with only a single character, but abbreviations of that form are common in some circles: D&D and R&D for example.

Each issue has adventures designed for early D&D; there’s the beginnings of a megadungeon in issue 2, “The Darkness Beneath”, and a lot of weirdness.

List of Our Mission in R&D; 1: Foster the creation of new business. 2: Create and accumulate advanced technologies. 3: Extend our value chain globally. 4: Fulfill our social responsibilities.

Assuming that it is safe to encode what look like one-character entities, the \w+ can be changed to \w{2,}.
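The effect of that change can be sketched as follows. This is a minimal illustration, not the code from html.py or from the attached patch: only the `\w+` part of the pattern is stated in this ticket, so the rest of `current_re` (the negative lookahead and the numeric-reference branch) is an assumed reconstruction, and `current_re`, `proposed_re` and `fix_with` are hypothetical names.

```python
import re

# ASSUMPTION: reconstruction of the existing pattern. An ampersand is left
# alone when it is followed by word characters (or "#" plus digits) and ";".
current_re = re.compile(r'&(?!(\w+|#\d+);)')

# Proposed variant: a single word character before ";" no longer counts
# as a possible entity name, so the ampersand gets encoded.
proposed_re = re.compile(r'&(?!(\w{2,}|#\d+);)')


def fix_with(pattern, text):
    """Encode every ampersand that the given pattern treats as unencoded."""
    return pattern.sub('&amp;', text)


sample = 'designed for early D&D; there is R&D too'

# With \w+, "&D;" looks like an entity and is left untouched.
print(fix_with(current_re, sample))

# With \w{2,}, "&D;" is encoded; "&D too" was already handled by both.
print(fix_with(proposed_re, sample))

# Longer abbreviations followed by ";" still look like entities.
print(fix_with(proposed_re, 'AT&SF; and F&SF;'))

# Existing entities and numeric references are preserved.
print(fix_with(proposed_re, '&amp; and &#38;'))
```

Under these assumptions the change only affects the one-character case; anything with two or more word characters before the semicolon is still treated as a possible entity.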

There are no one-character entities listed on http://www.w3.org/TR/WD-html40-970708/sgml/entities.html, or on the less-canonical http://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references and http://www.w3schools.com/tags/ref_entities.asp.

If it isn't safe to assume that there will never be one-character entities, a note in the documentation (http://docs.djangoproject.com/en/dev/ref/templates/builtins/) would probably be useful. (It may be useful even if this patch does make sense, in case someone uses longer abbreviations, such as F&SF or AT&SF, and follows them with a semi-colon.)

Attachments (1)

ampersands.diff (643 bytes) - added by Jerry 13 years ago.
Change unencoded_ampersands_re to encode &n; as &amp;n; for D&D, R&D, etc.


Change History (2)

by Jerry, 13 years ago

Attachment: ampersands.diff added

Change unencoded_ampersands_re to encode &n; as &amp;n; for D&D, R&D, etc.

comment:1 by Julien Phalip, 13 years ago

Resolution: wontfix
Status: new → closed

fix_ampersands does have some limitations in that it is not smart enough to distinguish real named entities from non-real ones. But only making it work with one-character words would really just be equivalent to fixing one particular symptom without addressing the core limitations. In this case, what you want to use is in fact django.utils.html.escape.
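As a rough illustration of the difference, the sketch below uses Python's standard-library `html.escape` as a stand-in for django.utils.html.escape. That substitution is an assumption made so the example runs without Django, and `escape_text` is a hypothetical helper name. The point is that escaping encodes every ampersand unconditionally, without guessing at entities.

```python
import html


def escape_text(text):
    """Escape &, <, > and quotes unconditionally (stdlib stand-in)."""
    # quote=True also escapes double and single quotes.
    return html.escape(text, quote=True)


# Abbreviations are encoded whether or not a semicolon follows.
print(escape_text('early D&D; there'))
print(escape_text('AT&SF; and F&SF;'))

# Trade-off: input that is already encoded gets encoded a second time.
print(escape_text('&amp;'))
```

This suits text that is known to be unescaped; input that already contains entities would be double-encoded.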
