Opened 3 years ago

Closed 3 years ago

#16066 closed Uncategorized (wontfix)

fix_ampersands does not convert abbreviations followed by a semi-colon

Reported by: Jerry
Owned by: nobody
Component: Uncategorized
Version: 1.3
Severity: Normal
Keywords: ampersands fix_ampersands html.py
Cc:
Triage Stage: Unreviewed
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX:

Description

In django/utils/html.py, unencoded_ampersands_re will not convert ampersands if they are followed by at least one alphabetical character and a semicolon. There are no named entities with only a single character, but abbreviations of that form are common in some circles: D&D and R&D for example.

Each issue has adventures designed for early D&D; there’s the beginnings of a megadungeon in issue 2, “The Darkness Beneath”, and a lot of weirdness.

List of Our Mission in R&D; 1: Foster the creation of new business. 2: Create and accumulate advanced technologies. 3: Extend our value chain globally. 4: Fulfill our social responsibilities.

Assuming that it is safe to encode what look like one-character entities, the \w+ can be changed to \w{2,}.
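To make the proposed change concrete, here is a minimal sketch comparing the two patterns. The "current" pattern is reproduced from memory of Django 1.3's django/utils/html.py and should be treated as an assumption rather than a verbatim copy; the fix_ampersands helper below is a simplified stand-in that takes the pattern as an argument, not Django's own function.

```python
import re

# Assumed to match Django 1.3's unencoded_ampersands_re (reproduced from memory).
# An ampersand is left alone when followed by word characters (or #digits) and ';'.
current_re = re.compile(r'&(?!(\w+|#\d+);)')

# Proposed variant from this ticket: require at least two word characters,
# so that "&D;" is no longer mistaken for a named entity.
proposed_re = re.compile(r'&(?!(\w{2,}|#\d+);)')


def fix_ampersands(text, pattern):
    """Simplified stand-in: encode every ampersand the pattern treats as unencoded."""
    return pattern.sub('&amp;', text)


sample = "early D&D; there's R&D; too, plus &amp; and &#38;"
print(fix_ampersands(sample, current_re))   # "D&D;" and "R&D;" are left untouched
print(fix_ampersands(sample, proposed_re))  # "D&D;" and "R&D;" are now encoded
```

Both patterns leave &amp; and &#38; alone. Longer abbreviations such as F&SF followed by a semicolon would still be skipped by the proposed pattern, since "SF;" looks like a two-character entity.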

There are no one-character entities listed on http://www.w3.org/TR/WD-html40-970708/sgml/entities.html, or on the less-canonical http://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references and http://www.w3schools.com/tags/ref_entities.asp.

If it isn't safe to assume that there will not be one-character entities, a note in the documentation (http://docs.djangoproject.com/en/dev/ref/templates/builtins/) will probably be useful. (It may be useful even if this patch does make sense, in case someone tries to use longer abbreviations, such as F&SF or AT&SF, and follows them with a semicolon.)

Attachments (1)

ampersands.diff (643 bytes) - added by Jerry 3 years ago.
Change unencoded_ampersands_re to encode &n; as &amp;n; for D&D, R&D, etc.

Change History (2)

Changed 3 years ago by Jerry

Change unencoded_ampersands_re to encode &n; as &amp;n; for D&D, R&D, etc.

comment:1 Changed 3 years ago by julien

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Resolution set to wontfix
  • Status changed from new to closed

fix_ampersands does have some limitations in that it is not smart enough to distinguish real named entities from non-real ones. But only making it work with one-character words would really just be equivalent to fixing one particular symptom without addressing the core limitations. In this case, what you want to use is in fact django.utils.html.escape.
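As a rough illustration of the difference, the sketch below uses Python's standard-library html.escape as a stand-in for django.utils.html.escape, so that it runs without Django installed. This substitution is an assumption: the two functions may encode quote characters differently, but both encode every ampersand unconditionally. The escape_text wrapper name is hypothetical.

```python
from html import escape  # standard-library stand-in for django.utils.html.escape


def escape_text(text):
    """Encode every '&', '<', '>' and quote character, with no entity detection."""
    return escape(text, quote=True)


# Abbreviations followed by a semicolon are handled, because nothing is skipped.
print(escape_text("early D&D; and R&D;"))

# The trade-off: input that is already encoded gets encoded a second time.
print(escape_text("&amp;"))
```

Unconditional escaping therefore suits raw text such as the D&D and R&D examples, while text that already contains entities would be double-encoded.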
