Opened 6 years ago

Closed 3 years ago

#11911 closed Bug (fixed)

urlizetrunc not taking into account last ')' of a link

Reported by: Stefan_Petrea Owned by: aaugustin
Component: Template system Version: master
Severity: Normal Keywords: urlizetrunc
Cc: Triage Stage: Design decision needed
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description

Hi,

I'm using Django 1.1 , the release version.

The urlizetrunc filter is causing some problems , for example the following link
http://en.wikipedia.org/wiki/Sibiu(Romania)

is rendered as <a href="http://en.wikipedia.org/wiki/Sibiu%28Romania" rel="nofollow">http://en.wikipedia.org/wik...</a>)
(please notice the trailing ')' )
This is not correct , it should take the ')' inside the link.

(As a work-around I'm adding # to my links, which seems to work).

Best regards,

Stefan

Attachments (4)

11911-and-12183-tests.patch (1.8 KB) - added by morty 5 years ago.
Joint tests for tickets #11911 and #12183
11911-and-12183.patch (1.8 KB) - added by morty 5 years ago.
Joint patch for tickets #11911 and #12183
11911.patch (2.5 KB) - added by graham_king 4 years ago.
Split out just the part for this ticket. Added tests.
11911-2.patch (3.8 KB) - added by aaugustin 3 years ago.

Download all attachments as: .zip

Change History (13)

comment:1 Changed 5 years ago by russellm

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Summary changed from [BUG] urlizetrunc not taking into account last ')' of a link (example given) to urlizetrunc not taking into account last ')' of a link
  • Triage Stage changed from Unreviewed to Accepted

Changed 5 years ago by morty

Joint tests for tickets #11911 and #12183

Changed 5 years ago by morty

Joint patch for tickets #11911 and #12183

comment:2 Changed 5 years ago by morty

  • Has patch set

I have attached tests and a patch that fixes this bug as well as #12183.

comment:3 Changed 4 years ago by julien

  • Severity set to Normal
  • Type set to Bug

Changed 4 years ago by graham_king

Split out just the part for this ticket. Added tests.

comment:4 Changed 4 years ago by graham_king

  • Easy pickings unset
  • milestone set to 1.4
  • Version changed from 1.1 to SVN

This is still a problem in current SVN. I've updated the patch to apply cleanly, to be just for this ticket, and to have tests. All the unit tests pass.

Version 0, edited 4 years ago by graham_king (next)

comment:5 Changed 4 years ago by trez

  • Patch needs improvement set
  • Triage Stage changed from Accepted to Design decision needed
  • UI/UX unset

It seems than the real problem come from the definition of punctuation_re:

TRAILING_PUNCTUATION = ['.', ',', ')', '>', '\n', '&gt;']

punctuation_re = re.compile('^(?P<lead>(?:%s)*)(?P<middle>.*?)(?P<trail>(?:%s)*)$' % \
    ('|'.join([re.escape(x) for x in LEADING_PUNCTUATION]),
    '|'.join([re.escape(x) for x in TRAILING_PUNCTUATION])))

The ')' character is not interpreted as a part of the url but as an enclosing character.
The patch provided solve the parenthesis problem but do not solve other cases for other unreserved characters in the TRAILING_PUNCTUATION list.

Valid character accoding to RFC 1738 named as extra are "!" | "*" | "'" | "(" | ")" | ","

For example: http://en.wikipedia.org/wiki/Sibiu(Romania). which is the valid (notice the dot at the end of the url)

To completely solve this problem, I think an in-depth rewrite of the punctation_re should be done which should be able to distinguish between
http://en.wikipedia.org/wiki/Sibiu(Romania) and (http://en.wikipedia.org/wiki/Sibiu(Romania))

A design decision is needed.

comment:6 Changed 4 years ago by graham_king

Could you provide an example of a live url where this is a problem, with characters other than parenthesis?

comment:7 Changed 4 years ago by jacob

  • milestone 1.4 deleted

Milestone 1.4 deleted

comment:8 Changed 3 years ago by aaugustin

  • Owner changed from nobody to aaugustin

Changed 3 years ago by aaugustin

comment:9 Changed 3 years ago by aaugustin

  • Resolution set to fixed
  • Status changed from new to closed

In [17362]:

Fixed #11911 -- Made the urlize filter smarter with closing punctuation.

Note: See TracTickets for help on using tickets.
Back to Top