Code

Opened 5 years ago

Closed 2 years ago

#11911 closed Bug (fixed)

urlizetrunc not taking into account last ')' of a link

Reported by: Stefan_Petrea Owned by: aaugustin
Component: Template system Version: master
Severity: Normal Keywords: urlizetrunc
Cc: Triage Stage: Design decision needed
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description

Hi,

I'm using Django 1.1 , the release version.

The urlizetrunc filter is causing some problems , for example the following link
http://en.wikipedia.org/wiki/Sibiu(Romania)

is rendered as <a href="http://en.wikipedia.org/wiki/Sibiu%28Romania" rel="nofollow">http://en.wikipedia.org/wik...</a>)
(please notice the trailing ')' )
This is not correct , it should take the ')' inside the link.

(As a work-around I'm adding # to my links, which seems to work).

Best regards,

Stefan

Attachments (4)

11911-and-12183-tests.patch (1.8 KB) - added by morty 4 years ago.
Joint tests for tickets #11911 and #12183
11911-and-12183.patch (1.8 KB) - added by morty 4 years ago.
Joint patch for tickets #11911 and #12183
11911.patch (2.5 KB) - added by graham_king 3 years ago.
Split out just the part for this ticket. Added tests.
11911-2.patch (3.8 KB) - added by aaugustin 2 years ago.

Download all attachments as: .zip

Change History (13)

comment:1 Changed 4 years ago by russellm

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Summary changed from [BUG] urlizetrunc not taking into account last ')' of a link (example given) to urlizetrunc not taking into account last ')' of a link
  • Triage Stage changed from Unreviewed to Accepted

Changed 4 years ago by morty

Joint tests for tickets #11911 and #12183

Changed 4 years ago by morty

Joint patch for tickets #11911 and #12183

comment:2 Changed 4 years ago by morty

  • Has patch set

I have attached tests and a patch that fixes this bug as well as #12183.

comment:3 Changed 3 years ago by julien

  • Severity set to Normal
  • Type set to Bug

Changed 3 years ago by graham_king

Split out just the part for this ticket. Added tests.

comment:4 Changed 3 years ago by graham_king

  • Easy pickings unset
  • milestone set to 1.4
  • Version changed from 1.1 to SVN

This is still a problem in current SVN, and the patch fixes it.

I've updated the patch to apply cleanly, to be just for this ticket, and to have tests. All the unit tests pass.

Last edited 3 years ago by graham_king (previous) (diff)

comment:5 Changed 3 years ago by trez

  • Patch needs improvement set
  • Triage Stage changed from Accepted to Design decision needed
  • UI/UX unset

It seems than the real problem come from the definition of punctuation_re:

TRAILING_PUNCTUATION = ['.', ',', ')', '>', '\n', '&gt;']

punctuation_re = re.compile('^(?P<lead>(?:%s)*)(?P<middle>.*?)(?P<trail>(?:%s)*)$' % \
    ('|'.join([re.escape(x) for x in LEADING_PUNCTUATION]),
    '|'.join([re.escape(x) for x in TRAILING_PUNCTUATION])))

The ')' character is not interpreted as a part of the url but as an enclosing character.
The patch provided solve the parenthesis problem but do not solve other cases for other unreserved characters in the TRAILING_PUNCTUATION list.

Valid character accoding to RFC 1738 named as extra are "!" | "*" | "'" | "(" | ")" | ","

For example: http://en.wikipedia.org/wiki/Sibiu(Romania). which is the valid (notice the dot at the end of the url)

To completely solve this problem, I think an in-depth rewrite of the punctation_re should be done which should be able to distinguish between
http://en.wikipedia.org/wiki/Sibiu(Romania) and (http://en.wikipedia.org/wiki/Sibiu(Romania))

A design decision is needed.

comment:6 Changed 3 years ago by graham_king

Could you provide an example of a live url where this is a problem, with characters other than parenthesis?

comment:7 Changed 3 years ago by jacob

  • milestone 1.4 deleted

Milestone 1.4 deleted

comment:8 Changed 2 years ago by aaugustin

  • Owner changed from nobody to aaugustin

Changed 2 years ago by aaugustin

comment:9 Changed 2 years ago by aaugustin

  • Resolution set to fixed
  • Status changed from new to closed

In [17362]:

Fixed #11911 -- Made the urlize filter smarter with closing punctuation.

Add Comment

Modify Ticket

Change Properties
<Author field>
Action
as closed
as The resolution will be set. Next status will be 'closed'
The resolution will be deleted. Next status will be 'new'
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.