Opened 3 years ago

Closed 2 years ago

#18152 closed Bug (wontfix)

urlize filter does not work correctly in combination with linebreaksbr filter

Reported by: Nasmon
Owned by: Vladimir.Filonov
Component: Template system
Version: 1.4
Severity: Normal
Keywords:
Cc:
Triage Stage: Accepted
Has patch: no
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description

Use this text with these filters, in exactly THIS order, in a template:

text = """
Lorem ipsum
http://some.com/web-page

Lorem ipsum
"""

{{ text|linebreaksbr|urlize }}

The output will include an additional, escaped >

linebreaksbr appends a <br /><br /> directly to the url, which then appears to get wrongly escaped in the subsequent urlize filter.

Change History (5)

comment:1 Changed 3 years ago by anonymous

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset

I forgot to mention: Changing the filter order to {{ text|urlize|linebreaksbr }} works nicely.

comment:2 Changed 3 years ago by nmfm

  • Triage Stage changed from Unreviewed to Accepted

Using the trunk I get the following output:

Lorem ipsum  <a href="http://some.com/web-page%3Cbr" rel="nofollow">http://some.com/web-page<br</a> /><br />Lorem ipsum

Be sure to leave no extra spaces in text after the URL and before the linebreak.

comment:3 Changed 2 years ago by Vladimir.Filonov

  • Owner changed from nobody to Vladimir.Filonov
  • Status changed from new to assigned

comment:4 Changed 2 years ago by Vladimir.Filonov

The problem occurs because urlize finds links by splitting the text into words. For the split it uses re.compile(r'(\s+)'), which matches any run of whitespace characters.

So when the first filter is linebreaksbr and the second is urlize, the steps are as follows (splits inside "Lorem ipsum" are omitted for brevity):

# String is "Lorem ipsum\nhttp://some.com/web-page\nLorem ipsum"

  1. String is converted to "Lorem ipsum<br />http://some.com/web-page<br />Lorem ipsum"
  2. String is split into chunks ['Lorem ipsum<br', '/>http://some.com/web-page<br', '/>Lorem ipsum']
  3. Nothing to urlize, because '/>http://some.com/web-page<br' doesn't look like a link.

# String is "Lorem ipsum\n http://some.com/web-page\nLorem ipsum"

  1. String is converted to "Lorem ipsum<br /> http://some.com/web-page<br />Lorem ipsum"
  2. String is split into chunks ['Lorem ipsum<br', '/>', 'http://some.com/web-page<br', '/>Lorem ipsum']
  3. Link is 'http://some.com/web-page<br'

# String is "Lorem ipsum\n http://some.com/web-page \nLorem ipsum"

  1. String is converted to "Lorem ipsum<br /> http://some.com/web-page <br />Lorem ipsum"
  2. String is split into chunks ['Lorem ipsum<br', '/>', 'http://some.com/web-page', '<br', '/>Lorem ipsum']
  3. Link is 'http://some.com/web-page'

So the URL needs whitespace on both sides for urlize to handle it correctly.
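
The three cases above can be reproduced outside Django with a short sketch. It assumes linebreaksbr does nothing more than swap newlines for <br /> (fake_linebreaksbr and chunks are hypothetical helpers, not Django functions) and reuses the whitespace pattern quoted above. Unlike the simplified lists, it also splits inside "Lorem ipsum".

```python
import re

# The whitespace pattern quoted in this comment.
word_split_re = re.compile(r'(\s+)')


def fake_linebreaksbr(text):
    # Hypothetical stand-in for linebreaksbr: only swaps newlines for <br />.
    return text.replace('\n', '<br />')


def chunks(text):
    # The capturing group keeps the whitespace separators; drop them here.
    converted = fake_linebreaksbr(text)
    return [w for w in word_split_re.split(converted) if w.strip()]


no_spaces = "Lorem ipsum\nhttp://some.com/web-page\nLorem ipsum"
space_before = "Lorem ipsum\n http://some.com/web-page\nLorem ipsum"
space_both = "Lorem ipsum\n http://some.com/web-page \nLorem ipsum"

print(chunks(no_spaces))
# ['Lorem', 'ipsum<br', '/>http://some.com/web-page<br', '/>Lorem', 'ipsum']
print(chunks(space_before))
# ['Lorem', 'ipsum<br', '/>', 'http://some.com/web-page<br', '/>Lorem', 'ipsum']
print(chunks(space_both))
# ['Lorem', 'ipsum<br', '/>', 'http://some.com/web-page', '<br', '/>Lorem', 'ipsum']
```

Only the last case yields the bare URL as its own chunk.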

I see two possible fixes, but neither seems clean to me.

The first is to add spaces around the <br /> tags in the linebreaksbr filter, but that would change a lot of existing output.
The second is to change the regular expression used for splitting. In fact, we could split the string not only on whitespace but on all non-URL characters, according to RFC 1738.
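
A rough sketch of the second idea, for illustration only. The pattern below is an assumption, not anything in Django: it splits on whitespace plus <, > and ", three of the characters RFC 1738 lists as unsafe. unsafe_split_re and words are hypothetical names.

```python
import re

# Hypothetical splitter (not Django's pattern): whitespace plus <, > and ".
unsafe_split_re = re.compile(r'([\s<>"]+)')


def words(text):
    # Drop empty strings and the separator runs kept by the capturing group.
    return [w for w in unsafe_split_re.split(text)
            if w and not unsafe_split_re.fullmatch(w)]


# Output of linebreaksbr for the case with no spaces around the URL.
converted = 'Lorem ipsum<br />http://some.com/web-page<br />Lorem ipsum'

print(words(converted))
# ['Lorem', 'ipsum', 'br', '/', 'http://some.com/web-page', 'br', '/', 'Lorem', 'ipsum']
```

With this split the URL is isolated even without surrounding spaces, though the markup is broken into fragments as well.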

Any ideas?

comment:5 Changed 2 years ago by aaugustin

  • Resolution set to wontfix
  • Status changed from assigned to closed

It looks like both filters work normally. urlize is designed to operate on plain text, not on HTML.

As pointed out in comment 1 the reverse order works: {{ foo|urlize|linebreaksbr }}.

Splitting on non-URLy characters is likely to cause more problems: for instance, it would URLize links within an <a href="..."> tag, which is probably not desirable.
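
To illustrate the concern, here is a small sketch with a hypothetical splitter (not Django code) that breaks on whitespace plus <, > and ". The href value of an existing link comes out as a standalone word, so a word-based linker would see it as a fresh URL to wrap.

```python
import re

# Hypothetical splitter (not Django's pattern): whitespace plus <, > and ".
non_url_split_re = re.compile(r'([\s<>"]+)')

already_linked = '<a href="http://some.com/web-page">some page</a>'

# Keep only the non-separator pieces.
pieces = [p for p in non_url_split_re.split(already_linked)
          if p and not non_url_split_re.fullmatch(p)]

print(pieces)
# ['a', 'href=', 'http://some.com/web-page', 'some', 'page', '/a']
```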
