Opened 11 years ago

Closed 11 years ago

Last modified 11 years ago

#20364 closed Bug (fixed)

urlize incorrectly includes quotation marks in links

Reported by: EmilStenstrom Owned by: anonymous
Component: Template system Version: dev
Severity: Normal Keywords: urlize, quotes
Cc: piwonski.kris@… Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: yes UI/UX: no

Description

>>> urlize("some text here \"hi@example.com\" and some text afterwards")
u'some text here <a href="mailto:"hi@example.com"">"hi@example.com"</a> and some text afterwards'

Note the extra quotes inside the links.

Luckily, this is solved quite easily, just do this:

>>> from django.utils import html
>>> html.TRAILING_PUNCTUATION += ['"', '\'']
>>> html.WRAPPING_PUNCTUATION += [('"', '"'), ('\'', '\'')]
>>> urlize("some text here \"hi@example.com\" and some text afterwards")
u'some text here "<a href="mailto:hi@example.com">hi@example.com</a>" and some text afterwards'

I suggest quotation marks are included in the *_PUNCTUATION variables in django.utils.html.

Attachments (1)

20364.diff (1.3 KB ) - added by Chris Piwoński <piwonski.kris@…> 11 years ago.
ticket patch

Download all attachments as: .zip

Change History (18)

comment:1 by EmilStenstrom, 11 years ago

I don't know if this matters, but django-rest-framework has both of the changes above added in their version of urlize: https://github.com/glic3rinu/django-rest-framework/commit/2c0363ddaec22ac54385f7e0c2e1401ed3ff0879

comment:2 by EmilStenstrom, 11 years ago

Similar ticket, that has to do with JSON somehow... https://code.djangoproject.com/ticket/17576

comment:3 by wim@…, 11 years ago

Triage Stage: UnreviewedAccepted

Hi Emil,

Thanks for reporting! Clear that we get erroneous output here. You are welcome to write a patch, if you want to.

by Chris Piwoński <piwonski.kris@…>, 11 years ago

Attachment: 20364.diff added

ticket patch

comment:4 by Chris Piwoński <piwonski.kris@…>, 11 years ago

Cc: piwonski.kris@… added
Has patch: set
Owner: changed from nobody to anonymous
Status: newassigned
Triage Stage: AcceptedReady for checkin

comment:5 by EmilStenstrom, 11 years ago

@Chris Piwoński: Thanks for a patch!

Can i suggest you add a few more tests? The first only works with the TRAILING_PUNCTUATION fix. The second seems to work with the WRAPPING_PUNCTUATION fix.

>>> html.urlize("some text here hi@example.com\" and some text afterwards")
u'some text here <a href="mailto:hi@example.com"">hi@example.com"</a> and some text afterwards'
>>> html.urlize("some text here \"hi@example.com and some text afterwards")
u'some text here <a href="mailto:"hi@example.com">"hi@example.com</a> and some text afterwards'

And maybe some tests for the single quote case too?

Version 0, edited 11 years ago by EmilStenstrom (next)

comment:6 by Chris Piwoński <piwonski.kris@…>, 11 years ago

Made tests for all six cases.

Changes are on github: https://github.com/kriestof/django/tree/ticket_20364

comment:7 by Claude Paroz, 11 years ago

Triage Stage: Ready for checkinAccepted

Sorry, but the RFC flag should never be set by the patch author.

comment:8 by EmilStenstrom, 11 years ago

Triage Stage: AcceptedReady for checkin

This looks great Chris. The new tests pass, and the code looks good.

comment:9 by Aymeric Augustin, 11 years ago

If you read #17576, which is linked from above, it provides you with a failing test case: adding a comma after the quote.

For instance I'm almost sure this doesn't work with the current patch.

urlize('Email us at "hi@example.com", or phone us at +xx.yy').

The problem here is that urlize cannot deal with multiple punctuation, and quotes are often followed by other punctuation. I'd rather not merge a half-broken feature, and in my humble opinion this ticket should be closed wontfix like #17576.

That said, urlize seems to attract tons of contributions, so maybe it's a very important part of Django and I should stop wontfixing such tickets... I'll let another committer make the decision!

Last edited 11 years ago by Aymeric Augustin (previous) (diff)

comment:10 by Krzysztof Jurewicz, 11 years ago

In case this ticket isn’t marked as wontfix, shouldn’t we also support localized quotation marks like „” and «»? There is a list of them available on Wikipedia.

comment:11 by Anssi Kääriäinen, 11 years ago

Triage Stage: Ready for checkinAccepted

I am not that good with HTML stuff, so I am not going to make wontfix call... But in any case this isn't Ready for checkin...

comment:12 by GilesGreenway, 11 years ago

aaugustin's example is in fact passed with flying colours after applying the patch:

>>> urlize('Email us at "hi@example.com", or phone us at +xx.yy')
u'Email us at "<a href="mailto:hi@example.com">hi@example.com</a>", or phone us at +xx.yy'
>>> 
>>> urlize('Visit us at "http://www.example.com", or phone us at +xx.yy')
u'Visit us at "<a href="http://www.example.com">http://www.example.com</a>", or phone us at +xx.yy'

I'll see if I can get a test written. (I'm a newbie contributor at the PyConUK Django sprint.)

comment:13 by GilesGreenway, 11 years ago

Test pushed, pull request made:

https://github.com/django/django/pull/1662

comment:14 by Daniele Procida, 11 years ago

Triage Stage: AcceptedReady for checkin

I've tested the patch at https://github.com/django/django/pull/1662 locally and it looks good to me. Commit/merge message needs to thank EmilStenstrom for the report and Chris Piwoński for the first patch, but other than that I think it's ready for checkin.

comment:15 by GilesGreenway, 11 years ago

Done. 'Hope this is suitably congratulatory:

https://github.com/django/django/pull/1662/commits

comment:16 by Florian Apolloner <florian@…>, 11 years ago

Resolution: fixed
Status: assignedclosed

In 6c06adad1dc45631c1c220d7f8fb531a9cf3ed55:

Fixed #20364 -- Changed urlize regexes to include quotation marks as punctation.

Thanks to EmilStenstrom for raising this, and to Chris Piwoński for all of the fixes and most of the tests.

comment:17 by EmilStenstrom, 11 years ago

I think far to few Ticket threads end with the reporter wrapping up the thread, so here goes:

YOU ARE AWESOME :D

Note: See TracTickets for help on using tickets.
Back to Top