#7355 closed (fixed)
Urlize function in django.utils.html does not properly work on https:// links
Reported by: | clint | Owned by: | nobody |
---|---|---|---|
Component: | Core (Other) | Version: | dev |
Severity: | Keywords: | ||
Cc: | Triage Stage: | Ready for checkin | |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | UI/UX: |
Description (last modified by )
Just tested this out on the version of utils/html.py in [7569]
urlize() incorrectly adds a http:// to the beginning of a https link
To replicate:
>>> from django.utils.html import urlize >>> words = "Hi there https://www.google.com" >>> urlize(words) u'Hi there <a href="http://https://www.google.com">http://https://www.google.com</a>'
I did a search in trac for "urlize https" and came up with no hits. As far as I can tell this bug has been around for a while, not sure why no one would've caught this.
Patch: =================================================================== --- html.py (revision 7569) +++ html.py (working copy) @@ -99,7 +99,7 @@ lead, middle, trail = match.groups() if safe_input: middle = mark_safe(middle) - if middle.startswith('www.') or ('@' not in middle and not middle.startswith('http://') and \ + if middle.startswith('www.') or ('@' not in middle and not (middle.startswith('http://') or middle.startswith('https://')) and \ len(middle) > 0 and middle[0] in string.ascii_letters + string.digits and \ (middle.endswith('.org') or middle.endswith('.net') or middle.endswith('.com'))): middle = 'http://%s' % middle
And a test to check for this
Index: tests.py =================================================================== --- tests.py (revision 7569) +++ tests.py (working copy) @@ -166,6 +166,11 @@ >>> urlizetrunc(uri, 2) u'<a href="http://31characteruri.com/test/" rel="nofollow">...</a>' +# Check normal urlize +>>> url = 'https://google.com' +>>> urlize(url) +u'<a href="https://google.com" rel="nofollow">https://google.com</a>' + >>> wordcount('') 0
Attachments (3)
Change History (14)
comment:1 by , 16 years ago
Description: | modified (diff) |
---|
comment:2 by , 16 years ago
although it only seems to do this if you leave off the trailing slash
>>> urlize("https://www.google.com/") u'<a href="https://www.google.com/" rel="nofollow">https://www.google.com/</a>' >>> urlize("https://www.google.com") u'<a href="http://https://www.google.com" rel="nofollow">http://https://www.google.com</a>'
comment:3 by , 16 years ago
Could this mean that the real root of the issue is that the regexp that splits urls into lead,middle,and trailing doesn't work right if there's no trailing slash?
comment:4 by , 16 years ago
Actually it looks like if you have a trailing slash on your url, then:
middle = 'https://google.com/'
and the check done on line 104-106 fails:
if middle.startswith('www.') or ('@' not in middle and not middle.startswith('http://') and \ len(middle) > 0 and middle[0] in string.ascii_letters + string.digits and \ (middle.endswith('.org') or middle.endswith('.net') or middle.endswith('.com'))):
And it then matches the next set of conditions on line 108 and is handled properly:
if middle.startswith('http://') or middle.startswith('https://'):
However if you URL has no slash at the end of the URL:
middle = 'https://google.com'
Then you do not fail the first chunk (which it should, but does not because it only checks for http://), and an http:// is appended to the front of the valid https:// URL. So it would appear that my patch looks like it probably is the correct solution.
comment:5 by , 16 years ago
Triage Stage: | Unreviewed → Accepted |
---|
Clint, this looks good - thanks for tracking it down. Can you attach the patches as diffs (svn diff) and promote to ready for checkin?
comment:6 by , 16 years ago
Triage Stage: | Accepted → Ready for checkin |
---|
I've added them and I'm moving this to rdy for checkin
by , 16 years ago
Attachment: | urlize-https-fix-tests.diff added |
---|
comment:8 by , 16 years ago
Doh. Sorry about that. I cat'd them together. I hope that's similar to the format you need. (I don't have the tests fix deployed on the version of Django I'd updated, so I can't just pop out a single svn diff right now).
comment:9 by , 16 years ago
milestone: | → 1.0 |
---|
Thanks that makes things easier for commiters later as they only have to review and apply the latest patch in the ticket ;)
Marking 1.0 as this is a bug.
comment:10 by , 16 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
Uh.. sorry for the horrible formatting...