#36586 closed Bug (invalid)

Escaping (ampersand) in browsable API URLs

Reported by:	J M	Owned by:
Component:	Template system	Version:	5.2
Severity:	Normal	Keywords:	urlize
Cc:		Triage Stage:	Unreviewed
Has patch:	no	Needs documentation:	no
Needs tests:	no	Patch needs improvement:	no
Easy pickings:	no	UI/UX:	no

Description

When a URL containing an escapable character (in my case, an ampersand) is rendered in the browsable API, it is improperly unescaped in the href. This may only apply to ampersands.

>>> from django.utils.html import urlize
>>> urlize('"tq": "http://api/foos/1/?p=1&times=1"')
'"tq": "<a href="http://api/foos/1/?p=1%C3%97%3D1">http://api/foos/1/?p=1&times=1</a>"'

Change History (3)

comment:1 by Natalia Bidart, 4 months ago

Component:	Uncategorized → Template system
Keywords:	urlize added
Resolution:	→ invalid
Status:	new → closed
Type:	Uncategorized → New feature

Hello J M, thank you for your ticket.

First of all, can you please clarify what you mean by "browsable API"? This sounds like the django-rest-framework feature. Please note that this tracker is for Django core issues.

Secondly, regarding the urlize example you shared, the behavior occurs specifically when the URL contains &times (the HTML entity for ×), rather than any arbitrary ampersand. This happens because urlize is designed to produce HTML-safe links, which may involve encoding characters in the URL to ensure valid HTML. Its purpose is linkification of text for safe display, not exact preservation of the raw URL string.
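To illustrate which ampersands are affected, here is a minimal sketch using only the Python standard library. It relies on the documented fact that html.entities.html5 lists some entity names both with and without the trailing semicolon (for example "times;" and "times"). The helper name risky_params and the constant LEGACY_NAMES are hypothetical, introduced only for this illustration, and the check is an approximation of what html.unescape does, not a copy of Django's logic.

```python
from html.entities import html5
from urllib.parse import parse_qsl, urlsplit

# Entity names that the html5 table also accepts without a trailing ";",
# longest first so the longest matching name wins.
LEGACY_NAMES = sorted(
    (name for name in html5 if not name.endswith(";")),
    key=len,
    reverse=True,
)


def risky_params(url):
    """Return (parameter name, entity name) pairs for query parameters
    whose name starts with a semicolon-less entity name.

    Hypothetical helper: it only approximates the prefix matching that
    html.unescape applies to text following an ampersand.
    """
    params = parse_qsl(urlsplit(url).query, keep_blank_values=True)
    hits = []
    # The first parameter follows "?", not "&", so it is skipped.
    for name, _value in params[1:]:
        for entity in LEGACY_NAMES:
            if name.startswith(entity):
                hits.append((name, entity))
                break
    return hits


print(risky_params("http://example.com/foos/?page=2&timestamp=1"))
print(risky_params("http://example.com/foos/?page=2&offset=1"))
```

Under these assumptions, a parameter such as timestamp is flagged because it begins with "times", while offset is not flagged.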

You can see the tests for this filter to better understand its scope and semantics: https://github.com/django/django/blob/main/tests/template_tests/filter_tests/test_urlize.py

Lastly, there are several user support channels available if you have further questions about how Django works: please refer to TicketClosingReasons/UseSupportChannels for ways to get help.

comment:2 by Natalia Bidart, 4 months ago

Type:	New feature → Bug

comment:3 by Bruno Alla, 3 months ago

To whoever finds this ticket...

I think the problem wasn't reported in the best way by OP. The issue was indeed caught in the browsable API in DRF, and we managed to isolate the problem with the following snippet:

>>> from django.utils.html import urlize
>>> urlize('http://example.com/foos/?page=2&timestamp=1')
'<a href="http://example.com/foos/?page=2%C3%97tamp%3D1">http://example.com/foos/?page=2&timestamp=1</a>'

The problem manifests as &timestamp=1 being translated to %C3%97tamp%3D1. I didn't see the string &times; in that, so I suspected a bug, potentially inherited from Python. Looking more closely, the Django implementation indeed relies heavily on the Python API html.unescape, which has the same behaviour:

>>> import html
>>> html.unescape('https://example.com/?page=1&timestamp=3')
'https://example.com/?page=1×tamp=3'

Searching the CPython issue tracker brought up this issue, https://github.com/python/cpython/issues/85050, which says:

According to https://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references#cite_ref-semicolon_1-64 the trailing semicolon can be omitted for the named entity "reg". That means "&reg" and "&reg;" are equivalent.

So this is working as per the spec.
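For anyone who wants to reproduce this without Django, the following sketch uses only the standard library html module. It shows the semicolon-less entity being recognised, an unrelated parameter name being left alone, and that an ampersand written as &amp; survives html.unescape unchanged. Whether pre-escaping is appropriate for a given caller of urlize is an assumption here, not something this ticket establishes.

```python
import html

url = "https://example.com/?page=1&timestamp=3"

# "&reg" and "&reg;" decode to the same character (U+00AE).
same_entity = html.unescape("&reg") == html.unescape("&reg;") == "\u00ae"

# "&times" is recognised even though more letters follow it.
unescaped = html.unescape(url)

# A parameter name that does not start with such an entity is untouched.
untouched = html.unescape("https://example.com/?page=1&offset=3")

# Writing the ampersand as "&amp;" first makes the round trip lossless.
escaped = html.escape(url)
round_trip = html.unescape(escaped)

print(same_entity)
print(unescaped)
print(untouched)
print(escaped)
print(round_trip == url)
```

The second printed line reproduces the "×tamp" result shown above, and the last line confirms that the escaped form decodes back to the original URL.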
