Opened 20 months ago

Last modified 3 months ago

#28936 assigned Bug

simplify_regex should remove redundant escape sequences outside groups

Reported by: Cristi Vîjdea
Owned by: Oliver Cleary
Component: contrib.admindocs
Version: 2.0
Severity: Normal
Keywords: simplify_regex path
Cc: ChillarAnand
Triage Stage: Accepted
Has patch: no
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: yes

Description (last modified by Cristi Vîjdea)

django.contrib.admindocs.views.simplify_regex should clean up escapes found outside path parameters. Otherwise, broken URLs containing backslashes can be generated and displayed.

This is readily apparent with Django 2's path(), which aggressively escapes everything outside a <parameter> specifier, resulting in a urlpattern with backslash-escaped forward slashes:

>>> simplify_regex(r"^(?P<sport_slug>\w+)/athletes/(?P<athlete_slug>\w+)/$")
'/<sport_slug>/athletes/<athlete_slug>/'
>>> simplify_regex(r"^(?P<sport_slug>\w+)\/athletes\/(?P<athlete_slug>\w+)\/$")
'/<sport_slug>\\/athletes\\/<athlete_slug>\\/'

The second example is what path() would generate in urlpatterns.

You can, for example, see this issue affecting django-rest-framework here.

Change History (4)

comment:1 Changed 20 months ago by Cristi Vîjdea

Description: modified

comment:2 Changed 20 months ago by ChillarAnand

Cc: ChillarAnand added
Triage Stage: Unreviewed → Accepted

comment:3 Changed 3 months ago by Oliver Cleary

Owner: changed from nobody to Oliver Cleary
Status: new → assigned

comment:4 Changed 3 months ago by Oliver Cleary

I have a PR for this ticket; however, I am not sure it really needs to be fixed.

The referenced DRF ticket was resolved by changing how simplify_regex is called so that it matches Django's own usage, which is to pass in the path's route string directly rather than its generated regex pattern.

Additionally, the given example does not seem to be correct, as path() does not escape forward slashes.

>>> path('<slug:sports_slug>/athletes/<slug:athletes_slug>/', lambda: None).pattern.regex.pattern
'^(?P<sports_slug>[-a-zA-Z0-9_]+)/athletes/(?P<athletes_slug>[-a-zA-Z0-9_]+)/$'

Testing the example from the DRF ticket, however, does exhibit the issue:

>>> path('^api/token-auth/', lambda: None).pattern.regex.pattern
'^\\^api/token\\-auth/$'
>>> simplify_regex(r'^\^api/token\-auth/$')
'/\\api/token\\-auth/'
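
The escaping in the transcript above is consistent with the literal part of the route being passed through Python's re.escape. That is an assumption about how path() builds its regex, not something stated in this ticket. A minimal check, assuming Python 3.7 or later (where re.escape only escapes characters that can be special in a regex):

```python
import re

# Assumption: path() escapes the literal parts of a route with re.escape.
# On Python 3.7+, '/' is left alone, while '^' and '-' are still escaped.
route = '^api/token-auth/'
escaped = re.escape(route)
print(escaped)  # \^api/token\-auth/
```

Older Python versions escaped every non-alphanumeric character, which would also produce the backslash-escaped forward slashes shown in the description.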

With the fix in the PR, the special characters are unescaped, and the ^, ? and $ characters are stripped only if they are not escaped.
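
The behaviour described above can be sketched roughly as follows. This is not the code from the PR: strip_and_unescape is a hypothetical helper, and it assumes the named groups have already been replaced by <name> placeholders, so every remaining backslash is a plain escape outside a group.

```python
import re

# Hypothetical helper, not taken from the PR or from Django itself.
# One pass over the pattern:
#   - a backslash followed by any character is replaced by that character;
#   - a bare ^, $ or ? (one not consumed by the first alternative) is dropped.
_ESCAPED_OR_ANCHOR = re.compile(r'\\(.)|[\^$?]')


def strip_and_unescape(pattern: str) -> str:
    # group(1) is the escaped character, or None for a bare ^, $ or ?.
    return _ESCAPED_OR_ANCHOR.sub(lambda m: m.group(1) or '', pattern)


result = strip_and_unescape(r'^\^api/token\-auth/$')
print(result)  # ^api/token-auth/
```

Handling both cases in a single alternation means an escaped ^ is consumed together with its backslash and is never mistaken for an anchor.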

PR

Last edited 3 months ago by Oliver Cleary