Opened 9 years ago

Closed 9 years ago

#24013 closed Bug (fixed)

reverse() escapes unreserved characters supplied by prefix

Reported by: Thom Gerdes Owned by: Bas Peschier
Component: Core (URLs) Version: 1.6
Severity: Normal Keywords: ams2015
Cc: Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

This is very similar to ticket #22223. The escaping of unreserved characters still occurs if the unreserved characters are supplied by the prefix.

For example:

In [3]: set_script_prefix("http://www.example.com/")

In [4]: reverse('test', args=['foo:bar'])
Out[4]: u'http%3A//www.example.com/foo:bar'

I believe the same changes to the urlquote call for normalizing the prefix applies the same as the fix in https://github.com/django/django/commit/e167e96cfea670422ca75d0b35fe7c4195f25b63

This is also a regression in 1.6.

Attachments (1)

24013-test.diff (923 bytes ) - added by Tim Graham 9 years ago.

Download all attachments as: .zip

Change History (8)

comment:1 by Tim Graham, 9 years ago

I haven't seen the need to set script prefix as an absolute URI (or documentation that says doing so is valid). Could you confirm in what cases this happens? Thanks!

comment:2 by Thom Gerdes, 9 years ago

Actually that use case is probably an abuse of the [undocumented] set_script_prefix method, fostered by years of community advice (ex, https://djangosnippets.org/snippets/1713/ which you should really use request.build_absolute_uri() for instead).

To attempt to justify myself, the context where we use set_script_prefix() is outside of the wsgi process, for example in celery tasks or in management commands where we still want to generate absolute URLs.

Let me amend it to a use case that does matter and requires a fix here: if your script prefix has this class of character in it. For example a apache/nginx proxying the location "/script:name/" to the django project.

And to refer to some django docs: you can force this with FORCE_SCRIPT_NAME.

So if I do:

# settings.py
FORCE_SCRIPT_NAME="/script:name/"

# views.py
def test_view(request, arg):
    # This view is registered with
    #url(r'^(.*)$', 'test_app.views.test_view', name='test')
    return  HttpResponse(reverse("test", args=["foo:bar"]))

The output is

/script%3Aname/foo:bar

expected is

/script:name/foo:bar

by Tim Graham, 9 years ago

Attachment: 24013-test.diff added

comment:3 by Tim Graham, 9 years ago

Triage Stage: UnreviewedAccepted

Makes sense. A regression test that currently fails is attached. I couldn't confirm it's a regression in 1.6 since the test fails with this error on 1.5:

======================================================================
ERROR: test_script_name_escaping (regressiontests.urlpatterns_reverse.tests.ReverseShortcutTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/tim/code/django/tests/regressiontests/urlpatterns_reverse/tests.py", line 277, in test_script_name_escaping
    reverse('optional', args=['foo:bar']),
  File "/home/tim/code/django/django/core/urlresolvers.py", line 522, in reverse
    return iri_to_uri(resolver._reverse_with_prefix(view, prefix, *args, **kwargs))
  File "/home/tim/code/django/django/core/urlresolvers.py", line 415, in _reverse_with_prefix
    candidate = (prefix_norm + result) % dict(zip(prefix_args + params, unicode_args))
ValueError: unsupported format character 'A' (0x41) at index 9

But maybe that's just an effect of the particular test. We can likely backport to 1.7 if we have a patch to fix the issue (1.6 is only receiving security fixes at this time).

comment:4 by Bas Peschier, 9 years ago

Keywords: ams2015 added
Owner: changed from nobody to Bas Peschier
Status: newassigned

comment:5 by Bas Peschier, 9 years ago

So the problem is very similar, as mentioned, to #22223 and the solution is also very similar: treat the prefix as another path segment and use the same safe-argument when calling urlquote 1]. This makes the test for #20022 fail, but that test is incorrect and can be easily fixed (it should test escaping of %, not ~, which is an unreserved char)

There is another problem however: all url patterns should be valid regex, but the script prefix does not have to be (or does it?). When reverse tries to use the prefix, it assumes it needs to normalize it and thus that it is a valid regex. This breaks normalize when you add a singular ) which is a valid path part, but not valid regex.

The prefix-argument is undocumented and defaults to the script prefix. So we can either assume all prefixes can be treated as plain strings, or we can regex-escape get_script_prefix when reverse populates the prefix 2].

1]: https://github.com/django/django/blob/master/django/core/urlresolvers.py#L448
2]: https://github.com/django/django/blob/master/django/core/urlresolvers.py#L532

comment:6 by Tim Graham, 9 years ago

Has patch: set
Triage Stage: AcceptedReady for checkin

PR from bpeschier looks good.

comment:7 by Tim Graham <timograham@…>, 9 years ago

Resolution: fixed
Status: assignedclosed

In c9f1a129:

Fixed #24013 -- Fixed escaping of reverse() prefix.

Prefix was treated as a part of the url pattern, which it is not.
Improved tests to conform with RFC 3986 which allows certain
characters in path segments without being escaped.

Note: See TracTickets for help on using tickets.
Back to Top