Opened 4 years ago

Closed 4 years ago

Last modified 9 months ago

#15718 closed Bug (wontfix)

Django unquotes URLs and is not able to distinguish %2F from /

Reported by: fed239 Owned by: nobody
Component: Core (Other) Version: 1.2
Severity: Normal Keywords: urls, url resolver, unquote, %2F
Cc: Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

I've found that in basehttp.py there is a line

env['PATH_INFO'] = urllib.unquote(path)

It replaces all URL-escaped symbols with the original symbols. As a result, you cannot properly handle URLs containing quoted symbols in your urls.py. For example, the URL http://example.com/blah%2Fblah%2Fblah/ will be matched by the regexp /(\w+)/(\w+)/(\w+)/$
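
The effect described above can be reproduced outside Django. The following is only a minimal sketch: it assumes Python 3, where `urllib.unquote` lives at `urllib.parse.unquote`, and the variable names are made up for illustration.

```python
import re
from urllib.parse import unquote

# The path exactly as the client sent it, and the same path after unquoting
raw_path = "/blah%2Fblah%2Fblah/"
decoded_path = unquote(raw_path)

# The regexp quoted in the description
pattern = re.compile(r"/(\w+)/(\w+)/(\w+)/$")

# The decoded path has three real slashes, so it matches the three-group pattern
match_decoded = pattern.match(decoded_path)

# The raw path has no slash after "blah", so the same pattern does not match
match_raw = pattern.match(raw_path)

print(decoded_path)
print(match_decoded.groups() if match_decoded else None)
print(match_raw)
```

Once the path has been unquoted, nothing in the string tells the resolver which slashes were originally encoded.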

Under Apache with mod_wsgi this seems to lead to an even more interesting problem. When %2F is present in the URL, the request is not handled by Django and the user gets a 404 error directly from Apache. Try http://www.djangoproject.com/%2F

Change History (9)

comment:1 Changed 4 years ago by fed239

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset

After investigation I've found that the second issue (404 error directly from Apache) is not related to Django and can be avoided by adding "AllowEncodedSlashes On" to the Apache config. Unfortunately Apache then replaces %2F with / itself, so the behavior is exactly the same as in the simple HTTP server provided by Django. In Apache 2.2.18 (which is not released yet, I guess), AllowEncodedSlashes accepts the value NoDecode. With NoDecode, such URLs are accepted and encoded slashes are left in their encoded state instead of being decoded. Meanwhile I'm using the workaround

        request_uri = force_unicode(environ.get('REQUEST_URI', u'/'))
        if u'?' in request_uri:
            path_info, query = request_uri.split('?', 1)
        else:
            path_info, query = request_uri, ''

instead of original

        path_info = force_unicode(environ.get('PATH_INFO', u'/'))

in core/handlers/wsgi.py
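
For readers who want to try the idea outside Django's handler, here is a self-contained sketch of the same split. It assumes Python 3 and a server that sets `REQUEST_URI` in the environ; that key is not part of the WSGI spec, so it may be missing. The function name `split_request_uri` is hypothetical and is not part of Django.

```python
def split_request_uri(environ):
    """Return (path, query) taken from the raw, still-quoted REQUEST_URI.

    Hypothetical helper; REQUEST_URI is server-specific and may be absent,
    in which case "/" is used, as in the workaround above.
    """
    request_uri = environ.get("REQUEST_URI", "/")
    # partition() splits on the first "?" and yields an empty query if none exists
    path_info, _, query = request_uri.partition("?")
    return path_info, query


# Encoded slashes survive because no unquoting takes place
print(split_request_uri({"REQUEST_URI": "/A%2FB/C/?page=2"}))
```

Like the workaround it mirrors, this sketch ignores `SCRIPT_NAME`, so it only gives sensible results when the application is mounted at the site root.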

comment:2 Changed 4 years ago by lukeplant

  • Type set to Bug

comment:3 Changed 4 years ago by lukeplant

  • Severity set to Normal

comment:4 Changed 4 years ago by jacob

  • Resolution set to wontfix
  • Status changed from new to closed

I'm fairly sure that this is present in the dev server specifically because it mimics Apache's behavior -- as you've discovered. Changing this would mean that the dev server would behave differently than production servers.

In fact, poking further, this is more or less enshrined by the WSGI spec -- it's expected that you'll need to re-quote the path if you need to construct the originally given URL.

Further, this would be a devastatingly difficult-to-debug backwards-incompatible change.

Given all that, I'm marking this wontfix: there's simply no real upside to making this change.
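
The re-quoting mentioned in this comment can be sketched as follows. This is only an illustration, assuming Python 3's `urllib.parse.quote` with its default `safe="/"`; the sample path is made up.

```python
from urllib.parse import quote, unquote

# Path as sent by the client: one encoded slash and one encoded space
original = "/A%2FB/some%20page/"

# What the application sees in PATH_INFO after the server has unquoted it
path_info = unquote(original)

# Re-quoting restores the space, but "/" is left alone by default,
# so the encoded slash cannot be told apart from a real one
requoted = quote(path_info)

print(path_info)
print(requoted)
```

Re-quoting is enough to produce a valid URL again, but it does not bring back the distinction between %2F and / that this ticket is about.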

comment:5 Changed 4 years ago by anonymous

  • Easy pickings unset

I don't agree that there is no upside. Currently the URL http://example.com/A%2fB/C/ will match the pattern ^([^/]+)/([^/]+)/([^/]+)/$ instead of the expected ^([^/]+)/([^/]+)/$

This restricts usage of URL patterns.

comment:6 follow-ups: ↓ 7 ↓ 9 Changed 9 months ago by gst

  • UI/UX unset

I've run into the exact same issue :/

The main problem I see is that, as far as I understand, Django compares the URL in its url-decoded form against each possible regex pattern. Hence the problems we are encountering with an encoded '/' (%2F) in the URL. I could be wrong though, because I haven't checked the Django code.
If I'm not wrong about this:

Wouldn't it be possible to tell Django to compare some URL regex patterns against the original URL in its non-decoded form?

regards,

gst.

comment:7 in reply to: ↑ 6 Changed 9 months ago by gst

Replying to gst:

Wouldn't it be possible to tell Django to compare some URL regex patterns against the original URL in its non-decoded form?

That would be a feature request. What if I tried to write a patch for it? Would it have a chance of at least being reviewed?

comment:8 Changed 9 months ago by claudep

Any patch with tests is worth a review. But of course, we cannot promise it will be accepted.

comment:9 in reply to: ↑ 6 Changed 9 months ago by gst

Replying to gst:

I've run into the exact same issue :/

The main problem I see is that, as far as I understand, Django compares the URL in its url-decoded form against each possible regex pattern. Hence the problems we are encountering with an encoded '/' (%2F) in the URL. I could be wrong though, because I haven't checked the Django code.

The other possible workaround is to URL-encode twice the parts of the URL you want to reach, so that '/' still appears as '%2F' when the URL is matched against the regex patterns, and then to decode them once in the view.
Though it seems rather special.
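
The double-encoding workaround can be sketched like this. It is an illustration only, assuming Python 3; `build_path` is a hypothetical helper, and the leading slash in the regex is there because the sketch matches the whole path rather than going through Django's resolver.

```python
import re
from urllib.parse import quote, unquote


def build_path(*segments):
    """Hypothetical helper: quote every segment twice, with no safe characters."""
    twice = (quote(quote(segment, safe=""), safe="") for segment in segments)
    return "/" + "/".join(twice) + "/"


# Link generation: the "/" inside the first segment becomes %252F
url_path = build_path("A/B", "C")

# The server unquotes once, which leaves %2F in the path that gets matched
path_info = unquote(url_path)

# The two-segment pattern from comment:5 now matches as intended
match = re.match(r"^/([^/]+)/([^/]+)/$", path_info)

# The view decodes each captured group once more to get the real values
values = [unquote(group) for group in match.groups()]

print(url_path)
print(path_info)
print(values)
```

The cost is that every place that builds such links has to encode twice, and every view that receives them has to decode once more.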
