Opened 16 years ago
Closed 14 years ago
#11903 closed Bug (invalid)
WSGIRequest.path not quoted properly
| Reported by: | ianb | Owned by: | Fabián Ezequiel Gallina |
|---|---|---|---|
| Component: | HTTP handling | Version: | 1.1 |
| Severity: | Normal | Keywords: | |
| Cc: | ianb@… | Triage Stage: | Design decision needed |
| Has patch: | yes | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description
WSGIRequest.__init__ contains the code:
self.path = '%s%s' % (script_name, path_info)
Both script_name and path_info are url-decoded. That is, if you request /Foo%20bar then PATH_INFO will be '/Foo bar' -- to get the accurate path you have to re-encode both values.
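As a rough illustration of the re-encoding described above, here is a minimal sketch using only the Python 3 standard library. The helper name `quoted_path` and the sample `environ` dictionary are hypothetical and are not part of Django or of the attached patch.

```python
from urllib.parse import quote


def quoted_path(environ):
    """Rebuild a percent-encoded path from the decoded WSGI values.

    Hypothetical helper for illustration only; not Django code.
    """
    script_name = environ.get("SCRIPT_NAME", "")
    path_info = environ.get("PATH_INFO", "")
    # quote() leaves "/" unescaped by default (safe="/"), so the
    # path separators survive while spaces become %20.
    return quote(script_name) + quote(path_info)


# Sample environ, assumed for the example.
environ = {"SCRIPT_NAME": "/app", "PATH_INFO": "/Foo bar"}
print(quoted_path(environ))
```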
Attachments (2)
Change History (16)
comment:1 by , 16 years ago
| milestone: | → 1.2 |
|---|---|
| Triage Stage: | Unreviewed → Accepted |
comment:2 by , 16 years ago
| Owner: | changed from to |
|---|---|
| Status: | new → assigned |
comment:3 by , 16 years ago
| Owner: | changed from to |
|---|---|
| Status: | assigned → new |
by , 16 years ago
| Attachment: | 11903.diff added |
|---|
comment:4 by , 16 years ago
| Has patch: | set |
|---|---|
| Needs tests: | set |
comment:5 by , 16 years ago
| Component: | Uncategorized → HTTP handling |
|---|
comment:7 by , 16 years ago
| Owner: | changed from to |
|---|
comment:8 by , 16 years ago
| Patch needs improvement: | unset |
|---|
The proposed approach was not correct: urlencode works with a sequence of two-element tuples or a dictionary. urlquote should be used instead, since script_name and path_info are strings.
The attached patch contains the correction for it and a test.
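The distinction can be sketched with the Python 3 standard library functions `urllib.parse.urlencode` and `urllib.parse.quote`, used here as stand-ins on the assumption that they behave analogously to the urlencode and urlquote helpers mentioned in the comment. The variable names are illustrative only.

```python
from urllib.parse import quote, urlencode

# urlencode() builds a query string from key/value pairs.
query = urlencode({"q": "Foo bar"})
pairs = urlencode([("a", "1"), ("b", "two words")])

# quote() percent-encodes a single string, which is what a path needs.
path = quote("/Foo bar")

# Passing a plain string to urlencode() is rejected with a TypeError.
try:
    urlencode("/Foo bar")
    rejected = False
except TypeError:
    rejected = True

print(query, pairs, path, rejected)
```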
comment:9 by , 16 years ago
| milestone: | 1.2 |
|---|---|
| Triage Stage: | Accepted → Design decision needed |
It seems like this could introduce backwards-compatibility issues (even though, from a quick look at the docs, there's no specific mention of quoted/unquoted when referring to request.path).
Is there some standard which the proposal to quote request.path would follow? I couldn't find any reference in PEP 333.
This also creates a disparity between path and path_info. Applications may be using both, and to have one quoted and the other not seems odd. And since path_info is used by Django's URL resolution, quoting it may cause problems.
In any case, this isn't a regression and probably needs some discussion, so I'm bumping out of the already-late 1.2 phase.
comment:10 by , 16 years ago
The quoting of PATH_INFO is specified in the CGI specification, which PEP 333 refers to. This is also true for mod_python (and Apache generally).
comment:11 by , 15 years ago
| Severity: | → Normal |
|---|---|
| Type: | → Bug |
comment:12 by , 15 years ago
| Needs tests: | unset |
|---|
comment:13 by , 14 years ago
| Easy pickings: | unset |
|---|---|
| Resolution: | → invalid |
| Status: | new → closed |
| UI/UX: | unset |
I believe the current behavior is correct. Django handles the encoding / decoding wherever necessary and provides unicode objects to the programmer.
request.path is unicode and has no reason to be url-encoded. (In the code quoted in the original report, path_info is unicode, which guarantees that self.path is unicode.)
This is a custom API of Django, which means we aren't bound by the WSGI or CGI spec there (while we are for request.META['PATH_INFO']).
To sum up, if I'm typing "www.mysite.com/foo bar/" in my browser, the browser will issue a request for "/foo%20bar/", but Django will convert that back to u"/foo bar/".
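The round trip in that summary can be sketched with the Python 3 standard library. This only mirrors the decoding step in isolation and is not Django's actual request handling; the variable names are assumptions for the example.

```python
from urllib.parse import quote, unquote

# What the browser sends on the wire for "www.mysite.com/foo bar/".
wire_path = "/foo%20bar/"

# The decoded form, comparable to what request.path exposes.
decoded = unquote(wire_path)

# Re-quoting is only needed when building a URL to send back out.
requoted = quote(decoded)

print(repr(decoded), repr(requoted))
```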
Patch for #11903