Opened 15 years ago

Closed 12 years ago

#11903 closed Bug (invalid)

WSGIRequest.path not quoted properly

Reported by: ianb
Owned by: Fabián Ezequiel Gallina
Component: HTTP handling
Version: 1.1
Severity: Normal
Keywords:
Cc: ianb@…
Triage Stage: Design decision needed
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description

WSGIRequest.__init__ contains the code:

        self.path = '%s%s' % (script_name, path_info)

Both script_name and path_info are URL-decoded. That is, if you request /Foo%20bar then PATH_INFO will be '/Foo bar', so to get the accurate path you have to re-encode both values.
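
The re-encoding described above might look like the following minimal sketch. It uses Python 3's standard `urllib.parse` rather than any Django helper, and `rebuild_request_path` is a hypothetical name chosen for illustration, not code from the ticket or its patches.

```python
from urllib.parse import quote, unquote


def rebuild_request_path(script_name: str, path_info: str) -> str:
    """Hypothetical helper: percent-encode the decoded SCRIPT_NAME and PATH_INFO.

    quote() leaves '/' unescaped by default, so path separators survive.
    """
    return quote(script_name) + quote(path_info)


# The server hands the application a decoded PATH_INFO ...
requested = "/Foo%20bar"
path_info = unquote(requested)
assert path_info == "/Foo bar"

# ... so recovering the requested form means quoting it again.
assert rebuild_request_path("", path_info) == requested
```

Note that the round trip is lossy in general: a request for `/a%2Fb` decodes to `/a/b`, and quoting that again yields `/a/b`, not the original.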

Attachments (2)

11903.diff (906 bytes) - added by krisneuharth 14 years ago.
Patch for #11903
bug11903.patch (1.7 KB) - added by Fabián Ezequiel Gallina 14 years ago.
Patch with test for bug 11903

Change History (16)

comment:1 by Russell Keith-Magee, 14 years ago

milestone: 1.2
Triage Stage: Unreviewed → Accepted

comment:2 by anonymous, 14 years ago

Owner: changed from nobody to anonymous
Status: new → assigned

comment:3 by krisneuharth, 14 years ago

Owner: changed from anonymous to krisneuharth
Status: assigned → new

by krisneuharth, 14 years ago

Attachment: 11903.diff added

Patch for #11903

comment:4 by krisneuharth, 14 years ago

Has patch: set
Needs tests: set

comment:5 by Russell Keith-Magee, 14 years ago

Component: Uncategorized → HTTP handling

comment:6 by Jacob, 14 years ago

Patch needs improvement: set

Need a test for this.

comment:7 by anonymous, 14 years ago

Owner: changed from krisneuharth to Fabián Ezequiel Gallina

by Fabián Ezequiel Gallina, 14 years ago

Attachment: bug11903.patch added

patch with test for bug 11903

comment:8 by Fabián Ezequiel Gallina, 14 years ago

Patch needs improvement: unset

The proposed approach was not correct: urlencode works with a sequence of two-element tuples or a dictionary. Since script_name and path_info are strings, urlquote should be used instead.

The attached patch contains this correction and a test.
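
As an illustration of the distinction drawn in this comment (this is not the attached patch), here is a small sketch using Python 3's standard `urllib.parse`, where `quote` plays the role that `urlquote` plays in the comment. The sample values are made up.

```python
from urllib.parse import quote, urlencode

# urlencode builds a query string from a mapping or a sequence of pairs.
query_from_dict = urlencode({"q": "Foo bar"})
query_from_pairs = urlencode([("a", "1"), ("b", "x y")])
assert query_from_dict == "q=Foo+bar"
assert query_from_pairs == "a=1&b=x+y"

# quote percent-encodes a single string, which is what a path needs.
quoted_path = quote("/Foo bar")
assert quoted_path == "/Foo%20bar"

# Passing a plain string to urlencode is rejected outright.
try:
    urlencode("/Foo bar")
    urlencode_accepts_str = True
except TypeError:
    urlencode_accepts_str = False
assert urlencode_accepts_str is False
```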

comment:9 by Chris Beaven, 14 years ago

milestone: 1.2
Triage Stage: Accepted → Design decision needed

It seems like this could introduce backwards-compatibility issues (even though, from a quick look at the docs, there's no specific mention of quoted/unquoted when referring to request.path).

Is there some standard which the proposal to quote request.path would follow? I couldn't find any reference in PEP 333.

This also creates a disparity between path and path_info. Applications may be using both, and to have one quoted and the other not seems odd. And since path_info is used by Django's URL resolution, quoting it may cause problems.

In any case, this isn't a regression and probably needs some discussion, so I'm bumping out of the already-late 1.2 phase.

comment:10 by ianb, 14 years ago

The quoting of PATH_INFO is specified in the CGI specification, which PEP 333 refers to. This is also true for mod_python (and Apache generally).

comment:11 by Peter Baumgartner, 13 years ago

Severity: Normal
Type: Bug

comment:12 by anonymous, 13 years ago

Needs tests: unset

comment:13 by Aymeric Augustin, 12 years ago

Easy pickings: unset
Resolution: invalid
Status: newclosed
UI/UX: unset

I believe the current behavior is correct. Django handles the encoding / decoding wherever necessary and provides unicode objects to the programmer.

request.path is unicode and has no reason to be url-encoded. (In the code quoted in the original report, path_info is unicode, which guarantees that self.path is unicode.)

This is a custom API of Django, which means we aren't bound by the WSGI or CGI spec there (while we are for request.META['PATH_INFO']).

To sum up, if I'm typing "www.mysite.com/foo bar/" in my browser, the browser will issue a request for "/foo%20bar/", but Django will convert that back to u"/foo bar/".
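
The conversion summarized above can be sketched with Python 3's standard library. This is an assumption-laden illustration, not Django's actual request-handling code, and `displayed_path` is a hypothetical name.

```python
from urllib.parse import unquote


def displayed_path(raw_path: str) -> str:
    """Hypothetical helper: turn a percent-encoded request path into text.

    Percent-escapes are decoded as UTF-8; undecodable bytes are replaced.
    """
    return unquote(raw_path, encoding="utf-8", errors="replace")


# The browser sends the escaped form; the application sees readable text.
assert displayed_path("/foo%20bar/") == "/foo bar/"
assert displayed_path("/caf%C3%A9/") == "/café/"
```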
