Opened 10 years ago

Closed 9 years ago

Last modified 8 years ago

#3553 closed (wontfix)

urlpatterns uses urllib.quote/unquote should use urllib.quote/unquote_plus

Reported by: justin-django@… Owned by: Graham King
Component: Core (Other) Version: 0.95
Severity: Keywords: sprintsept14
Cc: Triage Stage: Design decision needed
Has patch: no Needs documentation: no
Needs tests: yes Patch needs improvement: no
Easy pickings: UI/UX:


When using the + character to represent spaces in urls it is impossible to differentiate between a literal + and an escaped %2B in the url.


urlpatterns = (r'/search/(?<query>.*)$,
def search(request, query):
  print "query: "+query


Visiting causes django to print foo++bar


Visitin should cause django to print foo+ bar


I assume that django is using urllib.quote/unqoute Changing it to use urllib.quote_plus and urllib.unqoute_plus would produce the expected behavior and not have any side affects

Attachments (1)

3553.diff (6.4 KB) - added by Graham King 9 years ago.
Uses urllib.unquote_plus for dev server, mod_python and fastcgi, and all the tests pass

Download all attachments as: .zip

Change History (13)

comment:1 Changed 10 years ago by Chris Beaven

Triage Stage: UnreviewedDesign decision needed

Well it would have a side-effect of breaking anyone's existing code if their code relied on the current behaviour...

comment:2 Changed 10 years ago by justin-django@…

probably the best method is to have a init param default to urllib.un/quote and be able to specify any func... (excuse the misterminology python newb)

comment:3 Changed 10 years ago by James Bennett

Component: UncategorizedCore framework
Keywords: sprintsept14 added

I believe all of this changed with the Unicode merge (since various helper functions had to be written to handle non-ASCII data); can someone confirm whether this is still an issue?

comment:4 Changed 10 years ago by Jacob

Triage Stage: Design decision neededAccepted

comment:5 Changed 9 years ago by loppear

Verified against trunk r7264, using:

def search(request, query):
    response = HttpResponse()
    response.write("query: %s" % query)
    return response

urlpatterns = patterns('', (r'q/(?P<query>.*)$', search),)

does print "query: foo++bar" for /q/foo%2B+bar

comment:6 Changed 9 years ago by Matt McClanahan

milestone: 1.0
Needs tests: set

Verified against [8776].

comment:7 Changed 9 years ago by Jacob

milestone: 1.0post-1.0

comment:8 Changed 9 years ago by anonymous

Owner: changed from nobody to anonymous
Status: newassigned

comment:9 Changed 9 years ago by Graham King

Owner: changed from anonymous to Graham King
Status: assignednew

comment:10 Changed 9 years ago by Graham King

Status: newassigned
Triage Stage: AcceptedDesign decision needed

I've uploaded a patch which solves this, but read on.
On the dev server it's a one liner.

On mod_python and fastcgi, for a path of '/q/foo%2B+bar' the environment hands you '/q/foo++bar' in PATH_INFO. I've had to use REQUEST_URI (req.unparsed_uri in mod_python) instead of PATH_INFO.
If those smart people at mod_python and fastcgi/flup are converting the %2B and + this way, I don't feel comfortable doing it differently. I'm setting this back to 'Design Decision Needed' for advice.

For what it's worth I've tested the mod_python version of this patch on a medium to large work project, and it doesn't seem to cause any problems.
I've just noticed this patch seems to break a whole bunch of unit tests. Oops. Investigating.

Changed 9 years ago by Graham King

Attachment: 3553.diff added

Uses urllib.unquote_plus for dev server, mod_python and fastcgi, and all the tests pass

comment:11 Changed 9 years ago by Graham King

Resolution: wontfix
Status: assignedclosed

Attached a patch that will allow you to do this, but I don't recommend you do. It will probably break mod_rewrite. The answer is don't put the query in the path. Change your URL structure to be /search?q=foo%2B+bar.

+ encodes a space in a _query_ (coming from a form), not in a _path_. In a path %20 should be used. Create a document called 'my test.html' on a local Apache install and try browsing to http://localhost/my+test.html -> 404, replace the + with %20 and it works.

When Apache hands Django the PATH_INFO, the %2B has been correctly converted to a +, and the other plus is left as it because, in the path, it doesn't mean anything special. So foo%2B+bar correctly becomes foo++bar

Look at, where there is both a band called '44' and one called '+44': --> 44 --> 44 --> 44 --> +44

They had to double encode it, replacing the % with %25. That's what you'll have to do to use + as a character and a space in the path. Otherwise shift the query to the query component or the URL.

comment:12 Changed 8 years ago by (none)

milestone: post-1.0

Milestone post-1.0 deleted

Note: See TracTickets for help on using tickets.
Back to Top