#3414 closed (fixed)
middleware/common.py and SCGI bug - string index out of range (caused by missing PATH_INFO)
Reported by: | | Owned by: | nobody |
---|---|---|---|
Component: | Core (Other) | Version: | dev |
Severity: | | Keywords: | |
Cc: | real.human@…, richard.davies@… | Triage Stage: | Accepted |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
I wanted to use Cherokee with SCGI to test my site but I get this exception when trying to view it in the browser (/ main page):
Traceback (most recent call last):
  File "/usr/lib64/python2.4/site-packages/flup-0.5-py2.4.egg/flup/server/scgi_base.py", line 185, in run
  File "/usr/lib64/python2.4/site-packages/flup-0.5-py2.4.egg/flup/server/scgi_base.py", line 456, in handler
  File "/usr/lib64/python2.4/site-packages/django/core/handlers/wsgi.py", line 148, in __call__
    response = self.get_response(request.path, request)
  File "/usr/lib64/python2.4/site-packages/django/core/handlers/base.py", line 59, in get_response
    response = middleware_method(request)
  File "/usr/lib64/python2.4/site-packages/django/middleware/common.py", line 40, in process_request
    if settings.APPEND_SLASH and (old_url[1][-1] != '/') and ('.' not in old_url[1].split('/')[-1]):
IndexError: string index out of range
Django + SCGI + Cherokee worked for me some time ago without any problems. Now on 0.9.5.1 it throws this exception.
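The failing expression from the traceback can be reproduced in isolation. The following is a minimal sketch, written for Python 3, in which `path` stands in for `old_url[1]`; both helper names are hypothetical and neither is Django's actual code.

```python
def needs_slash_unguarded(path):
    # Mirrors the check in the traceback: indexing [-1] fails on ''.
    return path[-1] != '/' and '.' not in path.split('/')[-1]


def needs_slash(path):
    # Hypothetical guarded variant: endswith() is safe on an empty string.
    return not path.endswith('/') and '.' not in path.split('/')[-1]


try:
    needs_slash_unguarded('')   # an empty path ends up here
except IndexError as exc:
    print('unguarded check failed:', exc)

print(needs_slash(''))          # no exception for an empty path
```

The guarded variant only shows that the check itself can tolerate an empty string; it says nothing about why the path is empty in the first place.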
Attachments (4)
Change History (28)
comment:1 by , 18 years ago
comment:2 by , 18 years ago
Triage Stage: | Unreviewed → Accepted |
---|
comment:4 by , 18 years ago
Happens on Litespeed 3.0 too. For me the APPEND_SLASH has no effect either way, True or False.
comment:5 by , 18 years ago
I think the issue here is that Django is not allowing an empty value for PATH_INFO. I have run into this problem and found out that Cherokee is passing in an empty value for PATH_INFO. According to the CGI/1.1 specification (RFC 3875, section 4.1.5, http://www.ietf.org/rfc/rfc3875), PATH_INFO can have an empty value. I am not sure how this will affect the usage of APPEND_SLASH.
comment:7 by , 17 years ago
Version: | 0.95 → SVN |
---|
I can confirm that this happens on the latest build (r1857) of lighttpd. Luckily, however, setting APPEND_SLASH=False in settings.py took care of the error.
Flup: SVN (May 28, 2007)
Django: SVN (May 5th-ish, 2007)
comment:8 by , 17 years ago
About the bug:
- Doesn't occur on my development environment:
  - Ubuntu Feisty 7.04 (Linux 2.6.20-15)
  - Lighttpd-1.4.13 + Fastcgi
  - Flup revision 2126
  - Django revision 4463
- Occurs on my production :-( environment (Textdrive hosting):
  - FreeBSD 6.2-STABLE
  - Lighttpd-1.4.13 + Fastcgi
  - Flup revision 2126
  - Django revision 4463
- Value of APPEND_SLASH doesn't change anything.
- The patch for wsgi.py shown on ticket:2407 doesn't solve this.
About the patch: it fixes the issue for me, but I don't have experience with the CGI/1.1 specification, so it isn't intended to be authoritative.
Not tested with django-trunk.
by , 17 years ago
Attachment: | wsgi.patch added |
---|
comment:9 by , 17 years ago
Has patch: | set |
---|
comment:10 by , 17 years ago
Cc: | added |
---|
comment:11 by , 17 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
follow-up: 14 comment:12 by , 17 years ago
Resolution: | fixed |
---|---|
Status: | closed → reopened |
Still reproducible. The quick-and-dirty path attachment:wsgi_path_from_many_params.diff works for me.
by , 17 years ago
Attachment: | path_info_wsgi.diff added |
---|
comment:14 by , 17 years ago
Replying to Jordi Funollet <jordi.f@ati.es>:
Still reproducible. The quick-and-dirty path attachment:wsgi_path_from_many_params.diff works for me.
That patch removes the path section of the base url from REDIRECT_URL and REQUEST_URI. So, for example, say my project is at http://example.com/my_project/. If I go to http://example.com/my_project/admin/, I get redirected to http://example.com/admin/. If I go to a page in my project that doesn't exist, let's say http://example.com/my_project/bob/, the "Request URL" on the "Page not found" screen is http://example.com/bob/.
As noted at #3762 by michael@…, the hacks to the WSGI script from http://code.google.com/p/modwsgi/wiki/IntegrationWithDjango make it all work as it should. Seeing that, I made an even simpler patch: attachment:path_info_wsgi.diff.
comment:15 by , 16 years ago
I believe this is related to #285, but all of the hubbub there suggests to me that my one-line patch probably isn't the solution to everyone's problem. Still working for me, though!
by , 16 years ago
Attachment: | wsgi_path_from_many_params_2.diff.txt added |
---|
Update to wsgi_path_from_many_params.diff which better handles QUERY_STRING
comment:16 by , 16 years ago
Cc: | added |
---|
I run Django with Lighttpd, using the error-handler-404 mechanism borrowed from the standard Rails config for this web server (http://github.com/rails/rails/tree/master/railties/configs/lighttpd.conf)
When run in this manner, Lighttpd does not set PATH_INFO (http://trac.lighttpd.net/trac/wiki/FrequentlyAskedQuestions#Whatkindofenvironmentdoesserver.error-handler-404setup), so I have been using Jordi's attachment:wsgi_path_from_many_params.diff, which worked well at first for me, to take self.path from REQUEST_URI in the absence of PATH_INFO.
More recently, when using query strings, I note that REQUEST_URI includes the query string (e.g. "/script/?foo=bar"), whereas self.path should not. I am therefore posting an update to Jordi's patch which correctly strips out the query string from REQUEST_URI before setting self.path. When used in this mode, Lighttpd also does not set QUERY_STRING itself, so I also take the opportunity to set QUERY_STRING based on REQUEST_URI if it is not already present.
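The fallback described above can be sketched roughly as follows. This is an illustration only, not the contents of the attached patch: `path_and_query` is a hypothetical helper, it is written for Python 3, and it assumes "absent" means the key is missing from the environment rather than present but empty.

```python
def path_and_query(environ):
    """Return (path, query_string), falling back to REQUEST_URI.

    Assumption: a variable counts as absent only when its key is missing.
    """
    path = environ.get('PATH_INFO')
    query = environ.get('QUERY_STRING')
    if path is None or query is None:
        # REQUEST_URI looks like "/script/?foo=bar"; split at the first '?'.
        uri_path, _, uri_query = environ.get('REQUEST_URI', '').partition('?')
        if path is None:
            path = uri_path
        if query is None:
            query = uri_query
    return path, query


print(path_and_query({'REQUEST_URI': '/script/?foo=bar'}))
```

Values the server did supply are left untouched, so the fallback only takes effect in the error-handler-404 setup.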
comment:17 by , 16 years ago
The setup being espoused in comment 16 looks like a terrible way to set up a webserver. I'm not sure we really want to pollute the main code with anything extra just to handle that case. It's using a 404 error path to try and do normal (non-error) handling. The Django docs already explain how to use lighttpd with fastcgi without needing to corrupt an error handler that is intended for an entirely different purpose.
Fortunately, it won't be impossible to work this way, since you can always subclass the WSGI handler and write your own handler for this situation which is even further from a proper WSGI environment than Django normally expects. But I doubt that I'm going to include this in core right at the moment, since it's not an approach we should be encouraging and it's a lot of extra poking into environment variables to work around something (and we would have to maintain it forever).
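A related way to keep such a workaround outside of core is a plain WSGI wrapper around the application. Note that this is a different technique from subclassing the handler, shown here only because it needs nothing beyond the WSGI calling convention. A minimal sketch for Python 3, assuming the server sets REQUEST_URI; `FillPathInfo` and `demo_app` are hypothetical names:

```python
class FillPathInfo:
    """Hypothetical WSGI wrapper that supplies a missing PATH_INFO."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        if 'PATH_INFO' not in environ:
            # Work on a copy so the server's own dict is left alone.
            environ = dict(environ)
            path, _, query = environ.get('REQUEST_URI', '').partition('?')
            environ['PATH_INFO'] = path
            environ.setdefault('QUERY_STRING', query)
        return self.app(environ, start_response)


def demo_app(environ, start_response):
    # Stand-in for the real application: echoes the path it was given.
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [environ['PATH_INFO'].encode('utf-8')]


application = FillPathInfo(demo_app)
```

Whoever chooses the unusual server setup then owns the extra environment handling, and it can be dropped without touching the framework.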
comment:18 by , 16 years ago
Resolution: | → fixed |
---|---|
Status: | reopened → closed |
(In [8015]) Changed/fixed the way Django handles SCRIPT_NAME and PATH_INFO (or
equivalents). Basically, URL resolving will only use the PATH_INFO and the
SCRIPT_NAME will be prepended by reverse() automatically. Allows for more
portable development and installation. Also exposes SCRIPT_NAME in the
HttpRequest instance.
There are a number of cases where things don't work completely transparently,
so mod_python and fastcgi users should read the relevant docs.
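The division of labour between the two variables can be illustrated with a toy example. This is a sketch for Python 3 under stated assumptions, not Django's implementation: `ROUTES`, `toy_resolve` and `toy_reverse` are hypothetical names, and Django's real `reverse()` has a different signature.

```python
ROUTES = {'home': '/', 'admin': '/admin/'}   # hypothetical URL table


def toy_resolve(environ):
    """Match on PATH_INFO only, ignoring where the app is mounted."""
    path_info = environ.get('PATH_INFO', '')
    for name, pattern in ROUTES.items():
        if pattern == path_info:
            return name
    return None


def toy_reverse(name, environ):
    """Build a URL path by prepending SCRIPT_NAME to the route."""
    return environ.get('SCRIPT_NAME', '') + ROUTES[name]


environ = {'SCRIPT_NAME': '/my_project', 'PATH_INFO': '/admin/'}
print(toy_resolve(environ))            # admin
print(toy_reverse('admin', environ))   # /my_project/admin/
```

Because matching never looks at SCRIPT_NAME, the same URL table works whether the project is mounted at the site root or under a prefix.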
comment:19 by , 16 years ago
Let me quickly explain the logic of the setup that I mentioned in comment 16 (note that this copies the _standard_ way to set up Rails with Lighttpd, it isn't something that I invented myself!).
Compared to the Django lighttpd config at http://www.djangoproject.com/documentation/fastcgi/#lighttpd-setup , the directories are "inside out".
That config has MEDIA_ROOT/MEDIA_URL as a subdirectory of the site on the webserver, and uses url rewriting to handle media such as favicon.ico, robots.txt, etc. which are expected in the top level or otherwise outside the media subdirectory.
The Rails-style config has MEDIA_ROOT/MEDIA_URL as the root-level directory of the site on the webserver, and then connects the error-handler-404 to Django. This means that if a file is there, it gets served, whilst if it is not, the URL falls through for Django to handle. So favicon.ico, etc. can just be put directly in their right place.
comment:20 by , 16 years ago
Resolution: | fixed |
---|---|
Status: | closed → reopened |
Summary: | middleware/common.py and SCGI bug - string index out of range → middleware/common.py and SCGI bug - string index out of range (caused by missing PATH_INFO) |
Regardless of the rights or wrongs of the Lighttpd approach in my comment 16, the target of this ticket is to find a solution for cases where PATH_INFO is not set in the incoming environment (first identified in comment 5 in the case of Cherokee). [8015] fixes #285, but still assumes that PATH_INFO is correctly set in the incoming environment, so cannot close this ticket.
comment:21 by , 16 years ago
Resolution: | → fixed |
---|---|
Status: | reopened → closed |
Having looked a bit at the fix for #285 and done a bit of research, I believe the correct conclusion here is:
"Doctor, it hurts when I configure my server this way!"
"Well, then, don't configure your server that way."
If you set things up in such a way that PATH_INFO is not available to Django or has been mangled prior to handing off to Django, I don't think Django can help you much, really; we can't magically reconstruct information that wasn't given to us in the first place.
comment:22 by , 16 years ago
A note for anyone following this thread, or experiencing this problem - Flup 1.0.1 has been released, and now attempts to generate PATH_INFO if it is missing. This means that these problems are now handled before reaching Django.
comment:23 by , 16 years ago
Note: the WSGI spec allows PATH_INFO to be empty or missing; specifically:
"This may be an empty string, if the request URL targets the application root and does NOT have a trailing slash." (emph. added)
And WSGI servers are allowed to omit PATH_INFO (and various other variables) if they are an empty string.
IIUC, this means that [8015] doesn't correctly handle the case where someone goes to "foo.com/django" (no trailing '/'), because it wrongly assumes that a missing PATH_INFO is a '/'. Per the WSGI spec, a missing PATH_INFO is in fact an empty string. That means that relative URLs at the root of a Django site would not work correctly under servers that omit an empty PATH_INFO.
Whether the OP issue here is a configuration problem is irrelevant to this piece: it is perfectly legal for a WSGI server to omit PATH_INFO if it's an empty string, and its omission means that it's an EMPTY string, not a '/'.
Conversely, if a WSGI server is omitting PATH_INFO when PATH_INFO should be a "/" (i.e. the URL was "foo.com/django/" with a trailing "/"), then that server is seriously broken and should be fixed. (But I'm not seeing anything here that suggests this is actually the case.)
Either way, however, the code that's defaulting a missing PATH_INFO to "/" appears to be quite wrong: either creating a bug or masking one somewhere else.
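The practical difference between the two defaults shows up when a relative link is resolved against the request URL. A minimal sketch for Python 3, assuming the application is mounted at "/django" on the host from the example above; `request_url` is a hypothetical helper:

```python
from urllib.parse import urljoin


def request_url(environ, default_path_info):
    # Rebuild the URL the framework believes was requested.
    path_info = environ.get('PATH_INFO', default_path_info)
    return 'http://foo.com' + environ.get('SCRIPT_NAME', '') + path_info


environ = {'SCRIPT_NAME': '/django'}     # server omitted an empty PATH_INFO

as_empty = request_url(environ, '')      # what the WSGI spec implies
as_slash = request_url(environ, '/')     # what a '/' default assumes

print(urljoin(as_empty, 'about/'))       # http://foo.com/about/
print(urljoin(as_slash, 'about/'))       # http://foo.com/django/about/
```

A browser resolves the link against the URL it actually requested, so the two defaults lead to different targets for the same relative link.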
comment:24 by , 16 years ago
Posting to a closed ticket is a good way to make sure a comment gets overlooked. Fortunately, in this case I saw it go by, so I've opened #9435 to make sure any inconsistencies are tidied up.
I've seen something similar under FastCGI; it only happens when APPEND_SLASH is true, and it seems to be something to do with PATH_INFO not being passed in properly. I've had this come up and so have some other folks I've talked to, so there's definitely an issue, I'm just not sure where it is.