#5738 closed Uncategorized (fixed)
django fails on defective unicode strings appearing in the url
Reported by: | Owned by: | nobody | |
---|---|---|---|
Component: | HTTP handling | Version: | dev |
Severity: | Normal | Keywords: | |
Cc: | Triage Stage: | Accepted | |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | yes |
Easy pickings: | no | UI/UX: | no |
Description
problem happens with any django site (version does not matter),
the best backtrace one can get here :-)
Attachments (1)
Change History (15)
comment:1 by , 17 years ago
comment:2 by , 17 years ago
Has patch: | set |
---|
comment:3 by , 17 years ago
Component: | Core framework → HTTP handling |
---|---|
Triage Stage: | Unreviewed → Accepted |
comment:4 by , 17 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
comment:5 by , 17 years ago
Patch needs improvement: | set |
---|---|
Resolution: | fixed |
Status: | closed → reopened |
I am not sure whether that was the correct fix, because now things like
http://www.djangoproject.com/d%aao%aaw%aan%aal%aao%aaa%aad%aa/
work too...
comment:6 by , 17 years ago
Resolution: | → fixed |
---|---|
Status: | reopened → closed |
It isn't actually a bug that that example works. It's harmless. We can live with this fix.
comment:7 by , 17 years ago
As an addendum to the previous comment (I hit "post" too fast), the alternative is to automatically return an HTTP 400 status code in this case. But I think what we're doing is a reasonable approach to the problem.
comment:8 by , 17 years ago
Resolution: | fixed |
---|---|
Status: | closed → reopened |
that is exactly what I would have expected - a 404 page...
comment:9 by , 17 years ago
You absolutely should not get a 404 from that. If you think that the request is bad, the correct status is "HTTP 400 Bad Request".
comment:10 by , 17 years ago
Thinking about this a lot more, I'm not totally happy with the fix in [6475], but it's a line-ball a bit. The problem is that although UTF-8 is strongly recommended as the encoding for non-ASCII data, it wasn't actually codified in any spec until quite recently (RFC 2396 leaves things wide open, for example). Only in RFC 3986 were things made clear for IRI to URI encoding.
In the interim, systems were deployed that spit out non-UTF-8 encoded URIs.
So I'm going to commit a change that passes back a 400 response for malformed input (non-UTF-8) but also makes it easier to override the request class, so if somebody is dealing with a legacy system, they can subclass WSGIRequest or ModPythonRequest to handle decoding the URI however they need to.
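A minimal sketch of the behaviour described above, written for modern Python rather than taken from [6550]. The name `decode_path` and its `encoding` parameter are hypothetical, and the function returns a bare status code instead of a real response object:

```python
from urllib.parse import unquote_to_bytes


def decode_path(raw_path, encoding="utf-8"):
    """Hypothetical helper: return (status, decoded_path).

    The percent-escapes are undone first, then the resulting bytes are
    decoded strictly. Bytes that are not valid in `encoding` yield a
    400 status instead of an unhandled exception.
    """
    raw = unquote_to_bytes(raw_path)  # "%aa" becomes the single byte 0xAA
    try:
        return 200, raw.decode(encoding)
    except UnicodeDecodeError:
        # Malformed input for this encoding: Bad Request, not Not Found.
        return 400, None
```

With this sketch, `decode_path("/~%A9")` reports a 400 because 0xA9 is not valid UTF-8 on its own, while a legacy deployment that knows its URIs are Latin-1 could call `decode_path("/~%A9", encoding="latin-1")` and get a decoded path back. That mirrors the idea of overriding the request class to pick a different decoding.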
comment:11 by , 17 years ago
Resolution: | → fixed |
---|---|
Status: | reopened → closed |
For some reason, the auto-closer didn't trigger. [6550] has the latest commit for this ticket.
comment:12 by , 15 years ago
Resolution: | fixed |
---|---|
Status: | closed → reopened |
It seems this bug is still present.
See:
http://www.djangoproject.com/你好 -> Error 500
http://www.djangoproject.com/~%A9 -> Empty page
In my opinion, both of the URLs should return a 404.
comment:13 by , 15 years ago
Resolution: | → fixed |
---|---|
Status: | reopened → closed |
This ticket was fixed. If the original problem had really not been addressed, it would have been reopened quickly, not 2+ years later.
The fact that djangoproject.com generates a 500 server error on some page doesn't necessarily imply a bug in base Django. The application code for the website could equally well be at fault. I tried that URL on one of my own projects and it works properly, unless it causes a redirect to the login page, in which case the attempt to send a redirect then fails. I believe this problem is already covered by another open ticket (search the tracker on redirect and non-ASCII or unicode or something).
For the 2nd one I get a 400 status returned, which seems correct to me.
comment:14 by , 12 years ago
Easy pickings: | unset |
---|---|
Severity: | → Normal |
Type: | → Uncategorized |
UI/UX: | unset |
#16541 was a duplicate.
This is quite annoying, especially because it happens outside the debugging system, so it could expose internal information in the mod_python / flup traceback. The fix would be to use an 'ignore' or 'replace' fallback in the unicode conversion.
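To illustrate what the two fallbacks do, here is a small sketch in modern Python. It is not a patch against Django; `lenient_decode` is a hypothetical name, and the sample bytes are the unescaped form of the `d%aao%aaw%aan` style of path quoted earlier in this ticket:

```python
def lenient_decode(raw, errors="replace"):
    """Hypothetical helper: decode bytes as UTF-8 with a fallback.

    With errors="replace" each undecodable byte becomes U+FFFD; with
    errors="ignore" it is dropped. Neither mode raises on bad input.
    """
    return raw.decode("utf-8", errors=errors)


raw = b"/d\xaao\xaaw\xaan/"          # 0xAA is not valid UTF-8 on its own
replaced = lenient_decode(raw, "replace")
ignored = lenient_decode(raw, "ignore")
```

Here `ignored` comes out as `/down/`, which is the same effect comment:5 observed with the first fix: the stray bytes vanish and the URL resolves as if they were never there. `replaced` keeps a visible U+FFFD marker in each position instead, so the path no longer matches the clean one.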