Opened 17 years ago

Closed 14 years ago

Last modified 11 years ago

#5738 closed Uncategorized (fixed)

django fails on defective unicode strings appearing in the url

Reported by: Soeren Sonnenburg <bugreports@…>
Owned by: nobody
Component: HTTP handling
Version: dev
Severity: Normal
Keywords:
Cc:
Triage Stage: Accepted
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: yes
Easy pickings: no
UI/UX: no

Description

The problem happens with any Django site (the version does not matter).

The best backtrace one can get is here :-)

http://www.djangoproject.com/~%A9

Attachments (1)

unicode_url_bug.patch (1.1 KB) - added by Armin Ronacher 17 years ago.
fix


Change History (15)

comment:1 by Armin Ronacher, 17 years ago

That's quite an annoying thing, especially because it happens outside the debugging system, so it could expose internal information in the mod_python / flup traceback. The fix would be to use an 'ignore' or 'replace' fallback in the unicode conversion.
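To illustrate the fallback options mentioned in this comment, here is a minimal sketch in modern Python 3 (the ticket predates it, so this is not the code from the attached patch). It assumes the URL path is decoded as UTF-8; the helper name `decode_path` is hypothetical.

```python
from urllib.parse import unquote_to_bytes

# Percent-decode the path from the ticket description into raw bytes.
raw_path = unquote_to_bytes("/~%A9")  # b'/~\xa9'


def decode_path(raw: bytes, errors: str = "strict") -> str:
    """Hypothetical helper: decode a raw URL path as UTF-8."""
    return raw.decode("utf-8", errors=errors)


# Strict decoding fails because 0xA9 on its own is not valid UTF-8.
try:
    decode_path(raw_path)
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# The two fallbacks named in the comment.
ignored = decode_path(raw_path, "ignore")    # drops the invalid byte
replaced = decode_path(raw_path, "replace")  # substitutes U+FFFD
```

With 'ignore' the invalid byte silently disappears, while 'replace' leaves a visible U+FFFD marker in the decoded path.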

by Armin Ronacher, 17 years ago

Attachment: unicode_url_bug.patch added

fix

comment:2 by anonymous, 17 years ago

Has patch: set

comment:3 by James Bennett, 17 years ago

Component: Core framework → HTTP handling
Triage Stage: Unreviewed → Accepted

comment:4 by Adrian Holovaty, 17 years ago

Resolution: fixed
Status: new → closed

(In [6475]) Fixed #5738 -- Fixed bug with defective Unicode strings in a URL

comment:5 by Soeren Sonnenburg <bugreports@…>, 17 years ago

Patch needs improvement: set
Resolution: fixed → (none)
Status: closed → reopened

I am not sure whether that was the correct fix, because now things like

http://www.djangoproject.com/d%aao%aaw%aan%aal%aao%aaa%aad%aa/

work too...
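A short sketch of why such a URL can resolve, assuming the committed fix dropped undecodable bytes (the 'ignore' handler); the ticket does not state which handler [6475] actually used, so treat that as an assumption.

```python
from urllib.parse import unquote_to_bytes

# The path from the comment above, percent-decoded to raw bytes.
raw = unquote_to_bytes("/d%aao%aaw%aan%aal%aao%aaa%aad%aa/")

# Assumption: invalid bytes are silently dropped during decoding.
# Every 0xAA byte is invalid UTF-8 here, so only the ASCII letters survive.
lenient = raw.decode("utf-8", errors="ignore")
```

Under that assumption the mangled path collapses to `/download/`, which is why it would match an existing URL pattern.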

comment:6 by Malcolm Tredinnick, 17 years ago

Resolution: fixed
Status: reopened → closed

It isn't actually a bug that that example works. It's harmless. We can live with this fix.

comment:7 by Malcolm Tredinnick, 17 years ago

As an addendum to the previous comment (I hit "post" too fast), the alternative is to automatically return an HTTP 400 status code in this case. But I think what we're doing is a reasonable approach to the problem.

comment:8 by Soeren Sonnenburg <bugreports@…>, 17 years ago

Resolution: fixed → (none)
Status: closed → reopened

that is exactly what I would have expected - a 404 page...

comment:9 by James Bennett, 17 years ago

You absolutely should not get a 404 from that. If you think that the request is bad, the correct status is "HTTP 400 Bad Request".

comment:10 by Malcolm Tredinnick, 17 years ago

Thinking about this a lot more, I'm not totally happy with the fix in [6475], but it's a bit of a line-ball. The problem is that although UTF-8 is strongly recommended as the encoding for non-ASCII data, it wasn't actually codified in any spec until quite recently (RFC 2396 leaves things wide open, for example). Only in RFC 3986 were things made clear for IRI to URI encoding.

In the interim, systems were deployed that spit out non-UTF-8 encoded URIs.

So I'm going to commit a change that passes back a 400 response for malformed input (non-UTF-8) but also makes it easier to override the request class, so if somebody is dealing with a legacy system, they can subclass WSGIRequest or ModPythonRequest to handle decoding the URI however they need to.
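The approach described in this comment could be sketched roughly as follows. This is not the code from the commit: the classes `StrictPathRequest` and `Latin1PathRequest` and the function `handle` are hypothetical stand-ins, not Django's real request classes or handler API.

```python
from urllib.parse import unquote_to_bytes


class StrictPathRequest:
    """Hypothetical request class: decodes the path strictly as UTF-8."""

    encoding = "utf-8"

    def __init__(self, raw_path: bytes):
        self.raw_path = raw_path

    def decode_path(self) -> str:
        # Strict decoding: raises UnicodeDecodeError on malformed input.
        return self.raw_path.decode(self.encoding)


class Latin1PathRequest(StrictPathRequest):
    """Hypothetical override for a legacy system that emits Latin-1 URIs."""

    encoding = "latin-1"


def handle(url_path: str, request_class=StrictPathRequest) -> int:
    """Return an HTTP status: 400 if the path cannot be decoded, else 200."""
    request = request_class(unquote_to_bytes(url_path))
    try:
        request.decode_path()
    except UnicodeDecodeError:
        return 400  # malformed (non-UTF-8) input is a bad request
    return 200


status_bad = handle("/~%A9")                        # invalid UTF-8
status_ok = handle("/%E4%BD%A0%E5%A5%BD")           # valid UTF-8
status_legacy = handle("/~%A9", Latin1PathRequest)  # legacy decoding
```

Keeping the decoding step in an overridable method means the default stays strict, while a legacy deployment can swap in its own encoding without touching the handler.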

comment:11 by Malcolm Tredinnick, 17 years ago

Resolution: fixed
Status: reopened → closed

For some reason, the auto-closer didn't trigger. [6550] has the latest commit for this ticket.

comment:12 by jeremb, 14 years ago

Resolution: fixed → (none)
Status: closed → reopened

It seems this bug is still present.

See:

http://www.djangoproject.com/你好 -> Error 500

http://www.djangoproject.com/~%A9 -> Empty page

In my opinion, both of the URLs should return a 404.

comment:13 by Karen Tracey, 14 years ago

Resolution: fixed
Status: reopened → closed

This ticket was fixed; if the original problem had really not been addressed, it would have been reopened quickly, not 2+ years later.

The fact that djangoproject.com generates a 500 server error on some page doesn't necessarily imply a bug in base Django. The application code for the website could equally well be at fault. I tried that url on one of my own projects and it works properly, unless it causes a redirect to the login page, in which case the attempt to send a redirect then fails. I believe this problem is already covered by another open ticket (search the tracker on redirect and non-ASCII or unicode or something).

For the 2nd one I get a 400 status returned, which seems correct to me.

comment:14 by Aymeric Augustin, 11 years ago

Easy pickings: unset
Severity: Normal
Type: Uncategorized
UI/UX: unset

#16541 was a duplicate.
