Opened 15 years ago

Closed 13 years ago

Last modified 10 years ago

#5738 closed Uncategorized (fixed)

django fails on defective unicode strings appearing in the url

Reported by: Soeren Sonnenburg <bugreports@…>
Owned by: nobody
Component: HTTP handling
Version: dev
Severity: Normal
Keywords:
Cc:
Triage Stage: Accepted
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: yes
Easy pickings: no
UI/UX: no

Description

The problem happens with any Django site (the version does not matter).

The best backtrace one can get is here :-)

http://www.djangoproject.com/~%A9

Attachments (1)

unicode_url_bug.patch (1.1 KB) - added by Armin Ronacher 15 years ago.
fix


Change History (15)

comment:1 Changed 15 years ago by Armin Ronacher

That's quite an annoying thing, especially because it happens outside the debugging system, so it could expose internal information in the mod_python / flup traceback. The fix would be to use an 'ignore' or 'replace' fallback in the unicode conversion.
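To illustrate the fallback being proposed here, the following is a minimal sketch in modern Python 3 (the ticket predates it, so this is not the attached patch). The helper name `decode_path` is hypothetical; only the standard library's `unquote_to_bytes` and `bytes.decode` are real APIs.

```python
from urllib.parse import unquote_to_bytes

# The path from the ticket description, percent-decoded to raw bytes.
raw_path = unquote_to_bytes("/~%A9")  # b'/~\xa9'


def decode_path(raw, errors="strict"):
    """Hypothetical helper: decode a raw URL path as UTF-8."""
    return raw.decode("utf-8", errors=errors)


# Strict decoding raises, which is the unhandled failure the ticket reports.
try:
    decode_path(raw_path)
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# 'ignore' silently drops the invalid byte; 'replace' substitutes U+FFFD.
ignored = decode_path(raw_path, "ignore")
replaced = decode_path(raw_path, "replace")
```

Either fallback avoids the exception, but both change the path that URL resolution later sees, which is the trade-off discussed further down the ticket.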

Changed 15 years ago by Armin Ronacher

Attachment: unicode_url_bug.patch added

fix

comment:2 Changed 15 years ago by anonymous

Has patch: set

comment:3 Changed 15 years ago by James Bennett

Component: Core framework → HTTP handling
Triage Stage: Unreviewed → Accepted

comment:4 Changed 15 years ago by Adrian Holovaty

Resolution: fixed
Status: new → closed

(In [6475]) Fixed #5738 -- Fixed bug with defective Unicode strings in a URL

comment:5 Changed 15 years ago by Soeren Sonnenburg <bugreports@…>

Patch needs improvement: set
Resolution: fixed
Status: closed → reopened

I am not sure whether that was the correct fix, because now things like

http://www.djangoproject.com/d%aao%aaw%aan%aal%aao%aaa%aad%aa/

work too...
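A small sketch of why that URL resolves, assuming the committed fix discards undecodable bytes (the 'ignore' strategy from comment:1; the ticket does not quote the actual change in [6475]):

```python
from urllib.parse import unquote_to_bytes

# The path from the comment above, percent-decoded to raw bytes.
raw_path = unquote_to_bytes("/d%aao%aaw%aan%aal%aao%aaa%aad%aa/")

# Every 0xAA byte is invalid as a UTF-8 start byte, so 'ignore' drops
# all of them and only the ASCII letters and slashes remain.
decoded = raw_path.decode("utf-8", errors="ignore")
```

Under that assumption the mangled path collapses to an ordinary one, so it matches whatever URL pattern the clean path would match.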

comment:6 Changed 15 years ago by Malcolm Tredinnick

Resolution: fixed
Status: reopened → closed

It isn't actually a bug that that example works. It's harmless. We can live with this fix.

comment:7 Changed 15 years ago by Malcolm Tredinnick

As an addendum to the previous comment (I hit "post" too fast), the alternative is to automatically return an HTTP 400 status code in this case. But I think what we're doing is a reasonable approach to the problem.

comment:8 Changed 15 years ago by Soeren Sonnenburg <bugreports@…>

Resolution: fixed
Status: closed → reopened

that is exactly what I would have expected - a 404 page...

comment:9 Changed 15 years ago by James Bennett

You absolutely should not get a 404 from that. If you think that the request is bad, the correct status is "HTTP 400 Bad Request".

comment:10 Changed 15 years ago by Malcolm Tredinnick

Thinking about this a lot more, I'm not totally happy with the fix in [6475], but it's a bit of a line-ball call. The problem is that although UTF-8 is strongly recommended as the encoding for non-ASCII data, that wasn't actually codified in any spec until quite recently (RFC 2396 leaves things wide open, for example). Only in RFC 3986 were things made clear for IRI to URI encoding.

In the interim, systems were deployed that spit out non-UTF-8 encoded URIs.

So I'm going to commit a change that passes back a 400 response for malformed (non-UTF-8) input but also makes it easier to override the request class, so that somebody dealing with a legacy system can subclass WSGIRequest or ModPythonRequest to handle decoding the URI however they need to.
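The approach described above could look roughly like the following framework-free sketch. `SketchRequest`, `LegacyLatin1Request`, `decode_path` and `handle` are all hypothetical names made up for illustration; they are not Django's WSGIRequest API and not the code committed for this ticket.

```python
from urllib.parse import unquote_to_bytes


class SketchRequest:
    """Hypothetical stand-in for a request class (not Django's)."""

    def __init__(self, raw_path):
        self.path = self.decode_path(raw_path)

    def decode_path(self, raw_path):
        # Strict UTF-8: malformed input raises UnicodeDecodeError.
        return raw_path.decode("utf-8")


class LegacyLatin1Request(SketchRequest):
    """Hypothetical override for a legacy system emitting ISO-8859-1 URIs."""

    def decode_path(self, raw_path):
        return raw_path.decode("iso-8859-1")


def handle(raw_path, request_class=SketchRequest):
    """Hypothetical handler: 400 on undecodable paths, 200 otherwise."""
    try:
        request = request_class(raw_path)
    except UnicodeDecodeError:
        return 400, "Bad Request"
    return 200, request.path


raw = unquote_to_bytes("/~%A9")
default_result = handle(raw)
legacy_result = handle(raw, LegacyLatin1Request)
```

With the default class the malformed path yields a 400; swapping in the legacy subclass decodes the same bytes as ISO-8859-1 instead, which is the kind of escape hatch the comment describes.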

comment:11 Changed 15 years ago by Malcolm Tredinnick

Resolution: fixed
Status: reopened → closed

For some reason, the auto-closer didn't trigger. [6550] has the latest commit for this ticket.

comment:12 Changed 13 years ago by jeremb

Resolution: fixed
Status: closed → reopened

It seems this bug is still present.

See:

http://www.djangoproject.com/你好 -> Error 500

http://www.djangoproject.com/~%A9 -> Empty page

In my opinion, both of the urls should return a 404.

comment:13 Changed 13 years ago by Karen Tracey

Resolution: fixed
Status: reopened → closed

This ticket was fixed; if the original problem had really not been addressed, it would have been reopened quickly, not 2+ years later.

The fact that djangoproject.com generates a 500 server error on some page doesn't necessarily imply a bug in base Django. The application code for the website could equally well be at fault. I tried that url on one of my own projects and it works properly, unless it causes a redirect to the login page, in which case the attempt to send a redirect then fails. I believe this problem is already covered by another open ticket (search the tracker on redirect and non-ASCII or unicode or something).
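As a rough illustration of the redirect issue mentioned above (the ticket does not identify the exact failure, so this is an assumption about the general mechanism): a non-ASCII path cannot go into an ASCII-only header value as-is, and is typically percent-encoded first. The sketch uses only the standard library's `quote`.

```python
from urllib.parse import quote

path = "/你好"

# Check whether the path could be written into an ASCII-only header value.
try:
    path.encode("ascii")
    ascii_safe = True
except UnicodeEncodeError:
    ascii_safe = False

# Percent-encode the UTF-8 bytes so the value is plain ASCII.
location = quote(path)
```

Whether a given project hits this depends on how its redirect target is built, which is consistent with the comment's point that the failure may lie in application code.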

For the 2nd one I get a 400 status returned, which seems correct to me.

comment:14 Changed 10 years ago by Aymeric Augustin

Easy pickings: unset
Severity: Normal
Type: Uncategorized
UI/UX: unset

#16541 was a duplicate.
