Opened 9 years ago

Closed 7 years ago

Last modified 4 years ago

#5738 closed Uncategorized (fixed)

django fails on defective unicode strings appearing in the url

Reported by: Soeren Sonnenburg <bugreports@…>
Owned by: nobody
Component: HTTP handling
Version: master
Severity: Normal
Keywords:
Cc:
Triage Stage: Accepted
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: yes
Easy pickings: no
UI/UX: no

Description

The problem happens with any Django site (the version does not matter).

The best backtrace one can get is here :-)

http://www.djangoproject.com/~%A9

Attachments (1)

unicode_url_bug.patch (1.1 KB) - added by Armin Ronacher 9 years ago.
fix

Change History (15)

comment:1 Changed 9 years ago by Armin Ronacher

Needs documentation: unset
Needs tests: unset
Patch needs improvement: unset

That's quite an annoying thing, especially because it happens outside the debugging system, so it could expose internal information in the mod_python / flup traceback. The fix would be to use an 'ignore' or 'replace' fallback in the unicode conversion.
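A minimal sketch of the fallback described in this comment, written for modern Python 3 rather than the Python 2 code of the time. It is not the contents of unicode_url_bug.patch; the helper name `decode_path` is hypothetical, and only standard-library calls are used.

```python
from urllib.parse import unquote_to_bytes


def decode_path(raw_path, errors="strict"):
    """Percent-decode a URL path, then decode the bytes as UTF-8.

    Hypothetical helper for illustration; `errors` is passed straight
    through to bytes.decode().
    """
    return unquote_to_bytes(raw_path).decode("utf-8", errors)


# %A9 is a lone continuation byte, so it is not valid UTF-8.
for mode in ("replace", "ignore", "strict"):
    try:
        print(mode, repr(decode_path("/~%A9", mode)))
    except UnicodeDecodeError as exc:
        print(mode, "raised", type(exc).__name__)
```

'replace' substitutes U+FFFD for the bad byte and 'ignore' drops it, whereas the strict default raises UnicodeDecodeError, which is the kind of failure this ticket reports when nothing catches it.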

Changed 9 years ago by Armin Ronacher

Attachment: unicode_url_bug.patch added

fix

comment:2 Changed 9 years ago by anonymous

Has patch: set

comment:3 Changed 9 years ago by James Bennett

Component: Core framework → HTTP handling
Triage Stage: Unreviewed → Accepted

comment:4 Changed 9 years ago by Adrian Holovaty

Resolution: fixed
Status: new → closed

(In [6475]) Fixed #5738 -- Fixed bug with defective Unicode strings in a URL

comment:5 Changed 9 years ago by Soeren Sonnenburg <bugreports@…>

Patch needs improvement: set
Resolution: fixed
Status: closed → reopened

I am not sure whether that was the correct fix, because now things like

http://www.djangoproject.com/d%aao%aaw%aan%aal%aao%aaa%aad%aa/

work too...
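The behaviour reported here is what an 'ignore'-style fallback would produce; whether [6475] actually used 'ignore' is an assumption, not something this ticket states. A sketch in Python 3 using only the standard library:

```python
from urllib.parse import unquote_to_bytes

raw = unquote_to_bytes("/d%aao%aaw%aan%aal%aao%aaa%aad%aa/")

# Each 0xAA byte is invalid on its own in UTF-8.
ignored = raw.decode("utf-8", "ignore")    # invalid bytes are dropped
replaced = raw.decode("utf-8", "replace")  # invalid bytes become U+FFFD

print(repr(ignored))
print(repr(replaced))
```

With 'ignore' the path collapses to /download/, so it would match the same URL pattern as the clean URL; with 'replace' it would not.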

comment:6 Changed 9 years ago by Malcolm Tredinnick

Resolution: fixed
Status: reopened → closed

It isn't actually a bug that that example works. It's harmless. We can live with this fix.

comment:7 Changed 9 years ago by Malcolm Tredinnick

As an addendum to the previous comment (I hit "post" too fast), the alternative is to automatically return an HTTP 400 status code in this case. But I think what we're doing is a reasonable approach to the problem.

comment:8 Changed 9 years ago by Soeren Sonnenburg <bugreports@…>

Resolution: fixed
Status: closed → reopened

that is exactly what I would have expected - a 404 page...

comment:9 Changed 9 years ago by James Bennett

You absolutely should not get a 404 from that. If you think that the request is bad, the correct status is "HTTP 400 Bad Request".

comment:10 Changed 9 years ago by Malcolm Tredinnick

Thinking about this a lot more, I'm not totally happy with the fix in [6475], but it's a bit of a line-ball. The problem is that although UTF-8 is strongly recommended as the encoding for non-ASCII data, it wasn't actually codified in any spec until quite recently (RFC 2396 leaves things wide open, for example). Only in RFC 3986 were things made clear for IRI to URI encoding.

In the interim, systems were deployed that spit out non-UTF-8 encoded URIs.

So I'm going to commit a change that passes back a 400 response for malformed input (non-UTF-8) but also makes it easier to override the request class, so if somebody is dealing with a legacy system, they can subclass WSGIRequest or ModPythonRequest to handle decoding the URI however they need to.
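The approach in this comment can be sketched roughly as follows, in Python 3. Every name here (`StrictPathRequest`, `Latin1PathRequest`, `decode_path`, `respond`) is hypothetical and chosen for illustration; this is not the code from [6550] and not Django's request API.

```python
from urllib.parse import unquote_to_bytes


class StrictPathRequest:
    """Hypothetical request wrapper that insists on UTF-8 paths."""

    def __init__(self, raw_path):
        self.raw_path = raw_path

    def decode_path(self, raw_bytes):
        # Strict decoding: malformed input raises UnicodeDecodeError.
        return raw_bytes.decode("utf-8")

    @property
    def path(self):
        return self.decode_path(unquote_to_bytes(self.raw_path))


class Latin1PathRequest(StrictPathRequest):
    """Hypothetical subclass for a legacy system that emits Latin-1 URIs."""

    def decode_path(self, raw_bytes):
        return raw_bytes.decode("latin-1")


def respond(raw_path, request_class=StrictPathRequest):
    """Return (status, body); malformed paths become a 400 response."""
    try:
        path = request_class(raw_path).path
    except UnicodeDecodeError:
        return 400, "Bad Request"
    return 200, path


print(respond("/~%A9"))
print(respond("/~%A9", Latin1PathRequest))
print(respond("/%E4%BD%A0%E5%A5%BD"))
```

Keeping the decoding step in one overridable method means a site with legacy URIs only has to swap the request class, while the default stays strict.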

comment:11 Changed 9 years ago by Malcolm Tredinnick

Resolution: fixed
Status: reopened → closed

For some reason, the auto-closer didn't trigger. [6550] has the latest commit for this ticket.

comment:12 Changed 7 years ago by jeremb

Resolution: fixed
Status: closed → reopened

It seems this bug is still present.

See:

http://www.djangoproject.com/你好 -> Error 500

http://www.djangoproject.com/~%A9 -> Empty page

In my opinion, both of the urls should return a 404.

comment:13 Changed 7 years ago by Karen Tracey

Resolution: fixed
Status: reopened → closed

This ticket was fixed; if the original problem had really not been addressed, it would have been reopened quickly, not 2+ years later.

The fact that djangoproject.com generates a 500 server error on some page doesn't necessarily imply a bug in base Django. The application code for the website could equally well be at fault. I tried that url on one of my own projects and it works properly, unless it causes a redirect to the login page, in which case the attempt to send a redirect then fails. I believe this problem is already covered by another open ticket (search the tracker on redirect and non-ASCII or unicode or something).

For the 2nd one I get a 400 status returned, which seems correct to me.

comment:14 Changed 4 years ago by Aymeric Augustin

Easy pickings: unset
Severity: Normal
Type: Uncategorized
UI/UX: unset

#16541 was a duplicate.
