Opened 7 years ago

Closed 5 years ago

Last modified 2 years ago

#5738 closed Uncategorized (fixed)

django fails on defective unicode strings appearing in the url

Reported by: Soeren Sonnenburg <bugreports@…>
Owned by: nobody
Component: HTTP handling
Version: master
Severity: Normal
Keywords:
Cc:
Triage Stage: Accepted
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: yes
Easy pickings: no
UI/UX: no

Description

The problem happens with any Django site (the version does not matter).

The best backtrace one can get is here :-)

http://www.djangoproject.com/~%A9

Attachments (1)

unicode_url_bug.patch (1.1 KB) - added by Armin Ronacher 7 years ago.
fix


Change History (15)

comment:1 Changed 7 years ago by Armin Ronacher

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset

That's quite annoying, especially because it happens outside the debugging system, so it could expose internal information in the mod_python / flup traceback. The fix would be to use an 'ignore' or 'replace' fallback in the unicode conversion.
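To illustrate the failure and the two fallbacks mentioned above, here is a minimal sketch in plain Python 3. It is not the code from the attached patch; the helper name `decode_path` is made up for this example, and it only assumes the path is percent-decoded to bytes and then decoded as UTF-8.

```python
from urllib.parse import unquote_to_bytes

# Percent-decode the path from the report: "%A9" becomes the single byte 0xA9,
# which is not a valid UTF-8 sequence on its own.
raw_path = unquote_to_bytes("/~%A9")


def decode_path(raw, errors="strict"):
    """Hypothetical helper: decode raw path bytes as UTF-8 with the given error mode."""
    return raw.decode("utf-8", errors)


# Strict decoding raises, which is the unhandled error described in this ticket.
try:
    decode_path(raw_path)
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# 'replace' substitutes U+FFFD for the bad byte; 'ignore' silently drops it.
replaced = decode_path(raw_path, "replace")
ignored = decode_path(raw_path, "ignore")
```

Either fallback avoids the exception; 'replace' keeps a visible marker where the bad byte was, while 'ignore' changes the path without leaving a trace.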

Changed 7 years ago by Armin Ronacher

fix

comment:2 Changed 7 years ago by anonymous

  • Has patch set

comment:3 Changed 7 years ago by ubernostrum

  • Component changed from Core framework to HTTP handling
  • Triage Stage changed from Unreviewed to Accepted

comment:4 Changed 7 years ago by adrian

  • Resolution set to fixed
  • Status changed from new to closed

(In [6475]) Fixed #5738 -- Fixed bug with defective Unicode strings in a URL

comment:5 Changed 7 years ago by Soeren Sonnenburg <bugreports@…>

  • Patch needs improvement set
  • Resolution fixed deleted
  • Status changed from closed to reopened

I am not sure whether that was the correct fix, because now things like

http://www.djangoproject.com/d%aao%aaw%aan%aal%aao%aaa%aad%aa/

work too...
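A small sketch of why a URL like this could resolve, assuming the fix decodes the path with the 'ignore' error mode (the thread does not say which fallback [6475] actually used, so that is an assumption). Plain Python 3, not Django code:

```python
from urllib.parse import unquote_to_bytes

# Path portion of the URL from the comment above.
quoted = "/d%aao%aaw%aan%aal%aao%aaa%aad%aa/"

# Each "%aa" becomes a lone 0xAA byte, which is invalid UTF-8 on its own.
raw = unquote_to_bytes(quoted)

# Assumption: the invalid bytes are dropped ('ignore'), leaving only the ASCII letters.
lenient_path = raw.decode("utf-8", "ignore")
```

Under that assumption the junk bytes vanish and the remaining text is an ordinary path, which the URL resolver then matches as usual.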

comment:6 Changed 7 years ago by mtredinnick

  • Resolution set to fixed
  • Status changed from reopened to closed

It isn't actually a bug that that example works. It's harmless. We can live with this fix.

comment:7 Changed 7 years ago by mtredinnick

As an addendum to the previous comment (I hit "post" too fast), the alternative is to automatically return an HTTP 400 status code in this case. But I think what we're doing is a reasonable approach to the problem.

comment:8 Changed 7 years ago by Soeren Sonnenburg <bugreports@…>

  • Resolution fixed deleted
  • Status changed from closed to reopened

that is exactly what I would have expected - a 404 page...

comment:9 Changed 7 years ago by ubernostrum

You absolutely should not get a 404 from that. If you think that the request is bad, the correct status is "HTTP 400 Bad Request".

comment:10 Changed 7 years ago by mtredinnick

Thinking about this a lot more, I'm not totally happy with the fix in [6475], but it's a bit of a line-ball call. The problem is that although UTF-8 is strongly recommended as the encoding for non-ASCII data, it wasn't actually codified in any spec until quite recently (RFC 2396 leaves things wide open, for example). Only in RFC 3986 were things made clear for IRI to URI encoding.

In the interim, systems were deployed that spit out non-UTF-8 encoded URIs.

So I'm going to commit a change that passes back a 400 response for malformed input (non-UTF-8) but also makes it easier to override the request class, so if somebody is dealing with a legacy system, they can subclass WSGIRequest or ModPythonRequest to handle decoding the URI however they need to.
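The approach described here can be sketched roughly as follows. This is a standalone illustration in plain Python 3, not the code committed to Django: `SketchRequest`, `LegacyLatin1Request`, `path_encoding`, and `handle` are all hypothetical names invented for the example, and the real request classes have a different interface.

```python
from urllib.parse import unquote_to_bytes


class SketchRequest:
    """Hypothetical stand-in for a request class; decodes the path strictly as UTF-8."""

    path_encoding = "utf-8"

    def __init__(self, quoted_path):
        self.path = self.decode_path(unquote_to_bytes(quoted_path))

    def decode_path(self, raw):
        # Strict decoding: malformed input raises UnicodeDecodeError.
        return raw.decode(self.path_encoding)


class LegacyLatin1Request(SketchRequest):
    """Hypothetical subclass for a legacy deployment that emits Latin-1 encoded URIs."""

    path_encoding = "latin-1"


def handle(request_class, quoted_path):
    """Return an HTTP status code: 400 for an undecodable path, 200 otherwise."""
    try:
        request_class(quoted_path)
    except UnicodeDecodeError:
        return 400  # Bad Request: the path is not valid in the expected encoding
    return 200


default_status = handle(SketchRequest, "/~%A9")
legacy_status = handle(LegacyLatin1Request, "/~%A9")
legacy_path = LegacyLatin1Request("/~%A9").path
```

With the default class the malformed path is rejected with a 400, while the legacy subclass reads the same byte as a Latin-1 character and carries on. Keeping the decoding in one overridable method is what lets a legacy site opt out without patching the handler.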

comment:11 Changed 7 years ago by mtredinnick

  • Resolution set to fixed
  • Status changed from reopened to closed

For some reason, the auto-closer didn't trigger. [6550] has the latest commit for this ticket.

comment:12 Changed 5 years ago by jeremb

  • Resolution fixed deleted
  • Status changed from closed to reopened

It seems this bug is still present.

See:

http://www.djangoproject.com/你好 -> Error 500

http://www.djangoproject.com/~%A9 -> Empty page

In my opinion, both of the urls should return a 404.

comment:13 Changed 5 years ago by kmtracey

  • Resolution set to fixed
  • Status changed from reopened to closed

This ticket was fixed; if the original problem had really not been addressed, it would have been reopened quickly, not 2+ years later.

The fact that djangoproject.com generates a 500 server error on some page doesn't necessarily imply a bug in base Django. The application code for the website could equally well be at fault. I tried that url on one of my own projects and it works properly, unless it causes a redirect to the login page, in which case the attempt to send a redirect then fails. I believe this problem is already covered by another open ticket (search the tracker on redirect and non-ASCII or unicode or something).

For the 2nd one I get a 400 status returned, which seems correct to me.

comment:14 Changed 2 years ago by aaugustin

  • Easy pickings unset
  • Severity set to Normal
  • Type set to Uncategorized
  • UI/UX unset

#16541 was a duplicate.
