Changes between Version 7 and Version 8 of UnicodeInDjango
- 04/23/06 10:21:58 (8 years ago)
 v7  v8
  9   9  * the HTTPResponse sending machinery needs to do the unicode to DEFAULT_CHARSET translation
 10  10  * the HTTPRequest creation process needs to turn outside strings into unicode strings, using the provided charset (if given) or defaulting to DEFAULT_CHARSET (as that is what was sent to the browser when the form was transmitted)
 11      * special casing: what happens with GET parameters? those don't provide charsets, what should we do if DEFAULT_ENCODING is utf-8, but the GET parameters aren't valid utf-8? The clean way would be to throw an exception (like with all other places, too)
     11  * Special casing: what happens with GET parameters? those don't provide charsets, what should we do if DEFAULT_ENCODING is utf-8, but the GET parameters aren't valid utf-8? The clean way would be to throw an exception (like with all other places, too)
     12  * The current URI spec ([http://www.ietf.org/rfc/rfc3986.txt RFC 3986]) clearly states that all URIs must be encoded according to UTF-8 so we can assume that this is the case. If this causes a !UnicodeDecodeError it makes sense to fall back on windows-1252 or latin-1. Has anyone taken a look at Mark Pilgrim's [http://chardet.feedparser.org/ Universal Encoding Detector]? - Noah Slater
 12  13  * internal usage of str() needs to be checked and supposedly changed over to unicode() usage
 13  14  * debugging stuff needs to use repr() on strings, not str() (or use unicode() and let the HTTP response handling stuff handle the conversion - most debugging stuff is working with the response machinery anyway)
 14  15  * mail sending functions need to do the right thing with the MIME type
 15  16  * we should decide whether to normalize the input unicode data so that at the database or application level we can match strings regardless of their decomposition (see the standard lib’s [http://docs.python.org/lib/module-unicodedata.html unicodedata module] with its `normalize()` function). I would go for NFC, if there’s consensus around normalizing.
 16      * Lazy evaluated method calls do not currently work with Unicode return values, see #1664.
     17  * Lazy evaluated method calls do not currently work with Unicode return values, see #1664.
 17  18
 18  19  Please either complete the above list or add headlines with more detailed discussions of the points above. Please only post results here, discussion should take place on the django-developer list.
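
As a rough illustration of two points from the list above (decoding GET parameters with a fallback, and normalizing input to NFC), here is a minimal sketch. It is written for Python 3, where `str` is already Unicode text, so that it runs as-is. The helper names `decode_query_value` and `normalize_text` are hypothetical and not part of Django. The fallback order follows the suggestion in the list (UTF-8 first, then windows-1252, then latin-1).

```python
import unicodedata
from urllib.parse import unquote_to_bytes

# Assumed fallback order, taken from the suggestion in the list above.
FALLBACK_CHARSETS = ("utf-8", "windows-1252", "latin-1")


def decode_query_value(raw, charsets=FALLBACK_CHARSETS):
    """Percent-decode one GET parameter value and return it as text.

    Hypothetical helper: tries each charset in order and returns the
    first successful decoding.
    """
    # In form-encoded query strings "+" stands for a space.
    data = unquote_to_bytes(raw.replace("+", " "))
    for charset in charsets:
        try:
            return data.decode(charset)
        except UnicodeDecodeError:
            continue
    # Only reachable if the caller passes a list without a catch-all charset.
    raise ValueError("value could not be decoded with any of %r" % (charsets,))


def normalize_text(text, form="NFC"):
    """Normalize text so equal-looking strings compare equal (hypothetical helper)."""
    return unicodedata.normalize(form, text)


if __name__ == "__main__":
    print(decode_query_value("caf%C3%A9"))  # bytes are valid UTF-8
    print(decode_query_value("caf%E9"))     # not valid UTF-8, decoded via fallback
    print(normalize_text("cafe\u0301") == "caf\u00e9")
```

Because latin-1 maps every byte to a character, the default fallback chain never raises. That trades the "clean" behaviour of throwing an exception for leniency, so a strict variant could pass `charsets=("utf-8",)` instead.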