Version 5 (modified by Malcolm Tredinnick, 12 years ago)

Updated current status.

The unicode branch

This branch aims to make Django's internals fully Unicode-aware.

How to get the branch

svn co

See our branch policy for full information on how to use a branch.


The main goals of this branch are:

  • Make it easier for developers to handle non-ASCII character data in Django.
  • Be more consistent in our string handling behaviour inside Django (see StringEncoding for details on this).

Upon completion, you will be able to pass around unicode strings anywhere inside Django (or between Django and developer applications).

Note that we are not trying to force everybody to use only unicode strings. You will also be able to pass around bytestrings, and Django will assume they are UTF-8 encoded (we have to make an assumption because a bytestring carries no record of its own encoding). As a result, a large chunk of existing code that uses Django will continue to work unchanged.
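The UTF-8 assumption can be illustrated with a minimal sketch. The helper name to_text is hypothetical and is not part of Django's API; it is written for a current Python, where bytes and str are distinct types, and only shows the idea of accepting both kinds of string and decoding bytestrings as UTF-8.

```python
def to_text(value, encoding="utf-8"):
    """Return value as a text (Unicode) string.

    Hypothetical helper, not Django's API: bytestrings are decoded using
    the assumed encoding, and text strings pass through unchanged.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return str(value)


# A UTF-8 bytestring and the equivalent text string end up identical.
name_bytes = "café".encode("utf-8")
name_text = "café"
assert to_text(name_bytes) == to_text(name_text) == "café"

# A bytestring in another encoding breaks the assumption, so the caller
# has to decode it explicitly before handing it over.
latin1_bytes = "café".encode("latin-1")
try:
    to_text(latin1_bytes)
except UnicodeDecodeError:
    decoded = to_text(latin1_bytes, encoding="latin-1")
    assert decoded == "café"
```

Under this sketch, code that already passes UTF-8 bytestrings keeps working, while data in any other encoding must be decoded by the caller, since nothing in the bytes themselves identifies the encoding.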


The branch was created on April 7, 2007.

Todo Items

The various pieces will be converted in roughly the following order:

  1. Template rendering (Done in [4971])
  2. Database I/O (Done in [4971] for postgresql, postgresql_psycopg2, mysql, mysql_old and sqlite backends)
    • Needs testing for servers/tables that are not in UTF-8 or ASCII encoding. In theory, the client connection for each backend automatically converts everything to UTF-8 or Unicode objects (depending on the backend), but this needs verifying. (Ivan Sagalaev and Malcolm have tested this feature a fair bit with various database servers, but more stress tests would be nice.)
  3. Model class support (Done in [5057])
  4. Form input encoding (Done in [5192])
  5. Other output methods (if necessary; unchecked as yet)
    • syndication
    • serialisation
    • Google sitemaps
  6. Other contrib modules
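
The kind of round-trip check that item 2 asks for might look like the following sketch. It uses Python's standard sqlite3 module with an in-memory database purely for illustration; the table name, function name and sample strings are assumptions, and it says nothing about servers or tables configured with a non-UTF-8 encoding, which is where the real testing is needed.

```python
import sqlite3


def round_trip(text):
    """Store a text string in a throwaway SQLite table and read it back."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE person (name TEXT)")
        # Parameter binding lets the driver handle the encoding.
        conn.execute("INSERT INTO person (name) VALUES (?)", (text,))
        (stored,) = conn.execute("SELECT name FROM person").fetchone()
        return stored
    finally:
        conn.close()


# Illustrative samples: plain ASCII, accented Latin, Cyrillic and CJK text.
samples = ["ASCII only", "café", "Иван", "日本語"]
for sample in samples:
    assert round_trip(sample) == sample
```

A comparable test against each of the other backends, run on a server whose default encoding is not UTF-8, would exercise the conversion the client connection is expected to perform.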

We also need to look at the i18n support functions (in django.utils.translation):

  • Decide on usage of gettext() versus ugettext() in a number of places.
  • Look at rewriting gettext_lazy() so that it acts as a better string and unicode proxy.
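
To make the idea of a lazy string proxy concrete, here is a minimal sketch of the general technique. It is not how gettext_lazy() is implemented; the class LazyText, the catalogs and the translate functions are all hypothetical names invented for this example.

```python
class LazyText:
    """Hypothetical proxy that defers a lookup until the text is needed."""

    def __init__(self, func, *args):
        self._func = func
        self._args = args

    def __str__(self):
        # The wrapped function runs on every conversion, so the result
        # reflects whatever language is active at that moment.
        return self._func(*self._args)

    def __eq__(self, other):
        return str(self) == str(other)

    def __hash__(self):
        return hash(str(self))


# Toy message catalogs standing in for real translation files.
CATALOGS = {"en": {}, "de": {"Welcome": "Willkommen"}}
state = {"language": "en"}


def translate(message):
    """Look the message up in the active catalog, falling back to itself."""
    return CATALOGS[state["language"]].get(message, message)


def translate_lazy(message):
    """Return a proxy instead of translating immediately."""
    return LazyText(translate, message)


greeting = translate_lazy("Welcome")   # no lookup has happened yet
state["language"] = "de"               # language chosen later, e.g. per request
assert str(greeting) == "Willkommen"
```

A proxy like this only behaves as a string where it is explicitly converted. Acting as a "better" proxy would mean supporting more of the operations that callers expect from real strings.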

Finally, some documentation needs to be written describing good practices for creating unicode-aware Django apps.
