Changeset 5597

Timestamp: 07/03/07 13:29:56 (1 year ago)
Author: adrian
Message: unicode: Made some documentation edits and inconsequential typo fixes throughout code

Files:

  • django/branches/unicode/django/contrib/contenttypes/models.py

    r5255 r5597  
    1515            ct = CONTENT_TYPE_CACHE[key] 
    1616        except KeyError: 
    17             # The unicode() is needed around opts.verbose_name because it might 
     17            # The smart_unicode() is needed around opts.verbose_name_raw because it might 
    1818            # be a django.utils.functional.__proxy__ object. 
    1919            ct, created = self.model._default_manager.get_or_create(app_label=key[0], 
  • django/branches/unicode/django/contrib/syndication/feeds.py

    r5251 r5597  
    88def add_domain(domain, url): 
    99    if not url.startswith('http://'): 
    10         # 'url' must already be ASCII and URL-quoted, so no need for encodign 
     10        # 'url' must already be ASCII and URL-quoted, so no need for encoding 
    1111        # conversions here. 
    1212        url = u'http://%s%s' % (domain, url) 
  • django/branches/unicode/django/http/__init__.py

    r5580 r5597  
    5151        """ 
    5252        Sets the encoding used for GET/POST accesses. If the GET or POST 
    53         dictionary has already been created it is removed and recreated on the 
     53        dictionary has already been created, it is removed and recreated on the 
    5454        next access (so that it is decoded correctly). 
    5555        """ 
     
    102102 
    103103    Values retrieved from this class are converted from the default encoding to 
    104     unicode (this is done on retrieval, rather than input to avoid breaking 
     104    unicode (this is done on retrieval, rather than input, to avoid breaking 
    105105    references or mutating referenced objects). 
    106106    """ 
  • django/branches/unicode/django/template/defaultfilters.py

    r5461 r5597  
    117117def slugify(value): 
    118118    "Converts to lowercase, removes non-alpha chars and converts spaces to hyphens" 
    119     # Don't compile patterns as unicode because \w then would mean any letter. Slugify is effectively an asciiization. 
     119    # Don't compile patterns as unicode because \w then would mean any letter. 
     120    # Slugify is effectively a conversion to ASCII. 
    120121    value = re.sub('[^\w\s-]', '', value).strip().lower() 
    121122    return re.sub('[-\s]+', '-', value) 
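
Reviewer note: the comment in this hunk concerns Python 2, where a pattern compiled without the Unicode flag makes ``\w`` match only ASCII letters, digits and the underscore. The sketch below shows the same effect on Python 3, under the assumption that ``re.ASCII`` is a fair stand-in for "not compiled as unicode". ``slugify_ascii`` is a hypothetical name used for illustration, not the function touched by this changeset.

```python
import re


def slugify_ascii(value):
    """Hypothetical Python 3 analog of the slugify() filter shown above."""
    # re.ASCII restricts \w and \s to their ASCII meanings, so accented
    # letters are dropped instead of being kept as "word" characters.
    value = re.sub(r'[^\w\s-]', '', value, flags=re.ASCII).strip().lower()
    # Collapse runs of hyphens and whitespace into a single hyphen.
    return re.sub(r'[-\s]+', '-', value, flags=re.ASCII)


print(slugify_ascii("Hello,  World!"))
print(slugify_ascii("Crème brûlée"))
```

Without ``re.ASCII``, Python 3 treats ``\w`` as any Unicode word character, so the accented letters would survive into the slug.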
  • django/branches/unicode/docs/i18n.txt

    r5340 r5597  
    6969~~~~~~~~~~~~~~~~~~~~ 
    7070 
    71 Specify a translation string by using the function ``ugettext()``. Since you 
    72 may well be typing this a lot, it's often worthwhile importing it as a shorter 
    73 alias and ``_`` is a very common choice. 
     71Specify a translation string by using the function ``ugettext()``. It's 
     72convention to import this as a shorter alias, ``_``, to save typing. 
    7473 
    7574.. note:: 
     
    7877    not to follow this practice, for a couple of reasons: 
    7978 
    80       1. For international character set (unicode) support, you really wanting 
    81          to be using ``ugettext()``, rather than ``gettext()``. Sometimes, you 
    82          should be using ``ugettext_lazy()`` as the default translation method 
    83          for a particular file. By not installing ``_`` directly, the 
    84          developer has to think about which is the most appropriate function 
    85          to use. 
    86  
    87       2. Python's interactive shell uses ``_`` to represent "the previous 
    88          result". This is also used in doctest tests and having ``_()`` causes 
    89          interference. Explicitly importing ``ugettext()`` as ``_()`` avoids 
    90          this problem. 
     79      1. For international character set (Unicode) support, ``ugettext()`` is 
     80         more useful than ``gettext()``. Sometimes, you should be using 
     81         ``ugettext_lazy()`` as the default translation method for a particular 
     82         file. Without ``_()`` in the global namespace, the developer has to 
     83         think about which is the most appropriate translation function. 
     84 
     85      2. The underscore character (``_``) is used to represent "the previous 
     86         result" in Python's interactive shell and doctest tests. Installing a 
     87         global ``_()`` function causes interference. Explicitly importing 
     88         ``ugettext()`` as ``_()`` avoids this problem. 
    9189 
    9290In this example, the text ``"Welcome to my site."`` is marked as a translation 
     
    9997        return HttpResponse(output) 
    10098 
    101 Obviously you could code this without using the alias. This example is 
     99Obviously, you could code this without using the alias. This example is 
    102100identical to the previous one:: 
    103101 
     
    301299 
    302300Using ``ugettext_lazy()`` and ``ungettext_lazy()`` to mark strings in models 
    303 and utility functions is a common operation. When you are working with these 
     301and utility functions is a common operation. When you're working with these 
    304302objects elsewhere in your code, you should ensure that you don't accidentally 
    305303convert them to strings, because they should be converted as late as possible 
     
    329327-------------------------- 
    330328 
    331 There are a lot of useful utility functions in Django (particularly in 
    332 ``django.utils``) that take a string as their first argument and do something 
    333 to that string. These functions are used by template filters as well as 
    334 directly in other code. 
    335  
    336 If you write your own similar functions, you will rapidly come across the 
    337 problem of what to do when the first argument is a lazy translation object. 
    338 You don't want to convert it to a string immediately, because you may be using 
    339 this function outside of a view (and hence the current thread's locale setting 
    340 will not be correct). For cases like this, the 
    341 ``django.utils.functional.allow_lazy()`` decorator will be useful. It modifies 
    342 the function so that *if* it is called with a lazy translation as the first 
    343 argument, the function evaluation is delayed until it needs to be converted to 
    344 a string. 
     329Django offers many utility functions (particularly in ``django.utils``) that 
     330take a string as their first argument and do something to that string. These 
     331functions are used by template filters as well as directly in other code. 
     332 
     333If you write your own similar functions and deal with translations, you'll  
     334face the problem of what to do when the first argument is a lazy translation 
     335object. You don't want to convert it to a string immediately, because you might 
     336be using this function outside of a view (and hence the current thread's locale 
     337setting will not be correct). 
     338 
     339For cases like this, use the  ``django.utils.functional.allow_lazy()`` 
     340decorator. It modifies the function so that *if* it's called with a lazy 
     341translation as the first argument, the function evaluation is delayed until it 
     342needs to be converted to a string. 
    345343 
    346344For example:: 
     
    354352 
    355353The ``allow_lazy()`` decorator takes, in addition to the function to decorate, 
    356 a number of extra arguments specifying the type(s) that the original function 
    357 can return. Usually, it will be enough to just include ``unicode`` here and 
    358 ensure that your function returns Unicode strings. 
     354a number of extra arguments (``*args``) specifying the type(s) that the 
     355original function can return. Usually, it's enough to include ``unicode`` here 
     356and ensure that your function returns only Unicode strings. 
    359357 
    360358Using this decorator means you can write your function and assume that the 
  • django/branches/unicode/docs/templates.txt

    r5531 r5597  
    10451045 
    10461046Converts an IRI (Internationalized Resource Identifier) to a string that is 
    1047 suitable for including in a URL. This is necessary if you are trying to use 
     1047suitable for including in a URL. This is necessary if you're trying to use 
    10481048strings containing non-ASCII characters in a URL. 
    10491049 
    1050 You can use this filter after you have used the ``urlencode`` filter on a 
    1051 string, without harm
     1050It's safe to use this filter on a string that has already gone through the 
     1051``urlencode`` filter
    10521052 
    10531053join 
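
Reviewer note: the reworded sentence says the filter is safe to apply to a string that has already been through ``urlencode``. One way to see why is that a conversion which leaves ``%`` untouched cannot double-escape existing percent-escapes. The sketch below is a Python 3 approximation built on the standard library. The ``SAFE`` character set and the name ``iri_to_uri_sketch`` are assumptions for illustration, not the code behind the filter.

```python
from urllib.parse import quote

# Assumed set of characters left untouched; it includes "%" so that
# existing percent-escapes are not escaped a second time.
SAFE = "/#%[]=:;$&()+,!?*@'~"


def iri_to_uri_sketch(iri):
    """Hypothetical stand-in: percent-encode non-ASCII text as UTF-8."""
    return quote(iri, safe=SAFE)


once = iri_to_uri_sketch("/blog/Å/")
twice = iri_to_uri_sketch(once)
print(once, twice == once)
```

Applying the function a second time returns the same string, which is the property the documentation sentence relies on.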
  • django/branches/unicode/docs/tutorial01.txt

    r5580 r5597  
    493493admin. 
    494494 
    495 .. admonition:: Why ``__unicode__`` and not ``__str__``? 
    496  
    497     If you are wondering why we add a ``__unicode__()`` method, rather than a 
    498     simple ``__str__()`` method, it is because Django models will contain 
    499     unicode strings by default. The values returned from the database, for 
    500     example, are all unicode strings. In most cases, your code should be 
    501     prepared to handle non-ASCII characters and this is a litle fiddly in 
    502     ``__str__()`` methods, since you have to worry about which encoding to 
    503     use, amongst other things. If you create a ``__unicode__()`` method, 
    504     Django will provide a ``__str__()`` method that calls your 
    505     ``__unicode__()`` and then converts the result to UTF-8 strings when 
    506     required. So ``unicode(p)`` will return a unicode string and ``str(p)`` 
    507     will return a normal string, with the characters encoded as UTF-8 when 
    508     necessary.. 
     495.. admonition:: Why ``__unicode__()`` and not ``__str__()``? 
     496 
     497        If you're familiar with Python, you might be in the habit of adding 
     498        ``__str__()`` methods to your classes, not ``__unicode__()`` methods. 
     499    We use ``__unicode__()`` here because Django models deal with Unicode by 
     500    default. All data stored in your database is converted to Unicode when it's 
     501    returned. 
     502 
     503        Django models have a default ``__str__()`` method that calls ``__unicode__()`` 
     504        and converts the result to a UTF-8 bytestring. This means that ``unicode(p)`` 
     505        will return a Unicode string, and ``str(p)`` will return a normal string, 
     506        with characters encoded as UTF-8. 
     507 
     508        If all of this is gibberish to you, just remember to add ``__unicode__()`` 
     509        methods to your models. With any luck, things should Just Work for you. 
    509510 
    510511Note these are normal Python methods. Let's add a custom method, just for 
  • django/branches/unicode/docs/unicode.txt

    r5342 r5597  
    99templates, models and the database. 
    1010 
    11 This files describes some things to be aware of if you are writing applications 
    12 which do not only use ASCII-encoded data
     11This document tells you what you need to know if you're writing applications 
     12that use data or templates that are encoded in something other than ASCII
    1313 
    1414Creating the database 
    1515===================== 
     16 
    1617Make sure your database is configured to be able to store arbitrary string 
    1718data. Normally, this means giving it an encoding of UTF-8 or UTF-16. If you use 
    18 a more restrictive encoding -- for example, latin1 (iso8859-1) -- there will be 
    19 some characters that you cannot store in the database and information will be 
    20 lost. 
    21  
    22  * For MySQL users, refer to the `MySQL manual`_ (section 10.3.2 for MySQL 5.1) 
    23    for details on how to set or alter the database character set encoding. 
    24  
    25  * For PostgreSQL users, refer to the `PostgreSQL manual`_ (section 21.2.2 in 
     19a more restrictive encoding -- for example, latin1 (iso8859-1) -- you won't be 
     20able to store certain characters in the database, and information will be lost. 
     21 
     22 * MySQL users, refer to the `MySQL manual`_ (section 10.3.2 for MySQL 5.1) for 
     23   details on how to set or alter the database character set encoding. 
     24 
     25 * PostgreSQL users, refer to the `PostgreSQL manual`_ (section 21.2.2 in 
    2626   PostgreSQL 8) for details on creating databases with the correct encoding. 
    2727 
    28  * For SQLite users, there is nothing you need to do. SQLite always uses UTF-8 
     28 * SQLite users, there is nothing you need to do. SQLite always uses UTF-8 
    2929   for internal encoding. 
    3030 
     
    3838handled transparently. 
    3939 
     40For more, see the section "The database API" below. 
     41 
    4042General string handling 
    4143======================= 
    4244 
    43 Whenever you use strings with Django, you have two choices. You can use Unicode 
    44 strings or you can use normal strings (sometimes called bytestrings) that are 
    45 encoded using UTF-8. 
     45Whenever you use strings with Django -- e.g., in database lookups, template 
     46rendering or anywhere else -- you have two choices for encoding those strings. 
     47You can use Unicode strings, or you can use normal strings (sometimes called 
     48"bytestrings") that are encoded using UTF-8. 
    4649 
    4750.. warning:: 
    48     A bytestring does not carry any information with it about its encoding. So 
    49     we have to make an assumption and Django assumes that all bytestrings are 
    50     in UTF-8. If you pass a string to Django that has been encoded in some 
    51     other format, things will go wrong in interesting ways. Usually Django will 
    52     raise a UnicodeDecodeError at some point. 
    53  
    54 If your code only uses ASCII data, you are quite safe to simply use your normal 
    55 strings (since ASCII is a subset of UTF-8) and pass them around at will. 
    56  
    57 Do not be fooled into thinking that if your ``DEFAULT_CHARSET`` setting is set 
    58 to something other than ``utf-8`` you can use that encoding in your 
    59 bytestrings!  The ``DEFAULT_CHARSET`` only applies to the strings generated as 
    60 the result of template rendering (and email). Django will always assume UTF-8 
     51    A bytestring does not carry any information with it about its encoding. 
     52    For that reason, we have to make an assumption, and Django assumes that all 
     53    bytestrings are in UTF-8. 
     54 
     55    If you pass a string to Django that has been encoded in some other format, 
     56    things will go wrong in interesting ways. Usually, Django will raise a 
     57    ``UnicodeDecodeError`` at some point. 
     58 
     59If your code only uses ASCII data, it's safe to use your normal strings, 
     60passing them around at will, because ASCII is a subset of UTF-8. 
     61 
     62Don't be fooled into thinking that if your ``DEFAULT_CHARSET`` setting is set 
     63to something other than ``'utf-8'`` you can use that other encoding in your 
     64bytestrings! ``DEFAULT_CHARSET`` only applies to the strings generated as 
     65the result of template rendering (and e-mail). Django will always assume UTF-8 
    6166encoding for internal bytestrings. The reason for this is that the 
    6267``DEFAULT_CHARSET`` setting is not actually under your control (if you are the 
    63 application developer). It is under the control of the person installing and 
    64 using your application and if they choose a different setting, your code must 
    65 still continue to work. Ergo, it cannot rely on that setting. 
     68application developer). It's under the control of the person installing and 
     69using your application -- and if that person chooses a different setting, your 
     70code must still continue to work. Ergo, it cannot rely on that setting. 
    6671 
    6772In most cases when Django is dealing with strings, it will convert them to 
    68 Unicode strings before doing anything else. So if you pass in a bytestring, be 
    69 prepared to receive a Unicode string back in the result. 
    70  
    71 .. _lazy translation: 
     73Unicode strings before doing anything else. So, as a general rule, if you pass 
     74in a bytestring, be prepared to receive a Unicode string back in the result. 
    7275 
    7376Translated strings 
    7477------------------ 
    7578 
    76 There is actually a third type of string-like object you may encounter when 
    77 using Django. If you are using the internationalization features of Django, 
    78 there is the concept of a "lazy translation". This is a string that has been 
    79 marked as translated, but the actual result is not determined until the object 
    80 is used in a string. This is useful because the locale that should be used for 
    81 the translation will not be known until the string is used, even though the 
    82 string might have originally been created when the code was first imported. 
     79Aside from Unicode strings and bytestrings, there's a third type of string-like 
     80object you may encounter when using Django. The framework's 
     81internationalization features introduce the concept of a "lazy translation" -- 
     82a string that has been marked as translated but whose actual translation result 
     83isn't determined until the object is used in a string. This feature is useful 
     84in cases where the translation locale is unknown until the string is used, even 
     85though the string might have originally been created when the code was first 
     86imported. 
    8387 
    8488Normally, you won't have to worry about lazy translations. Just be aware that 
    8589if you examine an object and it claims to be a 
    8690``django.utils.functional.__proxy__`` object, it is a lazy translation. 
    87 Calling ``unicode()`` with the translation as the argument will generate a 
    88 string in the current locale. 
     91Calling ``unicode()`` with the lazy translation as the argument will generate a 
     92Unicode string in the current locale. 
    8993 
    9094For more details about lazy translation objects, refer to the 
     
    9397.. _internationalization: ../i18n/#lazy-translation 
    9498 
    95 .. _utility functions: 
    96  
    9799Useful utility functions 
    98100------------------------ 
    99101 
    100 Since some string operations come up again and again, Django ships with a few 
    101 useful functions that should make working with unicode and bytestring objects 
     102Because some string operations come up again and again, Django ships with a few 
     103useful functions that should make working with Unicode and bytestring objects 
    102104a bit easier. 
    103105 
     
    106108 
    107109The ``django.utils.encoding`` module contains a few functions that are handy 
    108 for converting back and forth between unicode and bytestrings. 
     110for converting back and forth between Unicode and bytestrings. 
    109111 
    110112    * ``smart_unicode(s, encoding='utf-8', errors='strict')`` converts its 
    111       input to unicode string. The ``encoding`` parameter specifies the input 
    112       encoding of any bytestring -- Django uses this internally when 
    113       processing form input data, for example, which might not be UTF-8 
    114       encoded. The ``errors`` parameter takes any of the values that are 
    115       accepted by Python's ``unicode()`` function for its error handling. 
     113      input to a Unicode string. The ``encoding`` parameter specifies the input 
     114      encoding. (For example, Django uses this internally when processing form 
     115      input data, which might not be UTF-8 encoded.) The ``errors`` parameter 
     116      takes any of the values that are accepted by Python's ``unicode()`` 
     117      function for its error handling. 
    116118 
    117119      If you pass ``smart_unicode()`` an object that has a ``__unicode__`` 
     
    120122    * ``force_unicode(s, encoding='utf-8', errors='strict')`` is identical to 
    121123      ``smart_unicode()`` in almost all cases. The difference is when the 
    122       first argument is a `lazy translation`_ instance. Whilst 
     124      first argument is a `lazy translation`_ instance. While 
    123125      ``smart_unicode()`` preserves lazy translations, ``force_unicode()`` 
    124       forces those objects to a unicode string (causing the translation to 
    125       occur). Normally, you will want to use ``smart_unicode()``. However, 
    126       ``force_unicode()`` is useful in filters and template tags when you 
    127       absolutely must have a string to work with, not just something that can 
     126      forces those objects to a Unicode string (causing the translation to 
     127      occur). Normally, you'll want to use ``smart_unicode()``. However, 
     128      ``force_unicode()`` is useful in template tags and filters that 
     129      absolutely *must* have a string to work with, not just something that can 
    128130      be converted to a string. 
    129131 
    130132    * ``smart_str(s, encoding='utf-8', strings_only=False, errors='strict')`` 
    131133      is essentially the opposite of ``smart_unicode()``. It forces the first 
    132       argument to a string. The ``strings_only`` parameter, if set to True, 
     134      argument to a bytestring. The ``strings_only`` parameter, if set to True, 
    133135      will result in Python integers, booleans and ``None`` not being 
    134136      converted to a string (they keep their original types). This is slightly 
    135137      different semantics from Python's builtin ``str()`` function, but the 
    136       difference is needed in a few places internally. 
    137  
    138 Normally, you will only need to use ``smart_unicode()``. Call it as early as 
    139 possible on any input data that might be either a unicode or bytestring and 
    140 from then on you can treat the result as always being unicode. 
    141  
    142 .. _uri_and_iri: 
     138      difference is needed in a few places within Django's internals. 
     139 
     140Normally, you'll only need to use ``smart_unicode()``. Call it as early as 
     141possible on any input data that might be either Unicode or a bytestring, and 
     142from then on, you can treat the result as always being Unicode. 
    143143 
    144144URI and IRI handling 
     
    147147Web frameworks have to deal with URLs (which are a type of URI_). One 
    148148requirement of URLs is that they are encoded using only ASCII characters. 
    149 However, in an international environment, you will often need to construct a 
    150 URL from an IRI_ (very loosely speaking, a URI that can contain unicode 
    151 characters). Getting the quoting and conversion from IRI to URI correct can be 
    152 a little tricky, so Django provides some assistance. 
     149However, in an international environment, you might need to construct a 
     150URL from an IRI_ -- very loosely speaking, a URI that can contain Unicode 
     151characters. Quoting and converting an IRI to URI can be a little tricky, so 
     152Django provides some assistance. 
    153153 
    154154    * The function ``django.utils.encoding.iri_to_uri()`` implements the 
     
    159159      ``django.utils.http.urlquote_plus()`` are versions of Python's standard 
    160160      ``urllib.quote()`` and ``urllib.quote_plus()`` that work with non-ASCII 
    161       characters (the data is converted to UTF-8 prior to encoding). 
    162  
    163 These two groups of functions have slightly different purposes and it i
     161      characters. (The data is converted to UTF-8 prior to encoding.) 
     162 
     163These two groups of functions have slightly different purposes, and it'
    164164important to keep them straight. Normally, you would use ``urlquote()`` on the 
    165165individual portions of the IRI or URI path so that any reserved characters 
     
    169169 
    170170.. note:: 
    171     It isn't completely correct to say that ``iri_to_uri()`` implements the 
    172     full algorithm in the IRI specification. It does not perform the 
    173     international domain name encoding portion of the algorithm (at the 
    174     moment). 
     171    Technically, it isn't correct to say that ``iri_to_uri()`` implements the 
     172    full algorithm in the IRI specification. It doesn't (yet) perform the 
     173    international domain name encoding portion of the algorithm. 
    175174 
    176175The ``iri_to_uri()`` function will not change ASCII characters that are 
     
    209208====== 
    210209 
    211 Because all strings are returned from the database as unicode strings, model 
     210Because all strings are returned from the database as Unicode strings, model 
    212211fields that are character based (CharField, TextField, URLField, etc) will 
    213 contain unicode values when Django retrieves the model from the database. This 
    214 is always the case, even if the data could fit into an ASCII string. 
    215  
    216 As always, you can pass in bytestrings when creating a model or populating a 
    217 field and Django will convert it to unicode when it needs to. 
     212contain Unicode values when Django retrieves data from the database. This 
     213is *always* the case, even if the data could fit into an ASCII bytestring. 
     214 
     215You can pass in bytestrings when creating a model or populating a field, and 
     216Django will convert it to Unicode when it needs to. 
    218217 
    219218Choosing between ``__str__()`` and ``__unicode__()`` 
    220 ----------------------------------------------------- 
    221  
    222 One consequence of using unicode by default is that you have to take some care 
    223 when printing data from the model. In particular, rather than writing a 
    224 ``__str__()`` method, it is recommended to write a ``__unicode__()`` method for 
    225 your model. In the ``__unicode__()`` method, you can quite safely return the 
    226 values of all your fields without having to worry about whether they fit into a 
    227 bytestring or not (the result of ``__str__()`` is *always* a bytestring, even 
    228 if you accidentally try to return a unicode object). 
    229  
    230 You can still create a ``__str__()`` method on your models if you wish, of 
    231 course. However, Django's ``Model`` base class automatically provides you with 
    232 a ``__str__()`` method that calls your ``__unicode__()`` method and then 
    233 encodes the result correctly into UTF-8. So you would normally only create a 
    234 ``__unicode__()`` method and let Django handle the coercion to a bytestring 
    235 when required. 
     219---------------------------------------------------- 
     220 
     221One consequence of using Unicode by default is that you have to take some care 
     222when printing data from the model. 
     223 
     224In particular, rather than giving your model a ``__str__()`` method, we 
     225recommend you implement a ``__unicode__()`` method. In the ``__unicode__()`` 
     226method, you can quite safely return the values of all your fields without 
     227having to worry about whether they fit into a bytestring or not. (The way 
     228Python works, the result of ``__str__()`` is *always* a bytestring, even if you 
     229accidentally try to return a Unicode object). 
     230 
     231You can still create a ``__str__()`` method on your models if you want, of 
     232course, but you shouldn't need to do this unless you have a good reason. 
     233Django's ``Model`` base class automatically provides a ``__str__()`` 
     234implementation that calls ``__unicode__()`` and encodes the result into UTF-8. 
     235This means you'll normally only need to implement a ``__unicode__()`` method 
     236and let Django handle the coercion to a bytestring when required. 
    236237 
    237238Taking care in ``get_absolute_url()`` 
    238239------------------------------------- 
    239240 
    240 URLs can only contain ASCII characters. If you are constructing a URL from 
    241 pieces of data that might be non-ASCII, you must be careful to encode the 
    242 results in a way that is suitable for a URL. If you are using the 
    243 ``django.db.models.permalink()`` decorator, this is handled automatically by 
    244 the decorator. 
    245  
    246 If you are constructing the URL manually, you need to take care of the 
    247 encoding yourself. Normally, this would involve a combination of the 
    248 ``iri_to_uri()`` and ``urlquote()`` functions that were documented above_. For 
    249 example:: 
     241URLs can only contain ASCII characters. If you're constructing a URL from 
     242pieces of data that might be non-ASCII, be careful to encode the results in a 
     243way that is suitable for a URL. The ``django.db.models.permalink()`` decorator 
     244handles this for you automatically. 
     245 
     246If you're constructing a URL manually (i.e., *not* using the ``permalink()`` 
     247decorator), you'll need to take care of the encoding yourself. In this case, 
     248use the ``iri_to_uri()`` and ``urlquote()`` functions that were documented 
     249above_. For example:: 
    250250 
    251251    from django.utils.encoding import iri_to_uri 
     
    266266================ 
    267267 
    268 You can happily pass unicode strings or bytestrings as arguments to 
     268You can pass either Unicode strings or UTF-8 bytestrings as arguments to 
    269269``filter()`` methods and the like in the database API. The following two 
    270270querysets are identical:: 
     
    273273    qs = People.objects.filter(name__contains='\xc3\x85') # UTF-8 encoding of Å 
    274274 
    275  
    276275Templates 
    277276========= 
    278277 
    279 As usual, templates can be created from unicode or bytestrings. However, they 
    280 can also be created by reading a file from disk and this creates a slight 
    281 complication: not all filesystems store their data encoded as UTF-8. If your 
    282 template files are not stored with a UTF-8 encoding, set the ``FILE_CHARSET`` 
    283 setting to the encoding of the on-disk files. When Django reads in a template 
    284 file it will convert the data from this encoding to unicode. 
    285  
    286 When a template is rendered for sending out as an HTML document or an e-mail, 
    287 it may be convenient to use an encoding other than UTF-8. You should set the 
    288 ``DEFAULT_CHARSET`` parameter to control the rendered template encoding (the 
    289 default setting is utf-8). 
     278You can use either Unicode or bytestrings when creating templates manually:: 
     279 
     280        from django.template import Template 
     281        t1 = Template('This is a bytestring template.') 
     282        t2 = Template(u'This is a Unicode template.') 
     283 
     284But the common case is to read templates from the filesystem, and this creates 
     285a slight complication: not all filesystems store their data encoded as UTF-8. 
     286If your template files are not stored with a UTF-8 encoding, set the ``FILE_CHARSET`` 
     287setting to the encoding of the files on disk. When Django reads in a template 
     288file, it will convert the data from this encoding to Unicode. (``FILE_CHARSET`` 
     289is set to ``'utf-8'`` by default.) 
     290 
     291The ``DEFAULT_CHARSET`` setting controls the encoding of rendered templates. 
     292This is set to UTF-8 by default. 
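
The two settings can be pictured as the two ends of a pipeline: bytes on disk are decoded with ``FILE_CHARSET``, and rendered text is encoded with ``DEFAULT_CHARSET``. The sketch below imitates that flow in plain Python. It is not Django's loader code: the helpers ``read_template_source`` and ``encode_output`` are hypothetical, and the module-level constants only stand in for the real settings.

```python
import os
import tempfile

FILE_CHARSET = 'iso-8859-1'   # assumed encoding of the template file on disk
DEFAULT_CHARSET = 'utf-8'     # assumed encoding for rendered output


def read_template_source(path, file_charset=FILE_CHARSET):
    """Hypothetical helper: read raw bytes and decode them to Unicode."""
    with open(path, 'rb') as f:
        return f.read().decode(file_charset)


def encode_output(text, charset=DEFAULT_CHARSET):
    """Hypothetical helper: encode rendered Unicode text for the client."""
    return text.encode(charset)


# Write a Latin-1 encoded file, then run it through both steps.
fd, path = tempfile.mkstemp(suffix='.html')
with os.fdopen(fd, 'wb') as f:
    f.write(u'Hej, \xc5sa!'.encode('iso-8859-1'))

source = read_template_source(path)   # Unicode text, whatever the disk encoding
output = encode_output(source)        # bytes in the output encoding
os.remove(path)
```

If the wrong ``FILE_CHARSET`` were used here, the decode step would either fail or silently produce the wrong characters, which is why the setting has to match the files on disk.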
    290293 
    291294Template tags and filters 
     
    300303      places. Tag rendering and filter calls occur as the template is being 
    301304      rendered, so there is no advantage to postponing the conversion of lazy 
    302       transation objects into strings any longer. It is easier to work solely 
    303       with Unicode strings at this point. 
     305      translation objects into strings. It's easier to work solely with Unicode 
     306      strings at that point. 
    304307 
    305308E-mail 
    306309====== 
    307310 
    308 Django's email framework (in ``django.core.mail``) supports unicode 
    309 transparently. You can use unicode data in the message bodies and any headers. 
    310 However, you must still respect the requirements of the email specifications, 
    311 so, for example, email addresses should use ASCII characters. The following 
    312 code is certainly possible (demonstrating the everything except e-mail 
    313 addresses can be non-ASCII):: 
     311Django's e-mail framework (in ``django.core.mail``) supports Unicode 
     312transparently. You can use Unicode data in the message bodies and any headers. 
     313However, you're still obligated to respect the requirements of the e-mail 
     314specifications, so, for example, e-mail addresses should use only ASCII 
     315characters. 
     316 
     317The following code example demonstrates that everything except e-mail addresses 
     318can be non-ASCII:: 
    314319 
    315320    from django.core.mail import EmailMessage 
     
    321326    EmailMessage(subject, body, sender, recipients).send() 
    322327 
    323  
    324328Form submission 
    325329=============== 
    326330 
    327 HTML form submission is a tricky area. There is no guarantee that the 
    328 submission will include encoding information. 
     331HTML form submission is a tricky area. There's no guarantee that the 
     332submission will include encoding information, which means the framework might 
     333have to guess at the encoding of submitted data. 
    329334 
    330335Django adopts a "lazy" approach to decoding form data. The data in an 
     
    332337the data is not decoded at all. Only the ``HttpRequest.GET`` and 
    333338``HttpRequest.POST`` data structures have any decoding applied to them. Those 
    334 two fields will return their members as unicode data. All other members will 
    335 be returned exactly as they were submitted by the client. 
     339two fields will return their members as Unicode data. All other attributes and 
     340methods of ``HttpRequest`` return data exactly as it was submitted by the 
     341client. 
    336342 
    337343By default, the ``DEFAULT_CHARSET`` setting is used as the assumed encoding 
     
    347353 
    348354You can even change the encoding after having accessed ``request.GET`` or 
    349 ``request.POST`` and all subsequent accesses will use the new encoding. 
    350  
    351 It will typically be very rare that you would need to worry about changing the 
    352 form encoding. However, if you are talking to a legacy system or a system 
    353 beyond your control with particular ideas about encoding, you do have a way to 
    354 control the decoding of the data. 
    355  
    356 For request features such as file uploads, no automatic decoding takes place, 
    357 because those attributes are normally treated as collections of bytes, rather 
    358 than strings. Any decoding would alter the meaning of the stream of bytes. 
    359  
     355``request.POST``, and all subsequent accesses will use the new encoding. 
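
One way to picture this behavior is a decoded cache that is thrown away whenever the encoding changes. The class below is a toy stand-in written for illustration, not Django's ``HttpRequest``; the name ``LazyRequest`` and its internals are assumptions.

```python
class LazyRequest(object):
    """Toy stand-in for a request object; not Django's HttpRequest."""

    def __init__(self, raw_fields, encoding='utf-8'):
        self._raw = raw_fields    # field name -> undecoded bytes
        self._encoding = encoding
        self._post = None         # decoded cache, built on first access

    @property
    def encoding(self):
        return self._encoding

    @encoding.setter
    def encoding(self, value):
        self._encoding = value
        self._post = None         # drop the cache so it is rebuilt on next access

    @property
    def POST(self):
        if self._post is None:
            self._post = dict(
                (name, raw.decode(self._encoding))
                for name, raw in self._raw.items()
            )
        return self._post


# The client sent KOI8-R bytes, but the initial guess was Latin-1.
request = LazyRequest({'name': u'\u0416'.encode('koi8-r')}, encoding='latin-1')
first = request.POST['name']    # decoded with the wrong guess
request.encoding = 'koi8-r'
second = request.POST['name']   # decoded again, now with the new encoding
```

Because the raw bytes are kept and only the decoded view is cached, switching the encoding after the first access loses no information.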
     356 
     357Most developers won't need to worry about changing form encoding, but this is 
     358a useful feature for applications that talk to legacy systems whose encoding 
     359you cannot control. 
     360 
     361Django does not decode the data of file uploads, because that data is normally 
     362treated as collections of bytes, rather than strings. Any automatic decoding 
     363there would alter the meaning of the stream of bytes.
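
To see why decoding would be harmful, consider what happens when binary data is forced through a text codec. This is a plain-Python illustration rather than anything Django does; the PNG signature is used only as a convenient example of non-text bytes.

```python
# The first eight bytes of every PNG file; they are not valid UTF-8.
png_header = b'\x89PNG\r\n\x1a\n'

# Forcing a text decode replaces the undecodable byte with U+FFFD.
as_text = png_header.decode('utf-8', errors='replace')

# Encoding the text again does not restore the original bytes.
round_tripped = as_text.encode('utf-8')
```

The round trip changes the leading byte, so a file treated this way would no longer be a valid image.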