Ticket #4030 (new)

Opened 1 year ago

Last modified 3 weeks ago

internationalization - auto translation of LANGUAGES

Reported by: temp@barnettweb.net
Assigned to: nobody
Component: Internationalization
Version: SVN
Keywords: LANGUAGES settings.py
Cc:
Triage Stage: Accepted
Has patch: 1
Needs documentation: 1
Needs tests: 0
Patch needs improvement: 1

Description

Hi

When I access LANGUAGES in a template using: {% get_available_languages as LANGUAGES %}

And then access the plain text language name in a template, like: {{ LANGUAGES.1.1 }}

The plain text name appears automatically translated into the current language.

For many uses this is not very useful, because it is preferable to display each language name in its native language so that a user who speaks that language can recognize it and choose that option. For example, if you are an English speaker viewing a Spanish-language page, you will prefer to see an option that says "English" rather than "Inglés". The way Django is currently configured, one has to put in a workaround to get this behavior (personally, I changed the plain text language name slightly to defeat the translation...)

Attachments

language-local-name.diff (2.7 kB) - added by akaihola on 09/14/07 06:45:39.
patch for new template tag which provides language names in both the current language and the language itself
make-language-names.diff (3.7 kB) - added by akaihola on 09/22/07 16:26:07.
implements mtredinnick's suggestion: a script for generating a local language name module and a fixed template tag
4030_language_info.diff (20.9 kB) - added by akaihola on 04/23/08 06:59:57.
implementation of the 2008-04-22 design, with tests and documentation
4030_language_info_v2.diff (20.8 kB) - added by akaihola on 04/23/08 13:15:20.
removed fallback for missing language_info.py, improved make-language-info.py

Change History

04/14/07 18:55:18 changed by Simon G. <dev@simon.net.nz>

  • needs_better_patch changed.
  • stage changed from Unreviewed to Accepted.
  • needs_tests changed.
  • needs_docs changed.

04/26/07 09:49:18 changed by mtredinnick

You're right about the need to support that use-case. We also need to keep the current behaviour.

The language names are stored as their English versions (like every other string in Django) and that shouldn't change. So, one problem to solve is how to get the right translation strings in the first place. It may be possible to rummage through every PO file for the right name at startup and cache it in trans_real.py.

The second thing is how to make this accessible to the caller. That is probably best done by a function available under django.utils.translation. It could be a different function to get_available_languages() because we need both behaviours anyway and that would avoid any unnecessary backwards-incompatibility issues.

An interesting problem to solve here is what if the resulting string (of all the languages) is not representable in the output encoding. For example, you will have trouble showing Chinese characters in Russia's KOI8-R encoding. That's something we can solve in the unicode branch, I guess, but it's worth paying attention to.
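The encoding concern above can be illustrated with a minimal sketch. It assumes Python 3 string semantics and a hypothetical sample of local language names; the helper name representable is made up for this example and is not part of Django.

```python
def representable(text, encoding):
    """Return True if every character of text can be encoded in encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# Hypothetical sample of local language names (not taken from Django's catalogues).
local_names = {"en": "English", "ru": "Русский", "zh-cn": "简体中文"}

# Language codes whose local name cannot be written in KOI8-R.
unrepresentable = sorted(
    code for code, name in local_names.items()
    if not representable(name, "koi8-r")
)
print(unrepresentable)
```

A page served in such an encoding would need a fallback for the unrepresentable names, for example the English name.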

09/14/07 03:53:32 changed by akaihola

  • owner changed from nobody to anonymous.
  • status changed from new to assigned.

Checking out to see how to fix this

09/14/07 05:48:22 changed by akaihola

  • owner changed from anonymous to akaihola.
  • status changed from assigned to new.

Oops, that anonymous was me.

09/14/07 06:45:39 changed by akaihola

  • attachment language-local-name.diff added.

patch for new template tag which provides language names in both the current language and the language itself

09/14/07 06:48:58 changed by Simon G. <dev@simon.net.nz>

  • has_patch set to 1.
  • stage changed from Accepted to Ready for checkin.

(follow-up: ↓ 10 ) 09/14/07 06:51:18 changed by akaihola

In my patch, I add a new template tag for this in order to not break backwards compatibility.

I have these questions in my mind about this:

  • is a separate template tag a good idea?
  • should we worry about performance if we assume this tag could be used on every page of a site in the base template -- is activating each language in a loop a heavy operation?
  • should the resulting list be cached?
  • could the translations be fetched without activating each language in its turn?
  • is the list of dicts return value a sensible choice, or should we return a list of tuples like in {% get_available_languages %} and only include the language name in the language itself?

09/14/07 06:53:16 changed by akaihola

Simon, should we add documentation and tests before checking in the patch? And should we provide unit tests? What about my concerns above?

09/14/07 07:13:19 changed by akaihola

#5446 suggests a db-based country/language list which would offer a different solution for this ticket.

09/14/07 08:11:35 changed by akaihola

  • owner changed from akaihola to nobody.

The try-except block was added because of the bug fixed in [6185]. Should it be removed or is it good to have a safety net like this? In case of a defective .po file it would automatically set name_local to the language name in the active language instead of the language itself.

I'm un-claiming this bug for now and moving on.

(in reply to: ↑ 6 ) 09/16/07 06:05:17 changed by mtredinnick

  • needs_better_patch set to 1.
  • stage changed from Ready for checkin to Accepted.
  • needs_docs set to 1.

Replying to akaihola:

* is a separate template tag a good idea?

Yes.

* should we worry about performance if we assume this tag could be used on every page of a site in the base template -- is activating each language in a loop a heavy operation?

We should worry. See below; I don't like the current solution.

* is the list of dicts return value a sensible choice, or should we return a list of tuples like in {% get_available_languages %} and only include the language name in the language itself?

I like the dictionary idea. Let's go with that.

Okay, now to the bigger problem ... I don't like the approach here. It's relatively heavyweight to load up every MO file just to access one string from each. What I would rather do is have an offline process (something like make-messages.py and compile-messages.py) that extracts out the strings and the Unicode string that is the translation and just writes it into a file we can import. Write out a Python dictionary to file, for example. Let's create a little tool for django/bin/

We can regenerate that file from time to time and check it into the source.

Let's also have a Python function (in django.utils.translation, I guess) that returns a Python dictionary of these languages -- mapping locale to (English name, translated name), say -- so that we can use it for choice lists in forms and models and other stuff like that.

Really sorry to ask for big changes like this after your work so far, akaihola.
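The offline step suggested in this comment could look roughly like the sketch below. It only covers the "write out a Python dictionary to file" part; the function name, the LANGUAGE_NAMES variable and the sample data are assumptions for illustration, and the extraction from the PO/MO files is deliberately left out.

```python
import pprint


def render_language_names_module(names):
    """Return Python source defining LANGUAGE_NAMES = {code: (english, local)}."""
    body = pprint.pformat(dict(sorted(names.items())))
    return "# Generated file; do not edit by hand.\nLANGUAGE_NAMES = " + body + "\n"


# Hypothetical data; in the proposal this would be extracted offline
# from each locale's translation catalogue.
extracted = {
    "de": ("German", "Deutsch"),
    "pl": ("Polish", "Polski"),
}

source = render_language_names_module(extracted)

# Round-trip check: the generated source evaluates back to the same mapping.
namespace = {}
exec(source, namespace)
print(namespace["LANGUAGE_NAMES"]["de"])
```

In a real tool the source string would be written to a module file that is regenerated from time to time and checked in, so nothing has to load the catalogues at request time.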

09/22/07 16:26:07 changed by akaihola

  • attachment make-language-names.diff added.

implements mtredinnick's suggestion: a script for generating a local language name module and a fixed template tag

09/22/07 16:34:41 changed by akaihola

No tests or documentation for the above patch yet. And I'm not sure about the dictionary key names. I actually would prefer 'code', 'name' and 'local_name', but for some reason after staring at Django's i18n code I came up with 'language_code', 'name' and 'name_local'.

Doesn't this kind of obsolete {%get_available_languages%}? Language names in the current language are provided by this template tag, too, and it's much more natural to say {{language.language_code}} and {{language.name}} in the template than {{language.0}} and {{language.1}}.

09/22/07 16:42:14 changed by akaihola

Note that the django.utils.translation utility function for generating a choice list isn't yet included in the patch. Malcolm, could you give an example of the needed choices list format just to be sure? Should that list be pre-computed as well in the language_names module as a (premature? :-) optimization?

03/17/08 14:35:18 changed by thauber

  • owner changed from nobody to thauber.
  • status changed from new to assigned.

03/17/08 15:02:21 changed by thauber

  • owner changed from thauber to nobody.
  • status changed from assigned to new.

(follow-up: ↓ 16 ) 04/22/08 14:17:48 changed by akaihola

I went through this trying to think about the requirements from a web developer's standpoint and came up with a slightly modified API:

  • get a "language info dictionary" for the given language code:
       >>> from django.utils.translation import get_language_info
       >>> get_language_info('de')
       {'language_code': 'de',
        'name': 'German',
        'name_local': 'Deutsch',
        'bidi': False,  # True for bi-directional languages
       }
    
  • get a list of info dicts for languages specified in settings.LANGUAGES (should we provide a helper function for this?):
       >>> [get_language_info(l[0]) for l in settings.LANGUAGES]
    
  • in templates, iterate languages as specified in settings.LANGUAGES and get language info dicts (RequestContext and the i18n context processor required):
       {% get_language_info_list for LANGUAGES as langs %}
       {% for l in langs %}
         {{ l.language_code }}: {{ l.name_local }}
       {% endfor %}
    
  • iterate a custom list of language codes:
       {% get_language_info_list for a_list_of_language_codes as langs %}
    
    The problem here is that LANGUAGES is a tuple of tuples (a reference to settings.LANGUAGES inserted to the context by the i18n context processor), whereas a user-supplied list would probably contain just language codes as strings. The template tag could automagically handle both if that's not too vague.

  • get the info dict for a single language:
       {% get_language_info for LANGUAGE_CODE as lang %}
       {% get_language_info for some_other_language_code as lang %}
       {% get_language_info for "pl" as lang %}
       {{ lang.language_code }}: {{ lang.name_local }}
    
  • alternate syntax with filters:
    • {{ LANGUAGE_CODE|language_name }} ("German")
    • {{ LANGUAGE_CODE|language_name_local }} ("Deutsch")
    • {{ LANGUAGE_CODE|bidi }} (False)
  • the data would be generated by django/bin/make-language-info.py as the language_info dictionary in django/conf/language_info.py
  • the language_info dictionary would map language codes to info dictionaries
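As a rough illustration of the lookup side of this design, here is a minimal sketch in plain Python. The language_info dictionary below is a hand-written stand-in for the generated module, and the function bodies are assumptions about how the API might behave, not the code in the attached patch.

```python
# Hypothetical stand-in for the generated django/conf/language_info.py.
language_info = {
    "de": {"language_code": "de", "name": "German",
           "name_local": "Deutsch", "bidi": False},
    "he": {"language_code": "he", "name": "Hebrew",
           "name_local": "עברית", "bidi": True},
}


def get_language_info(language_code):
    """Return the info dict for language_code, or raise KeyError."""
    try:
        return language_info[language_code]
    except KeyError:
        raise KeyError("Unknown language code %r" % language_code)


def language_name_local(language_code):
    """Filter-style helper: the language's name in the language itself."""
    return get_language_info(language_code)["name_local"]


print(language_name_local("de"))
```

Because everything is a dictionary lookup, no translation catalogue has to be activated to answer a request.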

I believe this approach would provide more practical tools for the developer than a list of dicts and a template tag for retrieving the whole list as discussed before.

I do still have a couple of open questions about this plan:

  • Is it ok if {% get_language_info_list %} automatically handles both lists of language codes and lists of tuples (as in settings.LANGUAGES)?
  • What if the {% get_language_info_list %} tag was eliminated and a single {% get_language_info %} tag returned either a single language info dict or a list of dicts depending on the type of the argument? Or would it be more confusing than useful?
  • Is the get_language_info() function needed, or is it sufficient to be able to say
       >>> from django.conf.language_info import language_info
       >>> i = language_info['de']
    
  • Should only either template tags or filters be provided, not both?
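For the first question, accepting both input shapes could be as small as the sketch below. The helper name normalize_language_codes is hypothetical, and the sketch assumes that tuple entries carry the language code in position 0, as in settings.LANGUAGES.

```python
def normalize_language_codes(languages):
    """Return a list of language codes from either codes or (code, name) pairs."""
    codes = []
    for item in languages:
        if isinstance(item, str):
            codes.append(item)      # already a bare language code
        else:
            codes.append(item[0])   # (code, name) pair as in settings.LANGUAGES
    return codes


print(normalize_language_codes((("de", "German"), ("pl", "Polish"))))
print(normalize_language_codes(["de", "pl"]))
```

The trade-off is the one raised in the question: the tag becomes more forgiving, at the cost of its argument type being less obvious to a reader of the template.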

04/23/08 06:59:57 changed by akaihola

  • attachment 4030_language_info.diff added.

implementation of the 2008-04-22 design, with tests and documentation

(in reply to: ↑ 15 ) 04/23/08 07:13:12 changed by akaihola

Replying to akaihola:

I went through this trying to think about the requirements from a web developer's standpoint and came up with a slightly modified API:

The attachment above implements this design. In addition,

  • if language_info.py is incomplete or missing, django.utils.translation.get_language_info() still works and updates the in-memory cache for each requested language, and a warning is issued with instructions to run make-language-info.py;
  • there are tests for the get_language_info() function, the template tags and the template filters;
  • some documentation is added to i18n.txt; and
  • the example language selection form in i18n.txt is updated to use local language names.

(follow-up: ↓ 18 ) 04/23/08 07:45:24 changed by mtredinnick

Anything that loops through every language is still not going to be appropriate here. That's a massive amount of memory usage, because every single MO file is loaded into memory. It's also not going to be particularly fast. And it provides more than one way to do something. Short version: don't do that. :-)

Instead, let's just have the one dictionary and that's all (I'm sort of so-so about the script to generate it. But, leave that in for the moment; it's probably the right thing to do). There's no need for all the fallbacks for a missing file or anything, though. That just means Django isn't correctly installed and that's not our problem. What other files haven't they installed?

Secondly, doesn't your solution possibly have a localisation problem? Specifically, if we load the language info into a template, I think we should also include the name of the language in the current locale. If I'm using that dictionary in Python code I probably need the English name (e.g. for a form field value), but in a template it might be optional (probably harmless, though). Definitely needed in the currently active locale, though (for the title attribute, for example).

I haven't had time to look at the rest of your proposal yet, but it's nice to see some action here. These are my thoughts from an initial read through. I'll have a bit more of a think about this shortly, but I suspect it's getting pretty close.

04/23/08 13:15:20 changed by akaihola

  • attachment 4030_language_info_v2.diff added.

removed fallback for missing language_info.py, improved make-language-info.py

(in reply to: ↑ 17 ) 04/23/08 14:14:57 changed by akaihola

Replying to mtredinnick:

Anything that loops through every language is still not going to be appropriate here. That's a massive amount of memory usage, because every single MO file is loaded into memory. It's also not going to be particularly fast. And it provides more than one way to do something. Short version: don't do that. :-)

Ok, replaced that part with a simple ImportError.

Instead, let's just have the one dictionary and that's all (I'm sort of so-so about the script to generate it. But, leave that in for the moment; it's probably the right thing to do).

Ok, it's still there with a couple of improvements.

Secondly, doesn't your solution possibly have a localisation problem? Specifically, if we load the language info into a template, I think we should also include the name of the language in the current locale.

Ah, there's no shortcut for that currently, but doing that is as simple as

{% get_language_info for LANGUAGE_CODE as lang %}
{% trans lang.name %}

Would it make sense to always dynamically add the name of the language in the current locale to the info dict, no matter if it's needed or not?

Actually, then I'd change the key names (assuming Finnish as the active language in this example):

{'language_code': 'pl',
 'name_english': 'Polish',
 'name_local': 'Polski',
 'name': 'puola',
 'bidi': False}

The filters would be accordingly:

  • {{ "pl"|language_name_english }} ("Polish")
  • {{ "pl"|language_name_local }} ("Polski")
  • {{ "pl"|language_name }} ("puola")
  • {{ "pl"|bidi }} (False)
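Adding the current-locale name dynamically could be sketched as follows. The translate argument stands in for whatever translation function is active, and the small Finnish mapping is a hypothetical substitute for a real catalogue; neither is taken from the patch.

```python
# Hypothetical static entry, using the key names proposed above.
STATIC_INFO = {
    "pl": {"language_code": "pl", "name_english": "Polish",
           "name_local": "Polski", "bidi": False},
}


def with_translated_name(info, translate):
    """Return a copy of info with 'name' set to the translated English name."""
    result = dict(info)  # copy, so the static data is never modified
    result["name"] = translate(info["name_english"])
    return result


# Stand-in for an active Finnish translation catalogue.
finnish = {"Polish": "puola"}
info = with_translated_name(STATIC_INFO["pl"], lambda s: finnish.get(s, s))
print(info["name"])
```

Only one translation lookup is done per requested language, so the cost stays small even if the name is always added.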

I haven't had time to look at the rest of your proposal yet, but it's nice to see some action here. These are my thoughts from an initial read through. I'll have a bit more of a think about this shortly, but I suspect it's getting pretty close.

Nice if I'm working with something useful for others as well.

I noticed i18n is the topic and you the guest on the latest TWID, and you blogged about it as well. I'll listen and take a look – it's good that an important subject like this gets attention.

