Opened 18 years ago

Closed 14 years ago

Last modified 13 years ago

#4030 closed Uncategorized (fixed)

internationalization - auto translation of LANGUAGES

Reported by: temp@…
Owned by: nobody
Component: Internationalization
Version: dev
Severity: Normal
Keywords: LANGUAGES settings.py
Cc: nreilly@…, Gonzalo Saavedra, jim@…, hupf@…
Triage Stage: Accepted
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description

Hi

When I access LANGUAGES in a template using:
{% get_available_languages as LANGUAGES %}

And then access the plain text language name in a template, like:
{{ LANGUAGES.1.1 }}

The plain text name appears automatically translated into the current language.

For many uses this is not very useful: it is preferable to display each language name in its native language, so that a user who speaks that language can recognize it and choose that option. For example, if you are an English speaker viewing a Spanish-language page, you will prefer to see an option that says "English" rather than "Inglés". As Django is currently configured, one has to add a workaround to get this behavior (personally, I changed the plain-text language name slightly to defeat the translation...)

Attachments (8)

language-local-name.diff (2.7 KB ) - added by Antti Kaihola 17 years ago.
patch for new template tag which provides language names in both the current language and the language itself
make-language-names.diff (3.7 KB ) - added by Antti Kaihola 17 years ago.
implements mtredinnick's suggestion: a script for generating a local language name module and a fixed template tag
4030_language_info.diff (20.9 KB ) - added by Antti Kaihola 17 years ago.
implementation of the 2008-04-22 design, with tests and documentation
4030_language_info_v2.diff (20.8 KB ) - added by Antti Kaihola 17 years ago.
removed fallback for missing language_info.py, improved make-language-info.py
4030_language_info_v2_r8347.diff (20.0 KB ) - added by Antti Kaihola 16 years ago.
Patch adapted for Django r8347 (a bit after 1.0a2)
4030_language_info_v2_r8347_updated_languages.diff (20.2 KB ) - added by Antti Kaihola 16 years ago.
Updated the pre-generated language information dictionary according to current Django languages
4030_language_info_v2_r10639.diff (21.3 KB ) - added by Antti Kaihola 16 years ago.
updated patch for revision 10639 (1.1beta1+)
4030-rc0.diff (28.0 KB ) - added by Ramiro Morales 14 years ago.
First release candidate patch for this ticket. Entirely based on work by akaihola.


Change History (34)

comment:1 by Simon G. <dev@…>, 18 years ago

Triage Stage: Unreviewed → Accepted

comment:2 by Malcolm Tredinnick, 18 years ago

You're right about the need to support that use-case. We also need to keep the current behaviour.

The language names are stored as their English versions (like every other string in Django) and that shouldn't change. So, one problem to solve is how to get the right translation strings in the first place. It may be possible to rummage through every PO file for the right name at startup and cache it in trans_real.py.

The second thing is how to make this accessible to the caller. That is probably best done by a function available under django.utils.translation. It could be a different function to get_available_languages() because we need both behaviours anyway and that would avoid any unnecessary backwards-incompatibility issues.

An interesting problem to solve here is what if the resulting string (of all the languages) is not representable in the output encoding. For example, you will have trouble showing Chinese characters in Russia's KOI8-R encoding. That's something we can solve in the unicode branch, I guess, but it's worth paying attention to.
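The encoding concern is easy to reproduce in plain Python. The following is only a minimal sketch: `representable` is a hypothetical helper (not part of Django) and the sample names are illustrative. It reports whether a language name can be written in a given output encoding.

```python
def representable(text, encoding):
    """Return True if `text` can be encoded in `encoding` without errors."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# Hypothetical sample of native language names.
native_names = {"ru": "Русский", "zh": "中文", "en": "English"}

for code, name in native_names.items():
    # Show, per language, whether KOI8-R and UTF-8 can carry the name.
    print(code, representable(name, "koi8_r"), representable(name, "utf-8"))
```

A check like this could decide whether to fall back to the English name for a given output encoding; with UTF-8 output the problem does not arise.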

comment:3 by Antti Kaihola, 17 years ago

Owner: changed from nobody to anonymous
Status: new → assigned

Checking out to see how to fix this

comment:4 by Antti Kaihola, 17 years ago

Owner: changed from anonymous to Antti Kaihola
Status: assigned → new

Oops, that anonymous was me.

by Antti Kaihola, 17 years ago

Attachment: language-local-name.diff added

patch for new template tag which provides language names in both the current language and the language itself

comment:5 by Simon G. <dev@…>, 17 years ago

Has patch: set
Triage Stage: Accepted → Ready for checkin

comment:6 by Antti Kaihola, 17 years ago

In my patch, I add a new template tag for this in order to not break backwards compatibility.

I have these questions in my mind about this:

  • is a separate template tag a good idea?
  • should we worry about performance if we assume this tag could be used on every page of a site in the base template -- is activating each language in a loop a heavy operation?
  • should the resulting list be cached?
  • could the translations be fetched without activating each language in its turn?
  • is the list of dicts return value a sensible choice, or should we return a list of tuples like in {% get_available_languages %} and only include the language name in the language itself?

comment:7 by Antti Kaihola, 17 years ago

Simon, should we add documentation and tests before checking in the patch? And should we provide unit tests? What about my concerns above?

comment:8 by Antti Kaihola, 17 years ago

#5446 suggests a db-based country/language list which would offer a different solution for this ticket.

comment:9 by Antti Kaihola, 17 years ago

Owner: changed from Antti Kaihola to nobody

The try-except block was added because of the bug fixed in [6185]. Should it be removed or is it good to have a safety net like this? In case of a defective .po file it would automatically set name_local to the language name in the active language instead of the language itself.

I'm un-claiming this bug for now and moving on.

in reply to:  6 comment:10 by Malcolm Tredinnick, 17 years ago

Needs documentation: set
Patch needs improvement: set
Triage Stage: Ready for checkin → Accepted

Replying to akaihola:

  • is a separate template tag a good idea?

Yes.

  • should we worry about performance if we assume this tag could be used on every page of a site in the base template -- is activating each language in a loop a heavy operation?

We should worry. See below; I don't like the current solution.

  • is the list of dicts return value a sensible choice, or should we return a list of tuples like in {% get_available_languages %} and only include the language name in the language itself?

I like the dictionary idea. Let's go with that.

Okay, now to the bigger problem ... I don't like the approach here. It's relatively heavyweight to load up every MO file just to access one string from each. What I would rather do is have an offline process (something like make-messages.py and compile-messages.py) that extracts out the strings and the Unicode string that is the translation and just writes it into a file we can import. Write out a Python dictionary to file, for example. Let's create a little tool for django/bin/

We can regenerate that file from time to time and check it into the source.

Let's also have a Python function (in django.utils.translation, I guess) that returns a Python dictionary of these languages -- mapping locale to (English name, translated name), say -- so that we can use it for choice lists in forms and models and other stuff like that.
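A rough sketch of the offline-generation idea, under stated assumptions: the names `render_language_names_module`, `language_choices` and `LANGUAGE_NAMES` are hypothetical, the sample data is made up, and a real tool would extract the strings from the translation catalogs rather than take them as an argument.

```python
import pprint


def render_language_names_module(names):
    """Render Python source for a module holding the pre-computed names.

    `names` maps a locale code to an (English name, native name) tuple.
    """
    header = "# Generated file; regenerate instead of editing by hand.\n"
    return header + "LANGUAGE_NAMES = " + pprint.pformat(names) + "\n"


def language_choices(names):
    """Build (code, native name) pairs, sorted by code, for a choice list."""
    return sorted((code, native) for code, (_english, native) in names.items())


# Hypothetical sample data standing in for strings pulled from the catalogs.
sample = {"de": ("German", "Deutsch"), "es": ("Spanish", "español")}
source = render_language_names_module(sample)
print(source)
```

Writing `source` to a file and checking it in gives a module that can be imported without loading any MO file at runtime.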

Really sorry to ask for big changes like this after your work so far, akaihola.

by Antti Kaihola, 17 years ago

Attachment: make-language-names.diff added

implements mtredinnick's suggestion: a script for generating a local language name module and a fixed template tag

comment:11 by Antti Kaihola, 17 years ago

No tests or documentation for the above patch yet. And I'm not sure about the dictionary key names. I actually would prefer 'code', 'name' and 'local_name', but for some reason after staring at Django's i18n code I came up with 'language_code', 'name' and 'name_local'.

Doesn't this kind of obsolete {%get_available_languages%}? Language names in the current language are provided by this template tag, too, and it's much more natural to say {{language.language_code}} and {{language.name}} in the template than {{language.0}} and {{language.1}}.

comment:12 by Antti Kaihola, 17 years ago

Note that the django.utils.translation utility function for generating a choice list isn't yet included in the patch. Malcolm, could you give an example of the needed choices list format just to be sure? Should that list be pre-computed as well in the language_names module as a (premature? :-) optimization?

comment:13 by thauber, 17 years ago

Owner: changed from nobody to thauber
Status: new → assigned

comment:14 by thauber, 17 years ago

Owner: changed from thauber to nobody
Status: assigned → new

comment:15 by Antti Kaihola, 17 years ago

I went through this trying to think about the requirements from a web developer's standpoint and came up with a slightly modified API:

  • get a "language info dictionary" for the given language code:
    >>> from django.utils.translation import get_language_info
    >>> get_language_info('de')
    {'language_code': 'de',
     'name': 'German',
     'name_local': 'Deutsch',
     'bidi': False,  # True for bi-directional languages
    }
    
  • get a list of info dicts for languages specified in settings.LANGUAGES (should we provide a helper function for this?):
    >>> [get_language_info(l[0]) for l in settings.LANGUAGES]
    
  • in templates, iterate languages as specified in settings.LANGUAGES and get language info dicts (RequestContext and the i18n context processor required):
    {% get_language_info_list for LANGUAGES as langs %}
    {% for l in langs %}
      {{ l.language_code }}: {{ l.name_local }}
    {% endfor %}
    
  • iterate a custom list of language codes:
    {% get_language_info_list for a_list_of_language_codes as langs %}
    
    The problem here is that LANGUAGES is a tuple of tuples (a reference to settings.LANGUAGES inserted to the context by the i18n context processor), whereas a user-supplied list would probably contain just language codes as strings. The template tag could automagically handle both if that's not too vague.

  • get the info dict for a single language:
    {% get_language_info for LANGUAGE_CODE as lang %}
    {% get_language_info for some_other_language_code as lang %}
    {% get_language_info for "pl" as lang %}
    {{ lang.language_code }}: {{ lang.name_local }}
    
  • alternate syntax with filters:
    • {{ LANGUAGE_CODE|language_name }} ("German")
    • {{ LANGUAGE_CODE|language_name_local }} ("Deutsch")
    • {{ LANGUAGE_CODE|bidi }} (False)
  • the data would be generated by django/bin/make-language-info.py as the language_info dictionary in django/conf/language_info.py
  • the language_info dictionary would map language codes to info dictionaries

I believe this approach would provide more practical tools for the developer
than a list of dicts and a template tag for retrieving the whole list as discussed before.
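For illustration, the lookup function proposed above could be as small as the following sketch. It uses the key names from this comment; the `language_info` dictionary here is a hand-written stand-in for the generated one, and the error message is an assumption.

```python
# Hand-written stand-in for the generated language_info dictionary.
language_info = {
    "de": {"language_code": "de", "name": "German",
           "name_local": "Deutsch", "bidi": False},
    "he": {"language_code": "he", "name": "Hebrew",
           "name_local": "עברית", "bidi": True},
}


def get_language_info(lang_code):
    """Return the info dict for `lang_code`; raise KeyError if it is unknown."""
    try:
        return language_info[lang_code]
    except KeyError:
        raise KeyError("Unknown language code %r." % lang_code)


print(get_language_info("de")["name_local"])
```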

I do still have a couple of open questions about this plan:

  • Is it ok if {% get_language_info_list %} automatically handles both lists of language codes and lists of tuples (as in settings.LANGUAGES)?
  • What if the {% get_language_info_list %} tag was eliminated and a single {% get_language_info %} tag returned either a single language info dict or a list of dicts depending on the type of the argument? Or would it be more confusing than useful?
  • Is the get_language_info() function needed, or is it sufficient to be able to say
    >>> from django.conf.language_info import language_info
    >>> i = language_info['de']
    
  • Should only either template tags or filters be provided, not both?
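Regarding the first open question, accepting both input shapes takes only a type check. This is a sketch of one possible approach, not the patch itself; `language_info` is again a hand-written stand-in and the function is shown outside any template machinery.

```python
# Hand-written stand-in for the generated language_info dictionary.
language_info = {
    "de": {"language_code": "de", "name": "German",
           "name_local": "Deutsch", "bidi": False},
    "pl": {"language_code": "pl", "name": "Polish",
           "name_local": "Polski", "bidi": False},
}


def get_language_info_list(languages):
    """Accept language codes or (code, name) pairs as in settings.LANGUAGES."""
    infos = []
    for entry in languages:
        # A plain string is the code itself; a tuple or list carries it first.
        code = entry if isinstance(entry, str) else entry[0]
        infos.append(language_info[code])
    return infos


from_codes = get_language_info_list(["de", "pl"])
from_pairs = get_language_info_list([("de", "German"), ("pl", "Polish")])
```

Both calls yield the same list of info dicts, so a template tag built on this helper would not need to know which form it was given.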

by Antti Kaihola, 17 years ago

Attachment: 4030_language_info.diff added

implementation of the 2008-04-22 design, with tests and documentation

in reply to:  15 comment:16 by Antti Kaihola, 17 years ago

Replying to akaihola:

I went through this trying to think about the requirements from a web developer's standpoint and came up with a slightly modified API:

The attachment above implements this design. In addition,

  • if language_info.py is incomplete or missing, django.utils.translation.get_language_info() still works and updates the in-memory cache for each requested language, and a warning is issued with instructions to run make-language-info.py;
  • there are tests for the get_language_info() function, the template tags and the template filters;
  • some documentation is added to i18n.txt; and
  • the example language selection form in i18n.txt is updated to use local language names.

comment:17 by Malcolm Tredinnick, 17 years ago

Anything that loops through every language is still not going to be appropriate here. That's a massive amount of memory usage, because every single MO file is loaded into memory. It's also not going to be particularly fast. And it provides more than one way to do something. Short version: don't do that. :-)

Instead, let's just have the one dictionary and that's all (I'm sort of so-so about the script to generate it. But, leave that in for the moment; it's probably the right thing to do). There's no need for all the fallbacks for a missing file or anything, though. That just means Django isn't correctly installed and that's not our problem. What other files haven't they installed?

Secondly, doesn't your solution possibly have a localisation problem? Specifically, if we load the language info into a template, I think we should also include the name of the language in the current locale. If I'm using that dictionary in Python code I probably need the English name (e.g. for a form field value), but in a template it might be optional (probably harmless, though). Definitely needed in the currently active locale, though (for the title attribute, for example).

I haven't had time to look at the rest of your proposal yet, but it's nice to see some action here. These are my thoughts from an initial read through. I'll have a bit more of a think about this shortly, but I suspect it's getting pretty close.

by Antti Kaihola, 17 years ago

Attachment: 4030_language_info_v2.diff added

removed fallback for missing language_info.py, improved make-language-info.py

in reply to:  17 comment:18 by Antti Kaihola, 17 years ago

Replying to mtredinnick:

Anything that loops through every language is still not going to be appropriate here. That's a massive amount of memory usage, because every single MO file is loaded into memory. It's also not going to be particularly fast. And it provides more than one way to do something. Short version: don't do that. :-)

Ok, replaced that part with a simple ImportError.

Instead, let's just have the one dictionary and that's all (I'm sort of so-so about the script to generate it. But, leave that in for the moment; it's probably the right thing to do).

Ok, it's still there with a couple of improvements.

Secondly, doesn't your solution possibly have a localisation problem? Specifically, if we load the language info into a template, I think we should also include the name of the language in the current locale.

Ah, there's no shortcut for that currently, but doing
that is as simple as

{% get_language_info for LANGUAGE_CODE as lang %}
{% trans lang.name %}

Would it make sense to always dynamically add the name
of the language in the current locale to the info dict,
no matter if it's needed or not?

Actually, then I'd change the key names (assuming Finnish
as the active language in this example):

{'language_code': 'pl',
 'name_english': 'Polish',
 'name_local': 'Polski',
 'name': 'puola',
 'bidi': False}

The filters would be accordingly:

  • {{ "pl"|language_name_english }} ("Polish")
  • {{ "pl"|language_name_local }} ("Polski")
  • {{ "pl"|language_name }} ("puola")
  • {{ "pl"|bidi }} (False)
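A sketch of how the current-locale name could be added on the fly, using the key names suggested above. Everything here is illustrative: `localized_language_info` and `translate_to_finnish` are hypothetical names, and the toy catalog stands in for a real gettext lookup in the active language.

```python
# Illustrative data: static info plus a toy catalog of Finnish translations.
language_info = {
    "pl": {"language_code": "pl", "name_english": "Polish",
           "name_local": "Polski", "bidi": False},
}
finnish_catalog = {"Polish": "puola"}


def translate_to_finnish(text):
    """Stand-in for a gettext lookup; falls back to the English string."""
    return finnish_catalog.get(text, text)


def localized_language_info(lang_code, translate):
    """Copy the static info dict and add 'name' in the active locale."""
    info = dict(language_info[lang_code])  # copy so shared data stays untouched
    info["name"] = translate(info["name_english"])
    return info


print(localized_language_info("pl", translate_to_finnish))
```

Copying the dict keeps the pre-generated data locale-independent; only the per-request copy carries the translated name.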

I haven't had time to look at the rest of your proposal yet, but it's nice to see some action here. These are my thoughts from an initial read through. I'll have a bit more of a think about this shortly, but I suspect it's getting pretty close.

Nice if I'm working with something useful for others
as well.

I noticed i18n is the topic and you the guest on the
latest TWID, and you blogged about it as well. I'll
listen and take a look – it's good that an important
subject like this gets attention.

by Antti Kaihola, 16 years ago

Patch adapted for Django r8347 (a bit after 1.0a2)

by Antti Kaihola, 16 years ago

Updated the pre-generated language information dictionary according to current Django languages

comment:19 by anonymous, 16 years ago

Cc: nreilly@… added

comment:20 by Gonzalo Saavedra, 16 years ago

Cc: Gonzalo Saavedra added

comment:21 by Jim Garrison, 16 years ago

Cc: jim@… added

by Antti Kaihola, 16 years ago

updated patch for revision 10639 (1.1beta1+)

comment:22 by Antti Kaihola, 16 years ago

Just updated the patch for Django revision 10639.

Bah, Trac doesn't display the patch properly. Quoting peritus from his comment to #9289:

The trac patch-viewer has problems showing patches from "git diff", which is an acceptable format for patches according to http://docs.djangoproject.com/en/dev/internals/contributing/#patch-style

Download the patch and view it with your favourite text editor and you will see the correct file names.

comment:23 by anonymous, 15 years ago

Cc: hupf@… added

by Ramiro Morales, 14 years ago

Attachment: 4030-rc0.diff added

First release candidate patch for this ticket. Entirely based on work by akaihola.

comment:24 by Ramiro Morales, 14 years ago

Needs documentation: unset
Patch needs improvement: unset

I've uploaded a new (RC 0, I intend to commit this ASAP) patch, updating the great work made by Antti Kaihola (akaihola) with the following changes:

  • Removed the standalone make-lang-info.py tool. We've tried in the past to minimize the number of such commands, because each one means that we or downstream maintainers need to track another 'program', create man pages for it, etc. I've moved that functionality to a management command.
  • The name of the command is makelanginfo; it can be changed if deemed not completely appropriate.
  • (minor) Renamed the dictionary containing the languages metadata in django.conf.locale.language_info from language_info to lang_info
  • Moved the location of the language_info.py from django/conf/ to django/conf/locale/. The new location seemed more appropriate, but I don't know if having a single .py file there among the translations subdirs (and in the future our .pot files) is totally correct.
  • In django/utils/translation/__init__.py, moved the import of the dynamically generated django.conf.locale.language_info.lang_info dictionary from the module level to inside the get_language_info function. This removes the circular import in the management command that was previously solved by first creating an empty language_info.py. This also makes it unnecessary to force the .py -> .pyc compilation after the final .py file is created.
  • Changed the description of the command in a few places to emphasize that this is a command generally not used by final users but rather by the Django developers. I think that once we have this in place, we can add the ability to handle additional metadata about languages outside of the Django tree as part of fixing #14461.
  • Added documentation (section in django-admin management command document, django-admin.1 man page blurb)
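The function-level import mentioned in the list above can be illustrated without Django. In this sketch the module name `hypothetical_language_info` is made up and registered by hand in `sys.modules` to simulate a file generated after the function was defined; it is not the layout used by the patch.

```python
import sys
import types


def get_language_info(lang_code):
    # The import runs at call time, so defining this function does not
    # require the generated data module to exist yet.
    from hypothetical_language_info import lang_info
    return lang_info[lang_code]


# Simulate the generated module appearing after the function was defined.
generated = types.ModuleType("hypothetical_language_info")
generated.lang_info = {"de": {"name": "German", "name_local": "Deutsch"}}
sys.modules["hypothetical_language_info"] = generated

print(get_language_info("de")["name_local"])
```

With a module-level import, merely importing the translation package would fail while the data file is missing, which is the situation the generating command starts from.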

Open Questions:

  • Is it OK to have automatic generation of a language_info.py under the Django tree that is later loaded as part of the I18N infrastructure? Or would it be better to create, e.g., a JSON file? (Performance-wise, I think it is possible to cache its loading per process at runtime, as done in other parts of the framework.)
  • Should we also add a wrapper function to django.utils.translation to also allow access to the full lang_info dictionary from Python code?
  • Should we move the get_language_info function from django/utils/translation/__init__.py to trans_real.py and trans_null.py like other functions there? If so, what should it return when I18N is turned off (trans_null.py)?
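As a sketch of the JSON alternative raised in the first open question, the file can be parsed once per process and then served from memory. The loader name `load_language_info`, the file layout and the sample data are assumptions made for this example only.

```python
import functools
import json
import os
import tempfile


@functools.lru_cache(maxsize=None)
def load_language_info(path):
    """Parse the JSON file at `path` once per process and reuse the result."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


# Write illustrative sample data to a temporary file for the demonstration.
sample = {"de": {"name": "German", "name_local": "Deutsch", "bidi": False}}
with tempfile.NamedTemporaryFile(
    "w", suffix=".json", delete=False, encoding="utf-8"
) as tmp:
    json.dump(sample, tmp)
json_path = tmp.name

first = load_language_info(json_path)   # reads and parses the file
second = load_language_info(json_path)  # served from the in-process cache
os.remove(json_path)
```

The second call returns the very same object as the first, so callers must treat the cached dictionary as read-only.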

Reviews welcome!

comment:25 by Jannis Leidel, 14 years ago

Resolution: fixed
Status: new → closed

(In [14894]) Fixed #4030 -- Added ability to translate language names. Thanks to Antti Kaihola and Ramiro Morales for the initial patch.

comment:26 by anonymous, 13 years ago

Easy pickings: unset
Severity: Normal
Type: Uncategorized
UI/UX: unset

I just want to comment that users of a website will see a mess when the required fonts have not been installed on their computer. I have considered this problem for my own websites (which are highly localised) and I really only found one good way around it.

So the problem is that if you use many languages, and present the text as fonts, and the users do not have all of the fonts required by those languages installed, then there will be a mess on the page. This is in my opinion completely unacceptable. Wikipedia is one example where this mess regularly occurs - you know they have the language list on the edge of the page and the mess appears there.

One good way to solve this problem is by making such language selection have images instead of HTML text. This solution works very well. You can have a textual language selection list in English always present at the top of all pages, but you can additionally have a language selection page where you can have the image buttons and those would have the names of the languages in the respective languages. This way you will not get any mess displayed to the users. An example of this idea in use is at the NHK World website.

It is just something to consider for your own website design work and not directly about Django.
