#4030 closed Uncategorized (fixed)
internationalization - auto translation of LANGUAGES
Reported by: | Owned by: | nobody | |
---|---|---|---|
Component: | Internationalization | Version: | dev |
Severity: | Normal | Keywords: | LANGUAGES settings.py |
Cc: | nreilly@…, Gonzalo Saavedra, jim@…, hupf@… | Triage Stage: | Accepted |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
Hi
When I access LANGUAGES in a template using:
{% get_available_languages as LANGUAGES %}
And then access the plain text language name in a template, like:
{{ LANGUAGES.1.1 }}
The plain text name appears automatically translated into the current language.
For many uses this is not very useful, as it is preferable to display each language name in its native language so that a user who speaks that language can recognize it and choose that option. For example, if you are an English speaker viewing a Spanish-language page, you will prefer to see an option that says "English" rather than "Inglés". As Django is currently configured, one has to add a workaround to get this behavior (personally, I changed the plain-text language name slightly to defeat the translation).
Attachments (8)
Change History (34)
comment:1 by , 18 years ago
Triage Stage: | Unreviewed → Accepted |
---|
comment:2 by , 18 years ago
comment:3 by , 17 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
Checking out to see how to fix this
comment:4 by , 17 years ago
Owner: | changed from | to
---|---|
Status: | assigned → new |
Oops, that anonymous was me.
by , 17 years ago
Attachment: | language-local-name.diff added |
---|
patch for new template tag which provides language names in both the current language and the language itself
comment:5 by , 17 years ago
Has patch: | set |
---|---|
Triage Stage: | Accepted → Ready for checkin |
follow-up: 10 comment:6 by , 17 years ago
In my patch, I add a new template tag for this in order to not break backwards compatibility.
I have these questions in my mind about this:
- is a separate template tag a good idea?
- should we worry about performance if we assume this tag could be used on every page of a site in the base template -- is activating each language in a loop a heavy operation?
- should the resulting list be cached?
- could the translations be fetched without activating each language in its turn?
- is the list of dicts return value a sensible choice, or should we return a list of tuples like in {% get_available_languages %} and only include the language name in the language itself?
comment:7 by , 17 years ago
Simon, should we add documentation and tests before checking in the patch? And should we provide unit tests? What about my concerns above?
comment:8 by , 17 years ago
#5446 suggests a db-based country/language list which would offer a different solution for this ticket.
comment:9 by , 17 years ago
Owner: | changed from | to
---|
The try-except block was added because of the bug fixed in [6185]. Should it be removed, or is it good to have a safety net like this? In case of a defective .po file it would automatically set name_local to the language name in the active language instead of the language itself.
I'm un-claiming this bug for now and moving on.
comment:10 by , 17 years ago
Needs documentation: | set |
---|---|
Patch needs improvement: | set |
Triage Stage: | Ready for checkin → Accepted |
Replying to akaihola:
- is a separate template tag a good idea?
Yes.
- should we worry about performance if we assume this tag could be used on every page of a site in the base template -- is activating each language in a loop a heavy operation?
We should worry. See below; I don't like the current solution.
- is the list of dicts return value a sensible choice, or should we return a list of tuples like in {% get_available_languages %} and only include the language name in the language itself?
I like the dictionary idea. Let's go with that.
Okay, now to the bigger problem ... I don't like the approach here. It's relatively heavyweight to load up every MO file just to access one string from each. What I would rather do is have an offline process (something like make-messages.py and compile-messages.py) that extracts out the strings and the Unicode string that is the translation and just writes it into a file we can import. Write out a Python dictionary to file, for example. Let's create a little tool for django/bin/
We can regenerate that file from time to time and check it into the source.
Let's also have a Python function (in django.utils.translation, I guess) that returns a Python dictionary of these languages -- mapping locale to (English name, translated name), say -- so that we can use it for choice lists in forms and models and other stuff like that.
Really sorry to ask for big changes like this after your work so far, akaihola.
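The offline step proposed in this comment (extract the names once, write them out as an importable Python dictionary) might look roughly like the sketch below. It is only an illustration under stated assumptions: the EXTRACTED data, the LANGUAGE_NAMES name and both helper functions are hypothetical, and the extraction from the translation catalogs itself is not shown.

```python
import pprint

# Hypothetical input: locale code -> (English name, name in the language itself).
# In the proposal this data would be extracted offline from the translation files.
EXTRACTED = {
    "de": ("German", "Deutsch"),
    "es": ("Spanish", "español"),
    "fi": ("Finnish", "suomi"),
}


def render_module(names):
    """Return the source of a Python module that defines LANGUAGE_NAMES."""
    body = pprint.pformat(names, width=72)
    return (
        "# Generated file; do not edit by hand.\n"
        f"LANGUAGE_NAMES = {body}\n"
    )


def load_module_source(source):
    """Execute generated source and return its LANGUAGE_NAMES dictionary."""
    namespace = {}
    exec(source, namespace)
    return namespace["LANGUAGE_NAMES"]


# Round trip: generate the module source, then read the data back from it.
source = render_module(EXTRACTED)
LANGUAGE_NAMES = load_module_source(source)
```

At runtime nothing but a plain dictionary import is needed, so no MO file has to be loaded just to look up one string.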
by , 17 years ago
Attachment: | make-language-names.diff added |
---|
implements mtredinnick's suggestion: a script for generating a local language name module and a fixed template tag
comment:11 by , 17 years ago
No tests or documentation for the above patch yet. And I'm not sure about the dictionary key names. I actually would prefer 'code', 'name' and 'local_name', but for some reason after staring at Django's i18n code I came up with 'language_code', 'name' and 'name_local'.
Doesn't this kind of obsolete {%get_available_languages%}? Language names in the current language are provided by this template tag, too, and it's much more natural to say {{language.language_code}} and {{language.name}} in the template than {{language.0}} and {{language.1}}.
comment:12 by , 17 years ago
Note that the django.utils.translation utility function for generating a choice list isn't yet included in the patch. Malcolm, could you give an example of the needed choices list format just to be sure? Should that list be pre-computed as well in the language_names module as a (premature? :-) optimization?
comment:13 by , 17 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
comment:14 by , 17 years ago
Owner: | changed from | to
---|---|
Status: | assigned → new |
follow-up: 16 comment:15 by , 17 years ago
I went through this trying to think about the requirements from a web developer's standpoint and came up with a slightly modified API:
- get a "language info dictionary" for the given language code:
>>> from django.utils.translation import get_language_info
>>> get_language_info('de')
{'language_code': 'de',
 'name': 'German',
 'name_local': 'Deutsch',
 'bidi': False}  # True for bi-directional languages
- get a list of info dicts for languages specified in settings.LANGUAGES (should we provide a helper function for this?):
>>> [get_language_info(l[0]) for l in settings.LANGUAGES]
- in templates, iterate languages as specified in settings.LANGUAGES and get language info dicts (RequestContext and the i18n context processor required):
{% get_language_info_list for LANGUAGES as langs %}
{% for l in langs %}
{{ l.language_code }}: {{ l.name_local }}
{% endfor %}
- iterate a custom list of language codes:
{% get_language_info_list for a_list_of_language_codes as langs %}
The problem here is that LANGUAGES is a tuple of tuples (a reference to settings.LANGUAGES inserted into the context by the i18n context processor), whereas a user-supplied list would probably contain just language codes as strings. The template tag could automagically handle both if that's not too vague.
- get the info dict for a single language:
{% get_language_info for LANGUAGE_CODE as lang %}
{% get_language_info for some_other_language_code as lang %}
{% get_language_info for "pl" as lang %}
{{ lang.language_code }}: {{ lang.name_local }}
- alternate syntax with filters:
{{ LANGUAGE_CODE|language_name }} ("German")
{{ LANGUAGE_CODE|language_name_local }} ("Deutsch")
{{ LANGUAGE_CODE|bidi }} (False)
- the data would be generated by django/bin/make-language-info.py as the language_info dictionary in django/conf/language_info.py
- the language_info dictionary would map language codes to info dictionaries
I believe this approach would provide more practical tools for the developer than a list of dicts and a template tag for retrieving the whole list, as discussed before.
I do still have a couple of open questions about this plan:
- Is it ok if {% get_language_info_list %} automatically handles both lists of language codes and lists of tuples (as in settings.LANGUAGES)?
- What if the {% get_language_info_list %} tag was eliminated and a single {% get_language_info %} tag returned either a single language info dict or a list of dicts depending on the type of the argument? Or would it be more confusing than useful?
- Is the get_language_info() function needed, or is it sufficient to be able to say
>>> from django.conf.language_info import language_info
>>> i = language_info['de']
- Should only either template tags or filters be provided, not both?
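To make the first open question concrete, here is a minimal pure-Python sketch of a lookup that accepts both plain language codes and (code, name) tuples as found in settings.LANGUAGES. It is not the code from the attached patch: the LANGUAGE_INFO dictionary is a hypothetical stand-in for the generated data, and get_language_info_list is written here as an ordinary function rather than a template tag.

```python
# Hypothetical stand-in for the generated language_info dictionary.
LANGUAGE_INFO = {
    "de": {"language_code": "de", "name": "German",
           "name_local": "Deutsch", "bidi": False},
    "he": {"language_code": "he", "name": "Hebrew",
           "name_local": "עברית", "bidi": True},
    "pl": {"language_code": "pl", "name": "Polish",
           "name_local": "polski", "bidi": False},
}


def get_language_info(language_code):
    """Return the info dictionary for a single language code."""
    try:
        return LANGUAGE_INFO[language_code]
    except KeyError:
        raise KeyError(f"Unknown language code {language_code!r}.") from None


def get_language_info_list(languages):
    """Accept language codes or (code, name) pairs, as in settings.LANGUAGES."""
    codes = [lang if isinstance(lang, str) else lang[0] for lang in languages]
    return [get_language_info(code) for code in codes]


# Both input shapes produce the same kind of result.
from_tuples = get_language_info_list([("de", "German"), ("pl", "Polish")])
from_codes = get_language_info_list(["de", "pl"])
```

Checking the element type keeps the calling code simple, at the cost of the "automagic" behaviour the comment is unsure about.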
by , 17 years ago
Attachment: | 4030_language_info.diff added |
---|
implementation of the 2008-04-22 design, with tests and documentation
comment:16 by , 17 years ago
Replying to akaihola:
I went through this trying to think about the requirements from a web developer's standpoint and came up with a slightly modified API:
The attachment above implements this design. In addition,
- if language_info.py is incomplete or missing, django.utils.translation.get_language_info() still works and updates the in-memory cache for each requested language, and a warning is issued with instructions to run make-language-info.py;
- there are tests for the get_language_info() function, the template tags and the template filters;
- some documentation is added to i18n.txt; and
- the example language selection form in i18n.txt is updated to use local language names.
follow-up: 18 comment:17 by , 17 years ago
Anything that loops through every language is still not going to be appropriate here. That's a massive amount of memory usage, because every single MO file is loaded into memory. It's also not going to be particularly fast. And it provides more than one way to do something. Short version: don't do that. :-)
Instead, let's just have the one dictionary and that's all (I'm sort of so-so about the script to generate it. But, leave that in for the moment; it's probably the right thing to do). There's no need for all the fallbacks for a missing file or anything, though. That just means Django isn't correctly installed and that's not our problem. What other files haven't they installed?
Secondly, doesn't your solution possibly have a localisation problem? Specifically, if we load the language info into a template, I think we should also include the name of the language in the current locale. If I'm using that dictionary in Python code I probably need the English name (e.g. for a form field value), but in a template it might be optional (probably harmless, though). Definitely needed in the currently active locale, though (for the title attribute, for example).
I haven't had time to look at the rest of your proposal yet, but it's nice to see some action here. These are my thoughts from an initial read through. I'll have a bit more of a think about this shortly, but I suspect it's getting pretty close.
by , 17 years ago
Attachment: | 4030_language_info_v2.diff added |
---|
removed fallback for missing language_info.py, improved make-language-info.py
comment:18 by , 17 years ago
Replying to mtredinnick:
Anything that loops through every language is still not going to be appropriate here. That's a massive amount of memory usage, because every single MO file is loaded into memory. It's also not going to be particularly fast. And it provides more than one way to do something. Short version: don't do that. :-)
Ok, replaced that part with a simple ImportError.
Instead, let's just have the one dictionary and that's all (I'm sort of so-so about the script to generate it. But, leave that in for the moment; it's probably the right thing to do).
Ok, it's still there with a couple of improvements.
Secondly, doesn't your solution possibly have a localisation problem? Specifically, if we load the language info into a template, I think we should also include the name of the language in the current locale.
Ah, there's no shortcut for that currently, but doing that is as simple as
{% get_language_info for LANGUAGE_CODE as lang %}
{% trans lang.name %}
Would it make sense to always dynamically add the name of the language in the current locale to the info dict, no matter if it's needed or not?
Actually, then I'd change the key names (assuming Finnish as the active language in this example):
{'language_code': 'pl', 'name_english': 'Polish', 'name_local': 'Polski', 'name': 'puola', 'bidi': False}
The filters would be accordingly:
{{ "pl"|language_name_english }} ("Polish")
{{ "pl"|language_name_local }} ("Polski")
{{ "pl"|language_name }} ("puola")
{{ "pl"|bidi }} (False)
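The idea of always adding the name in the current locale to the info dict could be sketched as follows. This is an assumption-laden illustration, not the patch: STATIC_INFO and ACTIVE_CATALOG are hypothetical, and translate() stands in for whatever gettext lookup the active language would provide.

```python
# Hypothetical static data, using the key names proposed in this comment.
STATIC_INFO = {
    "pl": {"language_code": "pl", "name_english": "Polish",
           "name_local": "Polski", "bidi": False},
}

# Hypothetical stand-in for the active translation catalog (Finnish here).
ACTIVE_CATALOG = {"Polish": "puola"}


def translate(message):
    """Look the message up in the active catalog, falling back to the input."""
    return ACTIVE_CATALOG.get(message, message)


def get_language_info(code, translate=translate):
    """Return the static info plus 'name' in the currently active language."""
    info = dict(STATIC_INFO[code])  # copy so the static data stays untouched
    info["name"] = translate(info["name_english"])
    return info


info = get_language_info("pl")
```

Only the one string for the requested language is translated per call, so the static dictionary can stay locale-independent.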
I haven't had time to look at the rest of your proposal yet, but it's nice to see some action here. These are my thoughts from an initial read through. I'll have a bit more of a think about this shortly, but I suspect it's getting pretty close.
Nice if I'm working with something useful for others as well.
I noticed i18n is the topic and you the guest on the latest TWID, and you blogged about it as well. I'll listen and take a look – it's good that an important subject like this gets attention.
by , 16 years ago
Attachment: | 4030_language_info_v2_r8347.diff added |
---|
Patch adapted for Django r8347 (a bit after 1.0a2)
by , 16 years ago
Attachment: | 4030_language_info_v2_r8347_updated_languages.diff added |
---|
Updated the pre-generated language information dictionary according to current Django languages
comment:19 by , 16 years ago
Cc: | added |
---|
comment:20 by , 16 years ago
Cc: | added |
---|
comment:21 by , 16 years ago
Cc: | added |
---|
by , 16 years ago
Attachment: | 4030_language_info_v2_r10639.diff added |
---|
updated patch for revision 10639 (1.1beta1+)
comment:22 by , 16 years ago
Just updated the patch for Django revision 10639.
Bah, Trac doesn't display the patch properly. Quoting peritus from his comment to #9289:
The trac patch-viewer has problems showing patches from "git diff", which is an acceptable format for patches according to http://docs.djangoproject.com/en/dev/internals/contributing/#patch-style
Download the patch and view it with your favourite text editor and you will see the correct file names.
comment:23 by , 15 years ago
Cc: | added |
---|
by , 14 years ago
Attachment: | 4030-rc0.diff added |
---|
First release candidate patch for this ticket. Entirely based on work by akaihola.
comment:24 by , 14 years ago
Needs documentation: | unset |
---|---|
Patch needs improvement: | unset |
I've uploaded a new (RC 0, I intend to commit this ASAP) patch, updating the great work made by Antti Kaihola (akaihola) with the following changes:
- Removed the standalone make-lang-info.py tool. We've tried in the past to minimize the number of such commands, because each one means that we or downstream maintainers need to track another 'program', create man pages for it, etc. I've moved that functionality to a management command.
- The name of the command is makelanginfo; it can be changed if deemed not completely appropriate.
- (minor) Renamed the dictionary containing the languages metadata in django.conf.locale.language_info from language_info to lang_info.
- Moved the location of language_info.py from django/conf/ to django/conf/locale/. The new location seemed more appropriate, but I don't know if having a single .py file there among the translation subdirs (and in the future our .pot files) is totally correct.
- In django/utils/translation/__init__.py, moved the import of the dynamically generated django.conf.locale.language_info.lang_info dictionary from the module level to inside the get_language_info function. This removes the circular import in the management command that was previously solved by first creating an empty language_info.py. It also makes it unnecessary to force the .py -> .pyc compilation after the final .py file is created.
- Changed the description of the command in a few places to emphasize that this is a command generally not used by end users but rather by the Django developers. I think that once we have this in place, we can add the ability to handle additional metadata about languages outside of the Django tree as part of fixing #14461.
- Added documentation (section in django-admin management command document, django-admin.1 man page blurb).
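The effect of moving the import from module level into the function can be shown in isolation. The following is a single-file sketch, not the code from the patch: the module name language_info_demo and the helper install_generated_module are hypothetical, and the module is registered by hand in sys.modules to stand in for a file that a generator would write.

```python
import importlib
import sys
import types

# Hypothetical module name standing in for the generated data module.
DATA_MODULE = "language_info_demo"


def install_generated_module():
    """Simulate the generator having produced the data module."""
    module = types.ModuleType(DATA_MODULE)
    module.lang_info = {"de": {"name": "German", "name_local": "Deutsch"}}
    sys.modules[DATA_MODULE] = module


def get_language_info(code):
    # The import happens at call time, not when this code is first loaded,
    # so this function can be defined before the data module exists.
    lang_info = importlib.import_module(DATA_MODULE).lang_info
    return lang_info[code]


# Defining get_language_info above did not require the data module.
install_generated_module()
info = get_language_info("de")
```

Because nothing is imported until the first call, the code that generates the data module can itself import the translation utilities without needing a placeholder file.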
Open Questions:
- Is it OK to have automatic generation of a language_info.py under the Django tree that is later loaded as part of the I18N infrastructure? Or would it be better to e.g. create a JSON file? (Performance-wise, I think it is possible to cache its loading per process at runtime, as done in other parts of the framework.)
- Should we also add a wrapper function to django.utils.translation to allow access to the full lang_info dictionary from Python code?
- Should we move the get_language_info function from django/utils/translation/__init__.py to trans_real.py and trans_null.py like other functions there? If so, what should it return when I18N is turned off (trans_null.py)?
Reviews welcome!
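The JSON alternative raised in the first open question, with loading cached per process, could be sketched along these lines. This is only an illustration: LANG_INFO_JSON is a hypothetical inline payload standing in for a file on disk, and load_lang_info is not an existing Django function.

```python
import functools
import json

# Hypothetical JSON payload; in practice this would be read from a file.
LANG_INFO_JSON = (
    '{"de": {"name": "German", "name_local": "Deutsch", "bidi": false}}'
)


@functools.lru_cache(maxsize=None)
def load_lang_info():
    """Parse the JSON once per process; later calls reuse the cached dict."""
    return json.loads(LANG_INFO_JSON)


def get_language_info(code):
    return load_lang_info()[code]


first = load_lang_info()
second = load_lang_info()  # served from the cache, no second parse
```

The cached object is shared between callers, so code that wants to modify an entry should copy it first.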
comment:25 by , 14 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
comment:26 by , 13 years ago
Easy pickings: | unset |
---|---|
Severity: | → Normal |
Type: | → Uncategorized |
UI/UX: | unset |
I just want to comment that users of a website will see a mess when the required fonts have not been installed on their computers. I have considered this problem for my own websites (which are highly localised), and I really only found one good way around it.
So the problem is that if you use many languages, and present the text as fonts, and the users do not have all of the fonts required by those languages installed, then there will be a mess on the page. This is in my opinion completely unacceptable. Wikipedia is one example where this mess regularly occurs - you know they have the language list on the edge of the page and the mess appears there.
One good way to solve this problem is by making such language selection have images instead of HTML text. This solution works very well. You can have a textual language selection list in English always present at the top of all pages, but you can additionally have a language selection page where you can have the image buttons and those would have the names of the languages in the respective languages. This way you will not get any mess displayed to the users. An example of this idea in use is at the NHK World website.
It is just something to consider for your own website design work and not directly about Django.
You're right about the need to support that use-case. We also need to keep the current behaviour.
The language names are stored as their English versions (like every other string in Django) and that shouldn't change. So, one problem to solve is how to get the right translation strings in the first place. It may be possible to rummage through every PO file for the right name at startup and cache it in trans_real.py.
The second thing is how to make this accessible to the caller. That is probably best done by a function available under django.utils.translation. It could be a different function to get_available_languages() because we need both behaviours anyway and that would avoid any unnecessary backwards-incompatibility issues.
An interesting problem to solve here is what if the resulting string (of all the languages) is not representable in the output encoding. For example, you will have trouble showing Chinese characters in Russia's KOI8-R encoding. That's something we can solve in the unicode branch, I guess, but it's worth paying attention to.
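The encoding concern is easy to reproduce with Python's standard codecs. The sketch below shows one possible fallback, numeric character references, which is only meaningful for HTML output; the helper name encode_for_output is hypothetical and this is not how the ticket resolved the issue.

```python
def encode_for_output(text, encoding):
    """Encode text, falling back to numeric character references.

    Returns a tuple: (encoded bytes, whether the text was fully representable).
    """
    try:
        return text.encode(encoding), True
    except UnicodeEncodeError:
        # Replace unencodable characters with &#NNNN; references.
        return text.encode(encoding, errors="xmlcharrefreplace"), False


# Cyrillic fits into KOI8-R; Chinese characters do not.
russian, ok_ru = encode_for_output("Русский", "koi8-r")
chinese, ok_zh = encode_for_output("中文", "koi8-r")
```

Here ok_ru is True, while the Chinese name cannot be encoded directly and comes back as character references with ok_zh set to False.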