Opened 6 years ago

Last modified 9 months ago

#11688 new New feature

verbose_name should allow dynamical translation based on a number

Reported by: mitar
Owned by: nobody
Component: Internationalization
Version: 1.1
Severity: Normal
Keywords:
Cc: mmitar@…, sirexas@…, 4glitch@…, shaib
Triage Stage: Accepted
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: yes
Easy pickings: no
UI/UX: no

Description

The verbose_name option in model meta data should allow dynamic translation based on a number of elements, through ungettext. The two possibilities available today, verbose_name and verbose_name_plural, are not enough for languages with complex plural forms.

Django could check if verbose_name is a function and call it with a number of elements as a parameter. This would also be backwards compatible.
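
A minimal sketch of that check, for illustration only. resolve_verbose_name and book_name are hypothetical names, not existing Django API, and the standard-library gettext module stands in for Django's translation functions:

```python
import gettext
from typing import Callable, Union

VerboseName = Union[str, Callable[[int], str]]


def resolve_verbose_name(verbose_name: VerboseName, count: int = 1) -> str:
    """Return the name to display for `count` elements (hypothetical helper)."""
    if callable(verbose_name):
        # New behaviour: let the model pick the form from the number.
        return verbose_name(count)
    # Old behaviour: a plain string is returned unchanged.
    return verbose_name


def book_name(count: int) -> str:
    # With a catalog installed, ngettext chooses among all plural forms of
    # the active language; without one it falls back to the English rule.
    return gettext.ngettext("book", "books", count)
```

A model could then set verbose_name = book_name, while existing models that use a plain string would keep working.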

It should also allow a second parameter, a context, as I have described in #11686, because in some languages counting also depends on context (such as the case of a counted noun). It should probably default to something like a view name (or something else that correlates with grammatical context). At the very least, it should be possible to specify it in a model definition, to differentiate between translations of the same word used as a model name and used elsewhere.

Attachments (2)

11688-verbose_name_plural_evolution-1.diff (77.9 KB) - added by ramiro 4 years ago.
First iteration of a proposed implementation of this feature
11688-verbose_name_plural_evolution-2.diff (85.1 KB) - added by ramiro 4 years ago.
Moved get_verbose_name() method to ._meta Options class, updated and enhanced documentation


Change History (21)

comment:1 Changed 6 years ago by ubernostrum

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Resolution set to duplicate
  • Status changed from new to closed

Duplicate of #794, #2202 and #5997. Please search existing tickets before filing new ones.

comment:2 Changed 6 years ago by mtredinnick

  • Resolution duplicate deleted
  • Status changed from closed to reopened

This isn't a dupe of the previous tickets, which are about template tags and something a bit undefined (in the case of #5997). Worth having open, although I'm not sure the proposed solution is the right one. We already know how to translate things with plural forms; we need to be able to pass in the right number at the right time. This also requires solving the translation of model names, which is something Ramiro Morales and I have been talking about as a "must do someday" thing for a while now.
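
For reference, passing the number in at the point of use looks roughly like this with the standard-library gettext module (Django's ungettext takes the same arguments). result_count_label is a hypothetical helper, and no catalog is assumed to be installed, so the English fallback applies:

```python
import gettext


def result_count_label(count: int) -> str:
    # The whole phrase is the translatable unit, and the number is known at
    # the moment of the lookup, so the catalog can pick the right form.
    template = gettext.ngettext("%(count)d book", "%(count)d books", count)
    return template % {"count": count}
```

The open question in this ticket is how to do the same when "book" is not a literal but a model's verbose_name.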

comment:3 Changed 5 years ago by russellm

  • Triage Stage changed from Unreviewed to Accepted

comment:4 Changed 4 years ago by julien

  • Severity set to Normal
  • Type set to New feature

Changed 4 years ago by ramiro

First iteration of a proposed implementation of this feature

comment:5 Changed 4 years ago by ramiro

  • Easy pickings unset
  • Has patch set
  • UI/UX unset

Changed 4 years ago by ramiro

Moved get_verbose_name() method to ._meta Options class, updated and enhanced documentation

comment:6 Changed 3 years ago by sirex

  • Cc sirexas@… added
  • Patch needs improvement set

I just merged this patch with the latest master branch; the result is committed to a cloned branch at https://github.com/sirex/django

All tests pass.

I'm planning to look at this issue in more detail and test it more thoroughly. I also plan to extend it to support not only multiple plural forms, but also different forms of the model name depending on the context where it is used. For example, Lithuanian (like Russian) has 7 noun forms, each used depending on context. „Add book“ is currently displayed in Lithuanian as „Pridėti knyga“ (first form), but the second form should be used instead: „Pridėti knygą“. Without these forms, the Django admin looks really broken.

I'm planning to fix these context issues by using pgettext: https://docs.djangoproject.com/en/dev/topics/i18n/translation/#contextual-markers
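
As a rough illustration of contextual markers, here is the same idea with the standard-library gettext.pgettext (Python 3.8+), which takes (context, message) like the Django function. The context name "accusative" and the helper add_label are assumptions made up for this sketch:

```python
import gettext


def add_label(model_name: str) -> str:
    # Look the noun up under a context, so that a catalog can hold a
    # separate entry for the form needed after "Add" (hypothetical context).
    name = gettext.pgettext("accusative", model_name)
    return gettext.gettext("Add %(name)s") % {"name": name}
```

With no catalog installed this simply returns the English text; a Lithuanian catalog with an entry for the "accusative" context of "book" could return „knygą“ instead of „knyga“.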

comment:7 Changed 3 years ago by sirex

After some research, I found that it is quite hard to make dynamic model names
display correctly in all languages.

Most languages divide nouns into singular, plural, numeric and
grammatical case forms.

The trickiest part is grammatical case; most languages share the
grammatical cases described here:

http://en.wikipedia.org/wiki/Grammatical_case

Overall there are these grammatical cases:

  • nominative
  • accusative
  • dative
  • ablative
  • genitive
  • vocative
  • locative
  • instrumental

Each grammatical case has different singular, plural and numeric forms.

After trying to put all this in the model class, I decided that this heavily
violates the KISS principle, because each model must describe all possible
variations of these forms.

So in my opinion, this feature must be implemented by automatically
generating strings in .po files, leaving models with just one verbose name:

from django.db import models
from django.utils.translation import Noun

class Book(models.Model):
    title = models.CharField(max_length=128)

    class Meta:
        verbose_name = Noun('book')

After running ./manage.py makemessages -l lt, these strings must be
generated:

# nominative
msgid "book"
msgstr "knyga"

msgctxt "plural"
msgid "books"
msgstr "knygos"

msgctxt "numeric"
msgid "book"
msgid_plural "books"
msgstr[0] "knyga"
msgstr[1] "knygos"
msgstr[2] "knygų"


# accusative
msgctxt "accusative"
msgid "book"
msgstr "knygą"

msgctxt "accusative plural"
msgid "books"
msgstr "knygas"

msgctxt "accusative numeric"
msgid "book"
msgid_plural "books"
msgstr[0] "knygą"
msgstr[1] "knygas"
msgstr[2] "knygų"


# dative
msgctxt "dative"
msgid "book"
msgstr "knygai"

msgctxt "dative plural"
msgid "books"
msgstr "knygoms"

msgctxt "dative numeric"
msgid "book"
msgid_plural "books"
msgstr[0] "knygai"
msgstr[1] "knygoms"
msgstr[2] "knygų"


# ablative
...

Not all languages have grammatical cases, and those that do may not have all of
them, so I guess the available grammatical cases should be listed in
django.conf.locale.<LANG> to get smaller .po files for languages that
do not have grammatical cases.
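
A sketch of how such a per-language list could drive the generated entries. GRAMMATICAL_CASES, Entry and noun_entries are hypothetical names, not part of Django or makemessages; the context strings follow the .po excerpt above:

```python
from typing import Dict, List, NamedTuple, Optional

# Hypothetical per-language setting; the nominative needs no context.
GRAMMATICAL_CASES: Dict[str, List[str]] = {
    "en": [],
    "lt": ["accusative", "dative", "genitive",
           "vocative", "locative", "instrumental"],
}


class Entry(NamedTuple):
    msgctxt: Optional[str]
    msgid: str
    msgid_plural: Optional[str]


def noun_entries(singular: str, plural: str, lang: str) -> List[Entry]:
    """List the catalog entries one Noun would need in `lang`."""
    cases: List[Optional[str]] = [None]
    cases.extend(GRAMMATICAL_CASES.get(lang, []))
    entries = []
    for case in cases:
        prefix = f"{case} " if case else ""
        entries.append(Entry(case, singular, None))                     # singular
        entries.append(Entry(f"{prefix}plural", plural, None))          # plural
        entries.append(Entry(f"{prefix}numeric", singular, plural))     # counted
    return entries
```

Under these assumptions a language without cases gets 3 entries per noun and Lithuanian gets 21.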

Finally, an instance of Noun('book') should work this way (all examples
use Lithuanian, which has all grammatical cases except
ablative):

  1. Singular forms:
Noun('book')               -> knyga
Noun('book').accusative    -> knygą
Noun('book').dative        -> knygai
Noun('book').ablative      -> nuo knygos
Noun('book').genitive      -> knygos
Noun('book').vocative      -> knyga
Noun('book').locative      -> knygoje
Noun('book').instrumental  -> knyga
  2. Plural forms:
Noun('book').plural               -> knygos
Noun('book').plural_accusative    -> knygas
Noun('book').plural_dative        -> knygoms
Noun('book').plural_ablative      -> nuo knygų
Noun('book').plural_genitive      -> knygų
Noun('book').plural_vocative      -> knygos
Noun('book').plural_locative      -> knygose
Noun('book').plural_instrumental  -> knygomis
  3. Numeric forms:
Noun('book')[0]   -> knygų
Noun('book')[1]   -> knyga
Noun('book')[2]   -> knygos
Noun('book')[10]  -> knygų

Noun('book')[0]               -> knygų
Noun('book')[0].accusative    -> knygų
Noun('book')[0].dative        -> knygų
Noun('book')[0].ablative      -> nuo knygų
Noun('book')[0].genitive      -> knygų
Noun('book')[0].vocative      -> knygų
Noun('book')[0].locative      -> knygų
Noun('book')[0].instrumental  -> knygų
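
A toy sketch of such an object, for illustration only. The forms are hard-coded from the examples above rather than read from a .po file, the plural rule is the one commonly used for Lithuanian gettext catalogs (an assumption, it is not stated in this ticket), and chaining a case onto a numeric form is left out:

```python
CASES = ("nominative", "accusative", "dative", "genitive",
         "vocative", "locative", "instrumental")

# Lithuanian forms copied from the examples above.
SINGULAR = {
    "book": {
        "nominative": "knyga", "accusative": "knygą", "dative": "knygai",
        "genitive": "knygos", "vocative": "knyga", "locative": "knygoje",
        "instrumental": "knyga",
    },
}
NUMERIC = {"book": ("knyga", "knygos", "knygų")}  # msgstr[0], [1], [2]


def lt_plural_index(n: int) -> int:
    """Assumed Lithuanian plural rule, returning an index into msgstr[]."""
    if n % 10 == 1 and n % 100 != 11:
        return 0
    if n % 10 >= 2 and (n % 100 < 10 or n % 100 >= 20):
        return 1
    return 2


class Noun:
    def __init__(self, msgid: str) -> None:
        self.msgid = msgid

    def _form(self, case: str) -> str:
        # Fall back to the untranslated msgid when no form is known.
        return SINGULAR.get(self.msgid, {}).get(case, self.msgid)

    def __str__(self) -> str:
        return self._form("nominative")

    def __getattr__(self, name: str) -> str:
        # Called only when normal attribute lookup fails.
        if name in CASES:
            return self._form(name)
        raise AttributeError(name)

    def __getitem__(self, count: int) -> str:
        forms = NUMERIC.get(self.msgid)
        return self.msgid if forms is None else forms[lt_plural_index(count)]
```

A real implementation would have to resolve the forms lazily against the active language, as Django's lazy translation strings do.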

In Django code, the right noun form must be used in all places. For example:

{% blocktrans with cl.opts.verbose_name.accusative as name %}
    Add {{ name }}
{% endblocktrans %}

This way, quite a few countries will be able to translate this into the
correct form.

comment:8 Changed 3 years ago by mitar

Wow. That looks like a good approach.

comment:9 follow-ups: Changed 3 years ago by lukeplant

The assumption that number and case are sufficient to distinguish all forms of a word is wrong.

For example, there is also agreement: http://en.wikipedia.org/wiki/Verb#Agreement

Using the examples given by Wikipedia, and the example above, the word for "add" could change depending on the gender of the word used for the model for Georgian and Basque.

There are all kinds of other things too - for example, in Greek (at least NT Greek), the choice of case depends on prepositions in such a way that a template author could not possibly know which case to use.

I'm sure there are many other things of which I'm unaware. Essentially, any attempt to build up sentences by substituting words into a template is going to fail. This is a hard problem. The questions we can ask are:

  • how many and how bad are the current failures?
  • what fraction would be fixed by adding a certain mechanism (e.g. ability to distinguish case)?
  • how much will the mechanism require to implement, for Django core developers, for template authors and for translators?
  • is the mechanism extendable to deal with things we haven't thought of yet?

I'm inclined to think that mechanisms that depend on specific language features (e.g. agreement, case etc) are the wrong approach, and a more generic 'context' should be added that allows specific translations for specific sentences.

comment:10 in reply to: ↑ 9 Changed 3 years ago by claudep

Replying to lukeplant:

I'm sure there are many other things of which I'm unaware. Essentially, any attempt to build up sentences by substituting words into a template is going to fail. This is a hard problem.

++1

I would be strongly -1 on any solution trying to build translated sentences by blocks. It will always fail in one way or another. In the "Add {{ name }}" example above, I think that most languages can translate it as something like "Add '{{ name }}' object", so that the accusative falls on the intermediate term (object) and not on "name". That's just an example, but each language has to find workarounds adapted to its rules.

comment:11 in reply to: ↑ 9 Changed 3 years ago by sirex

Replying to lukeplant:

I'm sure there are many other things of which I'm unaware. Essentially, any attempt to build up sentences by substituting words into a template is going to fail. This is a hard problem.

Using the model class name in a human-readable context is already a failure. And yes, fixing it is a hard problem. For English it is already fixed, by adding 's' or using 'verbose_name_plural' for exceptional cases. But the current implementation does not work for most languages. What I'm trying to do is make it more flexible and extensible, to support most Indo-European languages and probably most other languages. I'm sure this approach will not cover all possible cases, but it will at least cover many more languages than we do now.

If you think that having the model class name in a human-readable context is a failure, then we should drop this feature altogether, so that it works for all languages. If you think we should keep this feature, then why do you refuse to fix all those English-only places?

The questions we can ask are:

  • how many and how bad are the current failures?

The current Django admin implementation is heavily hardcoded for English only. My company can't offer the Django admin to any client because, out of the box, many places use incorrect word forms. It sounds something like "10 book" where it should be "10 books". Actually things are much worse: Lithuanian, for example, has 15 noun forms and Django uses only 2 of them, so most places are incorrect. Unless you think that saying "10 book" is OK?

I tried workarounds in the translation files, such as "New: '{{ name }}'", but they sound unnatural, and many places are hardcoded in the Django admin templates, so even these workarounds are not possible in places like this:

{{ cl.result_count }} {% ifequal cl.result_count 1 %}{{ cl.opts.verbose_name }}{% else %}{{ cl.opts.verbose_name_plural }}{% endifequal %}

So if we want to go this way, we should replace all places in the Django admin with neutral forms such as "New: {{ name }}" or "Number of {{ name }} items: {{ count }}". But of course this will sound unnatural. "Add {{ name }}" and "{{ count }} {{ name }}" are much better.

  • what fraction would be fixed by adding a certain mechanism (e.g. ability to distinguish case)?

At least for Lithuanian, separating singular/plural forms, replacing all number contexts with ungettext and adding the ability to distinguish grammatical case would fix all places; all texts would sound natural and be correct. I believe the same approach will work for most other Indo-European languages that have nouns with several grammatical cases.

  • how much will the mechanism require to implement, for Django core developers, for template authors and for translators?

The proposed mechanism is fully backward compatible, since it replaces verbose_name with a string-like object with additional features. For this to take effect, all places where verbose_name is used must be adjusted to the correct contextual form, for example: "Add {{ model.verbose_name.accusative }}". If verbose_name is left as is, nothing changes. Template authors will be able to use the correct form of the model verbose name if they want.

Translators of languages with several grammatical cases will have more strings to translate, but for languages like English, with two noun forms, nothing changes.

  • is the mechanism extendable to deal with things we haven't thought of yet?

Currently, the mechanism provides full flexibility for a single model verbose name. If a language needs more context to decide the correct noun form, this context can be added easily.

What I can think of is that this mechanism will not work in cases where not only the model name has forms, but the surrounding text also somehow depends on the case of the model verbose name (I'm not sure whether any language has this).

I'm inclined to think that mechanisms that depend on specific language features (e.g. agreement, case etc) are the wrong approach, and a more generic 'context' should be added that allows specific translations for specific sentences.

Agreement is for verbs, and verbs are very complicated: in Lithuanian, one verb usually has more than 200 forms. But model names are nouns, because all names of things are nouns.

Django already uses language-specific features, like singular/plural forms. But it currently lacks number forms and grammatical cases, which are very common in other languages. So if Django already has language-specific features, we should make them more complete.

comment:12 Changed 3 years ago by lukeplant

This suggestion really isn't going to fix it even for Indo-European languages, and I think your claim that it does is based entirely on wishful thinking. You haven't shown any stats, just "I believe". (I'm not blaming you for this - we've already got at least 70 languages in Django, and researching all this correctly would be a huge job).

For example, take the phrase "Delete selected %(verbose_name_plural)s". In French, adjectives must agree with the noun, so the translation of the word "selected" needs to agree with the gender of whatever word gets inserted into the phrase.

The current Django translation is: "Supprimer les %(verbose_name_plural)s sélectionnés". For all feminine model names, this is incorrect - it should be: "Supprimer les %(verbose_name_plural)s sélectionnées" (extra e). In other places the current French translations have workarounds like "%(object)s supprimé(e)s", where "supprimé(e)s" is a bit like doing "book(s)".

Your proposed solution doesn't accommodate this problem, and it is a very basic requirement in many European languages, and probably many others.

We are currently at a 95% solution:

  • Most sentences/phrases don't need substitutions
  • We can cover noun/number agreement using ngettext(). The original ticket was about fixing an instance where we are not doing that, and we should definitely try to fix that case.

I agree it would be nice to fix the remaining 5%. But we're not going to agree to a solution that involves a large amount of work, causes a very significant increase in complexity for developers and actually only fixes a small (or completely unknown) fraction of the remaining 5%.

A more generic solution might look something like this:

For any translatable string we have:

  • A template
  • A substitution (let's assume just 1 for now)

Then:

  • Any number of properties of the template might cause a different variant of the substitution to be used (e.g. if the substitution is a noun, the template sentence might put the noun in the accusative/nominative etc. position, causing that variant of the noun to be needed)
  • Any number of properties of the substitution might cause a different variant of the template to be used. (e.g. if the substitution is a noun and has gender, a different sentence is needed that has the right gender adjective).

(The symmetry between template/substitution and the need for more than one substitution makes me think this needs to be modelled in a more general way, but I'll carry on for now).

Now, the properties that cause variants are known to the translators, not to the template authors. Translators need a way to specify the properties and the variants, probably with some kind of pattern matching language for selecting the correct variant.
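
To make the two directions concrete, here is a toy model of the idea, not a proposal for an API. Substitution, Template and render are made-up names; the Lithuanian strings are the ones quoted elsewhere in this ticket, and a plain dictionary lookup stands in for the pattern matching language:

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Substitution:
    properties: Dict[str, str]  # known to the translator, e.g. gender
    variants: Dict[str, str]    # keyed by the form a template asks for


@dataclass
class Template:
    requires: str               # which variant of the substitution is needed
    selector: str               # which substitution property picks the sentence
    variants: Dict[str, str]    # sentences keyed by that property's value


def render(template: Template, sub: Substitution) -> str:
    # A property of the substitution selects the template variant...
    sentence = template.variants[sub.properties[template.selector]]
    # ...and a property of the template selects the substitution variant.
    return sentence % {"name": sub.variants[template.requires]}


books = Substitution(
    properties={"gender": "feminine"},
    variants={"nominative plural": "knygos", "accusative plural": "knygas"},
)
delete_selected = Template(
    requires="accusative plural",
    selector="gender",
    variants={
        "masculine": "Ištrinti pažymėtus %(name)s",
        "feminine": "Ištrinti pažymėtas %(name)s",
    },
)
```

Here render(delete_selected, books) combines the feminine sentence with the accusative plural noun. Every extra property multiplies the number of variants a translator has to supply, which is the combinatorial concern raised in point 1 below.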

However:

  1. I don't know if my analysis really covers it — I suspect at the very least that combinatorial explosion of properties/variants could make it totally impractical.
  2. I really think that we should not be inventing solutions here. It is a general problem, there must be general solutions already invented, and we should re-use those, or develop them independently of Django.

Finally, we shouldn't derail this ticket with these concerns. Thanks so much for your work in pushing this ticket forward, but let's have another ticket to cover issues beyond noun/number agreement.

comment:13 Changed 3 years ago by mitar

I would just add that when I opened this ticket, I also mentioned case-matching, not just number-matching, in the third paragraph. So in my language, too, number-matching alone will not be enough.

comment:14 Changed 3 years ago by sirex

This ticket also covers case-matching; the current patch adds one layer of complexity but actually fixes only number-related issues. What I'm proposing is to fix this as a whole using one flexible solution, instead of fixing just small parts of it by implementing different techniques for each issue.

lukeplant, thank you for pointing out adjectives, because Lithuanian has the same situation: "Delete selected %(verbose_name_plural)s" must be translated as "Ištrinti pažymėtus %(verbose_name_plural)s" for masculine gender, and as "Ištrinti pažymėtas %(verbose_name_plural)s" for feminine.

In combination, mmitar's and my proposed solutions solve:

  • number context issues
  • grammatical case issues

Unsolved issues:

I did some research but could not find anything related to this, except the number context that is included in gettext. gettext also has a context feature that we can reuse.

To fix the adjective agreement issue with the proposed solution, we can add a gender context to translatable strings that have adjectives describing a dynamic model name.

For this, in the model's Meta class we need to add verbose_name_gender:

from django.db import models
from django.utils.translation import Noun

class Book(models.Model):
    title = models.CharField(max_length=128)

    class Meta:
        verbose_name = Noun('book', gender='feminine')

Then, in strings with a model name, we can use something like this:

pgettext(model.opts.verbose_name.gender, 'Delete selected %(verbose_name)s') % {'verbose_name': model.opts.verbose_name.accusative}

And makemessages, reading this Noun('book', gender='feminine'), will write both genders to the .po file, and translators will be able to translate it correctly.

So with this, at least for Lithuanian, all issues seem to be solved. The solution is still fully backward compatible, and for those who are not interested in supporting many languages where model names are used, nothing changes. But for those who want to support other languages, it will be easy to add correct grammatical cases and genders where needed.

comment:15 Changed 3 years ago by sirex

I tried to find all places where verbose_name is involved; here is a list of what I found, with the noun forms and context for the whole string:

[ genitive ]             "Cannot delete %(name)s"
[ accusative, @gender ]  "Delete selected %(verbose_name_plural)s"
[]                       '%(verbose_name)s: %(obj)s'
[ @gender ]              'Added %(name)s "%(object)s".'
[ dative,  @gender ]     'Changed %(list)s for %(name)s "%(object)s".'
[ @gender ]              'Deleted %(name)s "%(object)s".'
[ @gender ]              'The %(name)s "%(obj)s" was added successfully.'
[ accusative ]           "You may add another %(verbose_name)s below."
[ @gender ]              'The %(name)s "%(obj)s" was changed successfully.'
[ @gender ]              'The %(name)s "%(obj)s" was added successfully. You may edit it again below.'
[ accusative ]           "You may add another %(verbose_name)s below."
[ accusative ]           'Add %(verbose_name)s'
[ @gender ]              '%(name)s object with primary key %(key)r does not exist.'
[ accusative ]           'Change %s'
[ genitive, number ]     "%(count)s %(name)s was changed successfully."
[ @gender ]              '%(name)s object with primary key %(key)r does not exist.'
[ @gender ]              'The %(name)s "%(obj)s" was deleted successfully.'
[ genitive ]             "Cannot delete %(name)s"
[ accusative ]           {% blocktrans with cl.opts.verbose_name as name %}Add {{ name }}{% endblocktrans %}
[ accusative ]           {% trans 'Add' %} {{ opts.verbose_name }}
[ accusative ]           {% blocktrans with verbose_name=inline_admin_formset.opts.verbose_name|title %}Add another {{ verbose_name }}{% endblocktrans %}
[ accusative ]           'Select %(verbose_name)s'
[ accusative,  @gender ] 'Select %(verbose_name)s to change'
[ genitive, number ]     '%d %(verbose_name)s' % (self.result_count, name)
[ genitive ]             {{ field.verbose_name|capfirst }} calendar
[ accusative ]           {{ model.verbose_name_plural|capfirst }} by {{ field.verbose_name }}
[ accusative ]           Home / {{ model.verbose_name_plural|capfirst }} / By {{ field.field.verbose_name }}
[ accusative ]           {{ model.verbose_name_plural|capfirst }} by {{ field.field.verbose_name }}
[ instrumental ]         {{ model.verbose_name_plural|capfirst }} with {{ field.field.verbose_name }} {{ value }}
[ accusative ]           Home / {{ model.verbose_name_plural|capfirst }} / Fields / By {{ field.field.verbose_name }} / {{ value }}
[ genitive, number ]     {{ object_list.count }} {% if object_list.count|pluralize %}{{ model.verbose_name_plural }}{% else %}{{ model.verbose_name }}{% endif %} with {{ field.field.verbose_name }} {{ value }}
[ plural ]               Home / {{ model.verbose_name_plural|capfirst }}
[ genitive, number ]     {{ model.objects.count }} {% if model.objects.count|pluralize %}{{ model.verbose_name_plural }}{% else %}{{ model.verbose_name }}{% endif %}
[]                       {{ object.model.verbose_name|capfirst }}: {{ object }}
[ plural ]               Home / {{ object.model.verbose_name_plural|capfirst }} / {{ object }}
[]                       {{ object.model.verbose_name|capfirst }}: {{ object }}
[ plural, @gender ]      Appears in "{{ related_object.related_field }}" in the following {{ related_object.model.verbose_name_plural }}:
[ plural, instrumental ] {{ model.verbose_name_plural|capfirst }} with {{ field.verbose_name }} in {{ month|date:"F Y" }}
[ accusative, plural ]   Home / {{ model.verbose_name_plural|capfirst }} / Calendars / By {{ field.verbose_name }} / {{ month|date:"Y" }} / {{ month|date:"F" }}
[ genitive, number ]     {{ object_list.count }} {% if object_list.count|pluralize %}{{ model.verbose_name_plural }}{% else %}{{ model.verbose_name }}{% endif %} with {{ field.verbose_name }} on {{ month|date:"F Y" }}
[ plural, accusative ]   {{ model.verbose_name_plural|capfirst }} by {{ field.field.verbose_name }}
[ plural, accusative ]   Home / {{ model.verbose_name_plural|capfirst }} / Fields / By {{ field.field.verbose_name }}
[ plural, accusative ]   {{ model.verbose_name_plural|capfirst }} by {{ field.field.verbose_name }}
[ plural, instrumental ] {{ model.verbose_name_plural|capfirst }} with {{ field.verbose_name }} {{ day|date:"F j, Y" }}
[ plural, accusative ]   Home / {{ model.verbose_name_plural|capfirst }} / Calendars / By {{ field.verbose_name }} / {{ day|date:"Y" }} / {{ day|date:"F" }} / {{ day|date:"d" }}
[ genitive, number ]     {{ object_list.count }} {% if object_list.count|pluralize %}{{ model.verbose_name_plural }}{% else %}{{ model.verbose_name }}{% endif %} with {{ field.verbose_name }} on {{ day|date:"F j, Y" }}
[ plural, locative ]     Browsable fields in {{ model.verbose_name_plural }}
[ plural ]               Home / {{ model.verbose_name_plural|capfirst }} / Fields
[ plural, locative ]     Browsable fields in {{ model.verbose_name_plural }}
[ plural, accusative ]   {{ model.verbose_name_plural|capfirst }} by {{ field.verbose_name }}
[ plural ]               Home / {{ model.verbose_name_plural|capfirst }} / Calendars
[ plural, accusative ]   {{ model.verbose_name_plural|capfirst }} by {{ field.verbose_name }}
[ plural, instrumental ] {{ model.verbose_name_plural|capfirst }} with {{ field.verbose_name }} in {{ year }}
[ plural, accusative ]   Home / {{ model.verbose_name_plural|capfirst }} / Calendars / By {{ field.verbose_name }} / {{ year }}
[ plural, instrumental ] {{ model.verbose_name_plural|capfirst }} with {{ field.verbose_name }} in {{ year }}
[ plural, accusative ]   {{ model.verbose_name_plural|capfirst }} by {{ field.field.verbose_name }}: {{ value }}
[ plural, accusative ]   Home / {{ model.verbose_name_plural|capfirst }} / By {{ field.field.verbose_name }} / {{ value }}
[ plural, accusative ]   {{ model.verbose_name_plural|capfirst }} by {{ field.field.verbose_name }}: {{ value }}
[ accusative ]           'Can %s %s' % (action, opts.verbose_name_raw)
[ genitive ]             "No %(verbose_name)s found matching the query"
[ genitive, @gender ]    "No %(verbose_name_plural)s available"
[ genitive, @gender ]    "No %(verbose_name_plural)s available"
[]                       "Future %(verbose_name_plural)s not available because %(class_name)s.allow_future is False."

At least for Lithuanian, plural, number, grammatical case and gender are enough for all these strings to be displayed correctly. As you can see, only 4 of 63 strings are currently displayed correctly in Lithuanian...

comment:16 Changed 3 years ago by anonymous

  • Cc 4glitch@… added

comment:17 Changed 3 years ago by lukeplant

As mentioned previously, the analysis of which case to use is not guaranteed to be the same for all languages, especially as different languages have different cases. So, we can't have a solution that requires putting case into the Django source code.

I've written a blog post with some of my ideas on this, but at the moment it is much too big a project, and would require a complete change to how we do translations (moving away from gettext entirely, and therefore massively backwards incompatible). http://lukeplant.me.uk/blog/posts/translating-sentences-with-substitutions/

comment:18 Changed 2 years ago by aaugustin

  • Status changed from reopened to new

comment:19 Changed 9 months ago by shaib

  • Cc shaib added