Translating content stored in models is a concern for many Django users. A variety of third-party solutions have been developed, but these solutions have many drawbacks. This page summarizes the current use cases and proposes an API for possibility (4). Related resources, discussions and tickets are listed at the end.
Simple language-aware model (1)
Some multilingual applications have no relations between different language versions of objects (e.g. a blog where each entry exists in one language only). Here the only thing of interest is which language the content is in, which the programmer can solve easily. This case works right now without any changes to Django.
    from django.db import models

    class BlogEntry(models.Model):
        language = models.CharField(max_length=20, db_index=True)
        title = models.CharField(max_length=255)
        # other fields
- each object is available only in one language
- no need to share logic between representations in different languages
- there is no representation of the object in another language
- one object per language
Multilingual model with multiple languages in one object
The translations for all languages are stored on the same object.
(2) One way could be:
    class BlogEntry(models.Model):
        title = models.CharField(max_length=255)
        title_de = models.CharField(max_length=255)
        title_ru = models.CharField(max_length=255)
        # etc.
- a conceptual object is represented by one real object
- Schema changes necessary for additional languages
- FKs cannot directly reference a specific language and also need a language attribute
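To make the trade-off of (2) concrete, here is a minimal sketch in plain Python (no Django) of how a per-language column might be read. The helper name `localized`, the `default_language` parameter and the rule that an empty column means "not translated" are assumptions made for illustration, not part of any existing package.

```python
from dataclasses import dataclass


@dataclass
class BlogEntry:
    """Plain-Python stand-in for the model above: one attribute per language."""
    title: str           # title in the default language
    title_de: str = ""
    title_ru: str = ""


def localized(obj, field, language, default_language="en"):
    """Return obj.<field>_<language>, falling back to the unsuffixed field.

    An empty or missing per-language attribute counts as "not translated".
    """
    if language != default_language:
        value = getattr(obj, f"{field}_{language}", "")
        if value:
            return value
    return getattr(obj, field)


entry = BlogEntry(title="Hello", title_de="Hallo")
print(localized(entry, "title", "de"))  # translated column is used
print(localized(entry, "title", "ru"))  # empty column, falls back to title
print(localized(entry, "title", "fr"))  # no such column at all, falls back
```

The last call shows the schema problem: a language without its own column can only ever fall back until the table is altered.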
(3) Another way could be: Translations are stored for each model in another model and are JOINed on demand.
- No schema changes
- every read of translated content needs a JOIN, which can be considered evil
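Approach (3) can be sketched at the SQL level with the standard-library `sqlite3` module. The table and column names below (`blog_entry`, `blog_entry_translation`, `entry_id`) are hypothetical, and the sketch ignores how an ORM layer would hide the JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE blog_entry (
        id INTEGER PRIMARY KEY,
        author TEXT NOT NULL               -- shared by all languages
    );
    CREATE TABLE blog_entry_translation (
        entry_id INTEGER NOT NULL REFERENCES blog_entry(id),
        language TEXT NOT NULL,
        title TEXT NOT NULL,               -- translated field
        PRIMARY KEY (entry_id, language)
    );
    INSERT INTO blog_entry (id, author) VALUES (1, 'alice');
    INSERT INTO blog_entry_translation VALUES (1, 'en', 'Hello');
    INSERT INTO blog_entry_translation VALUES (1, 'de', 'Hallo');
""")


def get_entry(conn, entry_id, language):
    """Fetch the shared fields plus the title for one language, or None."""
    return conn.execute(
        """
        SELECT e.id, e.author, t.language, t.title
        FROM blog_entry AS e
        JOIN blog_entry_translation AS t ON t.entry_id = e.id
        WHERE e.id = ? AND t.language = ?
        """,
        (entry_id, language),
    ).fetchone()


# Adding a language is an INSERT, not a schema change.
conn.execute("INSERT INTO blog_entry_translation VALUES (1, 'ru', 'Привет')")

print(get_entry(conn, 1, "de"))
print(get_entry(conn, 1, "ru"))
print(get_entry(conn, 1, "fr"))  # no translation row, so no result
```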
Multilingual model with one object per language (4)
The model is language-aware but also has an indirect reference to the same object in different languages via a common key:
    class BlogEntry(models.Model):
        language = models.CharField(max_length=20, db_index=True)
        group_id = models.CharField(max_length=36)  # e.g. UUID
        # All database objects that represent the same logical object in different languages
        # have the same group_id (possibly a UUID, may also be an integer).
        title = models.CharField(max_length=255)
        # other fields
- flexible with respect to additional languages
- querying stays easy, since basic model capabilities hold
- multiple model objects for one conceptual object (which does not really hurt)
- duplication of non-translatable fields (i.e. fields that stay the same across translations)
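The following sketch, again using `sqlite3` rather than Django, illustrates the two operations that matter most for (4): looking up one language version through the shared `group_id`, and keeping a duplicated non-translatable field in sync with a single UPDATE. The helper names `get_in_language` and `set_author` are hypothetical, and `author` is simplified to a text column.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE blog_blogentry (
        id INTEGER PRIMARY KEY,
        language TEXT NOT NULL,
        group_id TEXT NOT NULL,
        title TEXT NOT NULL,       -- translatable
        author TEXT NOT NULL,      -- duplicated, same in every language
        UNIQUE (group_id, language)
    )
""")

group_id = str(uuid.uuid4())
conn.executemany(
    "INSERT INTO blog_blogentry (language, group_id, title, author) "
    "VALUES (?, ?, ?, ?)",
    [("en", group_id, "Hello", "alice"), ("de", group_id, "Hallo", "alice")],
)


def get_in_language(conn, group_id, language):
    """Return (language, title, author) for one language version, or None."""
    return conn.execute(
        "SELECT language, title, author FROM blog_blogentry "
        "WHERE group_id = ? AND language = ?",
        (group_id, language),
    ).fetchone()


def set_author(conn, group_id, author):
    """Update a non-translatable field on every row of the group at once."""
    cursor = conn.execute(
        "UPDATE blog_blogentry SET author = ? WHERE group_id = ?",
        (author, group_id),
    )
    return cursor.rowcount  # number of language versions touched


updated = set_author(conn, group_id, "bob")
print(updated)
print(get_in_language(conn, group_id, "de"))
```

The UNIQUE constraint on (group_id, language) is a design choice of this sketch: it prevents two rows from claiming to be the same translation.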
API for model translation with respect to (4): a proposal for discussion
    class BlogEntry(TranslatableModel):
        title = models.CharField(max_length=255)
        author = models.ForeignKey(User)
        # To mark fields as translatable/untranslatable:
        # * title = models.CharField(max_length=255, translatable=True)
        # * author = models.ForeignKey(User, untranslatable=True)
        last_modified = models.DateTimeField(auto_now=True)

        # or introduce a field in a Meta class
        class Meta:
            untranslatable_fields = ('last_modified', 'author',)
            # or translatable_fields = ('title',)

        def __unicode__(self):
            return _(u"Blog entry '%s'") % self.title  # works of course

    # Relations:
    # A foreign key may conceptually be the following:
    # 1. it either references an object in a specific language
    # 2. or it references a conceptual object in all of its languages

    # 1.
    class Image(models.Model):
        # this ForeignKey can reference the PK of the BlogEntry object, no problem
        blog_entry_1 = models.ForeignKey(BlogEntry)
        # this ForeignKey may reference language and group_id as a key, more complicated
        blog_entry_2 = models.ForeignKey(BlogEntry)

    # 2.
    class Image(models.Model):
        # this ForeignKey can reference the PK of a BlogEntry object
        # and get all other languages via group_id
        blog_entry_1 = models.ForeignKey(BlogEntry)
        # this ForeignKey can reference the group_id of a BlogEntry object
        # and get objects in all languages
        blog_entry_2 = models.MultiLanguageForeignKey(BlogEntry)

    blog_entry = BlogEntry(title="Some title", author=some_user)
    blog_entry.save()  # saves in current language as default, new group

    # explicitly save for language, creates new group, new group_id is created
    blog_entry.save(language="en")

    # explicit language save, same group_id as other_blog_entry:
    blog_entry.save(language="en", connect_to=other_blog_entry)

    # Manager API that does implicit current language retrieval:
    blog_entry = BlogEntry.objects.get(group_id=UUID)  # returns blog entry in current language,
    # i.e. behind the scenes: BlogEntry.objects.get(group_id=UUID, language=CURRENT_LANGUAGE)

    # Updates
    blog_entry.author = some_other_user
    blog_entry.title = "Some other title"
    blog_entry.save()
    # author is marked as untranslatable, therefore all instances must be updated
    # this can be a single query:
    #   UPDATE blog_blogentry SET author_id=<some_other_user_id> WHERE group_id=<UUID>
    # but needs #4102 to be fixed
    # title is marked translatable, therefore only this instance is updated.
    # these are two queries, but they may be reduced to one when changes on fields
    # are tracked after instance creation and only one kind of field
    # (translatable xor untranslatable) is changed

    # Deletes
    blog_entry.delete()  # should only delete this language
    blog_entry.delete(all_languages=True)  # deletes the translated objects in all languages

    # Fallbacks
    # When a language is requested for which an object does not exist
    try:
        blog_entry = BlogEntry.objects.get(group_id=UUID)  # object does not exist in current language
    except BlogEntry.DoesNotExistInLanguage as e:  # a subclass of BlogEntry.DoesNotExist
        e.available_languages  # list of available languages or may even contain the respective objects
        # List can be shown to user as an option to view object in other language
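The proposed fallback exception can be tried out without Django. The sketch below is only an illustration of the idea: `InMemoryManager` is a hypothetical stand-in for a manager, the language is passed explicitly instead of being taken from the current language, and `available_languages` holds language codes rather than objects.

```python
class DoesNotExist(Exception):
    """Stand-in for a model's DoesNotExist exception."""


class DoesNotExistInLanguage(DoesNotExist):
    """Raised when the group exists, but not in the requested language."""

    def __init__(self, group_id, language, available_languages):
        super().__init__(f"group {group_id!r} has no {language!r} version")
        self.available_languages = available_languages


class InMemoryManager:
    """Hypothetical manager storing titles keyed by (group_id, language)."""

    def __init__(self):
        self._rows = {}

    def add(self, group_id, language, title):
        self._rows[(group_id, language)] = title

    def get(self, group_id, language):
        try:
            return self._rows[(group_id, language)]
        except KeyError:
            available = sorted(
                lang for (gid, lang) in self._rows if gid == group_id
            )
            if available:
                # The conceptual object exists, just not in this language.
                raise DoesNotExistInLanguage(group_id, language, available) from None
            raise DoesNotExist(f"no group {group_id!r}") from None


objects = InMemoryManager()
objects.add("g1", "en", "Hello")
objects.add("g1", "de", "Hallo")

try:
    objects.get("g1", "fr")
except DoesNotExistInLanguage as e:
    # The list could be offered to the user as alternative languages.
    print(e.available_languages)
```

Because the new exception subclasses `DoesNotExist`, existing code that only catches `DoesNotExist` keeps working.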
Things to note
- any time 'current language' is used it means django.utils.translation.get_language() which uses thread locals
- many more
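Why thread locals matter for the "current language" can be shown with a simplified stand-in. This is not Django's implementation; the function names merely mirror `django.utils.translation.activate()` and `get_language()`, and `DEFAULT_LANGUAGE` is an assumed placeholder for a project-wide default.

```python
import threading

_active = threading.local()   # each thread sees its own attributes
DEFAULT_LANGUAGE = "en"       # assumed stand-in for a project default


def activate(language):
    """Set the current language for the calling thread only."""
    _active.language = language


def get_language():
    """Return the calling thread's language, or the default if unset."""
    return getattr(_active, "language", DEFAULT_LANGUAGE)


results = {}


def worker(name, language):
    activate(language)
    results[name] = get_language()


threads = [
    threading.Thread(target=worker, args=(f"t-{lang}", lang))
    for lang in ("de", "ru")
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(results)          # each worker saw only its own language
print(get_language())   # the main thread never activated one
```

A consequence for the proposed API: any implicit "current language" lookup depends on which thread (typically which request) runs the query.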
Aspects and design Questions
- setting default locale in Managers?
- setting default locale in related objects?
- locale in URL?
- Mark model fields as internationalized through attribute or through meta class (first approach is bad for subclassing)
- getting current language easy, having easy access to other languages
- Models are often Subclassed, do the subclasses lose flexibility?
- is __unicode__(self) locale-aware?
  - yes, by the design (4)
- what if somebody needs to use internationalized objects together with something that adds complexity, such as django-reversion? Is that still possible, and which limitations result from a solution?
  - it must work anyway if they only use the documented API
- Language Fallbacks
  - you always want/need a fallback (flexible fallback strategy? who decides how the fallback works?)
  - mixing LTR and RTL languages causes issues if multiple languages are used in a page as a result of fallback
  - Solution to all of that: leave it to someone else (the application developer)
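If the fallback strategy is left to the application developer, it could be as small as an ordered list of languages. The sketch below is one possible strategy, not a proposal for Django itself: the function names, the rule that a regional code such as "de-at" first falls back to its base language, and the `RTL_LANGUAGES` set are all assumptions. Returning the text direction alongside the value lets a template mark up mixed LTR/RTL content.

```python
RTL_LANGUAGES = {"ar", "fa", "he", "ur"}  # assumed set for this sketch


def candidates(requested, fallback_chain):
    """Yield language codes to try, most specific first."""
    yield requested
    base = requested.split("-")[0]
    if base != requested:
        yield base                 # e.g. "de-at" falls back to "de"
    yield from fallback_chain


def resolve(translations, requested, fallback_chain):
    """Return (language, value, direction) for the first available language.

    Raises LookupError when neither the requested language nor any
    fallback has a translation.
    """
    for language in candidates(requested, fallback_chain):
        if language in translations:
            direction = "rtl" if language.split("-")[0] in RTL_LANGUAGES else "ltr"
            return language, translations[language], direction
    raise LookupError(f"no translation for {requested!r} or its fallbacks")


titles = {"en": "Hello", "de": "Hallo", "ar": "مرحبا"}
print(resolve(titles, "de-at", ["en"]))
print(resolve(titles, "fr", ["ar", "en"]))
```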
Indirect Issues, related problems
- Language in the URI, URL namespaces aware of language, reverse should be language-aware
- UI Multiforms for translations
- only languages officially supported by Django work or can be selected in settings.py (not sure if this is describing it exactly but there is a limitation)
- language selection
  - in URL as .../<locale>/...
  - in subdomain
  - as query string ?lang=en
  - as cookie (generally bad)
- language database table similar to ContentTypes database table?
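The language-selection options listed above can be compared side by side in a small sketch using only `urllib.parse`. The precedence order (path, then subdomain, then query string, then cookie), the `SUPPORTED` tuple and the cookie name `language` are assumptions chosen for illustration.

```python
from urllib.parse import parse_qs, urlsplit

SUPPORTED = ("en", "de", "ru")   # assumed list of enabled languages
DEFAULT = "en"


def language_from_request(url, cookies=None):
    """Pick a language from the URL or cookies, trying each source in turn."""
    parts = urlsplit(url)

    # 1. Locale as the first path segment: /de/blog/
    segments = [segment for segment in parts.path.split("/") if segment]
    if segments and segments[0] in SUPPORTED:
        return segments[0]

    # 2. Locale as subdomain: ru.example.com
    subdomain = (parts.hostname or "").split(".")[0]
    if subdomain in SUPPORTED:
        return subdomain

    # 3. Locale as query string: ?lang=en
    lang = parse_qs(parts.query).get("lang", [""])[0]
    if lang in SUPPORTED:
        return lang

    # 4. Locale from a cookie (invisible in the URL, hence hard to link to)
    lang = (cookies or {}).get("language", "")
    if lang in SUPPORTED:
        return lang

    return DEFAULT


print(language_from_request("http://example.com/de/blog/"))
print(language_from_request("http://ru.example.com/blog/"))
print(language_from_request("http://example.com/blog/?lang=de"))
print(language_from_request("http://example.com/blog/", {"language": "ru"}))
```

The first three sources keep the language visible in the URL, so a link always leads to the same language version; the cookie variant does not.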
A brief overview of existing approaches with implementing projects.
- property column is holding all translations (in a map)
- big table holding all translations
- table per property holding all languages of this property
- table for non-internationalized properties and an extra table for the localized version
- language is saved as an integer, which doesn't work for multi-site projects where the language ids in the settings can differ
- object per language, where the association between objects is provided by a group id (an update of the non-internationalized fields would overwrite these fields in all other language objects)
- a code example
- normal model with one language, additional table with all i18n fields in other languages
- django-model-i18n (status: experimental)
- Tickets: #6460 and #6952; not directly related: #9924 and #5446