Version 4 (modified by mizi, 13 years ago)

Introduction

Translating content stored in models is a concern for many projects. A variety of third-party solutions have been developed. However, these solutions have many drawbacks. This page summarizes the current use cases and proposes an API for possibility (4). Related resources, discussions and tickets are listed at the end.

Use Cases

Simple language-aware model (1)

Some multilingual applications have no relations between different language versions of objects (e.g. a blog where each entry exists in one language only). Here the only interesting question is which language the content is in, which the programmer can solve easily. This case works right now without any changes to Django.

Example:

from django.db import models

class BlogEntry(models.Model):
    language = models.CharField(max_length=20, db_index=True)
    title = models.CharField(max_length=255)
    # other fields

Specialities:

  • each object is available only in one language
  • no need to share logic between representations in different languages
  • there is no representation of the object in another language

Model implementation:

  • one object per language
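
A minimal sketch of case (1) in plain Python, without Django: selecting content comes down to filtering on the language attribute. The sample entries and the helper name entries_for are made up for illustration; with the model above the equivalent would be BlogEntry.objects.filter(language="en").

```python
from dataclasses import dataclass


@dataclass
class BlogEntry:
    """Plain-Python stand-in for the model above (no Django required)."""
    language: str
    title: str


# Illustrative sample data, not taken from any real site.
ENTRIES = [
    BlogEntry("en", "Hello"),
    BlogEntry("de", "Hallo"),
    BlogEntry("en", "Second post"),
]


def entries_for(language, entries=ENTRIES):
    """Return only the entries written in the given language."""
    return [entry for entry in entries if entry.language == language]
```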

Multilingual model with multiple languages in one object

The translations for all languages are stored on the same object.

(2) One way could be:

class BlogEntry(models.Model):
    title = models.CharField(max_length=255)
    title_de = models.CharField(max_length=255)
    title_ru = models.CharField(max_length=255)
    # etc.

Advantages:

  • a conceptual object is represented by one real object

Disadvantages:

  • Schema changes necessary for additional languages
  • FKs cannot directly reference a specific language and also need a language attribute
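
A rough sketch of how reading a field could work under (2), in plain Python. The helper localized and the rule that the unsuffixed column holds the default language ("en" here) are assumptions for illustration, not part of any existing package.

```python
class BlogEntry:
    """Plain-Python stand-in for the model above (no Django required)."""

    def __init__(self, title, title_de="", title_ru=""):
        self.title = title          # assumed to hold the default language
        self.title_de = title_de
        self.title_ru = title_ru


def localized(obj, field, language, default_language="en"):
    """Read `<field>_<language>`; fall back to the unsuffixed column.

    The fallback is used when the language is the default one, when no
    column exists for the language, or when that column is empty.
    """
    if language != default_language:
        value = getattr(obj, f"{field}_{language}", "")
        if value:
            return value
    return getattr(obj, field)
```

Note that a language such as "fr" silently falls back here because there is no title_fr column; supporting it would need the schema change listed under the disadvantages.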

(3) Another way could be: the translations for each model are stored in a separate model and are JOINed on demand.

Advantages:

  • No schema changes

Disadvantages:

  • every read of translated content needs a JOIN, which some consider evil (extra query cost and complexity)
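
To make (3) concrete, here is a small sketch using the standard library's sqlite3 module. The table and column names (blogentry, blogentry_translation) and the sample rows are assumptions chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE blogentry (id INTEGER PRIMARY KEY, author TEXT);
    CREATE TABLE blogentry_translation (
        entry_id INTEGER REFERENCES blogentry(id),
        language TEXT,
        title TEXT,
        UNIQUE (entry_id, language)
    );
    INSERT INTO blogentry VALUES (1, 'alice');
    INSERT INTO blogentry_translation VALUES (1, 'en', 'Hello');
    INSERT INTO blogentry_translation VALUES (1, 'de', 'Hallo');
""")


def fetch_entry(entry_id, language):
    """Return (author, title) for one language, or None if untranslated."""
    return conn.execute(
        "SELECT e.author, t.title FROM blogentry e "
        "JOIN blogentry_translation t ON t.entry_id = e.id "
        "WHERE e.id = ? AND t.language = ?",
        (entry_id, language),
    ).fetchone()


def add_translation(entry_id, language, title):
    """Adding a language is a plain INSERT; the schema stays untouched."""
    conn.execute(
        "INSERT INTO blogentry_translation VALUES (?, ?, ?)",
        (entry_id, language, title),
    )
```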

Multilingual model with one object per language (4)

The model is language-aware but also has an indirect reference to the same object in other languages via a common key.

Example:

from django.db import models

class BlogEntry(models.Model):
    language = models.CharField(max_length=20, db_index=True)
    group_id = models.CharField(max_length=36) # e.g. UUID
    # All database objects that represent the same logical object in different languages
    # have the same group_id (possibly a UUID, may also be an integer).
    
    title = models.CharField(max_length=255)
    # other fields
    

Advantages:

  • flexible in respect to additional languages
  • querying stays easy, since basic model capabilities hold

Disadvantages:

  • multiple model objects for one conceptual object (in practice this does not really hurt)
  • duplication of non-translatable fields (i.e. fields that stay the same across translations)
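
The grouping idea behind (4) can be sketched in plain Python as follows. The helper translations_of and the sample objects are hypothetical; the sketch only shows that all rows sharing a group_id form one logical object.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class BlogEntry:
    """Plain-Python stand-in for the model above (no Django required)."""
    language: str
    title: str
    # A fresh UUID string (36 characters) unless an existing group is joined.
    group_id: str = field(default_factory=lambda: str(uuid.uuid4()))


def translations_of(entry, entries):
    """All rows that represent the same logical entry, keyed by language."""
    return {e.language: e for e in entries if e.group_id == entry.group_id}


english = BlogEntry("en", "Hello")
german = BlogEntry("de", "Hallo", group_id=english.group_id)  # same group
unrelated = BlogEntry("en", "Other post")                      # new group
ALL_ENTRIES = [english, german, unrelated]
```

Each row is still an ordinary object with its own primary key, which is why basic querying keeps working.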

API for Model Translation with respect to (4), a proposal for discussion:

class BlogEntry(TranslatableModel):
    title = models.CharField(max_length=255)
    author = models.ForeignKey(User)
    # To mark fields as translatable/untranslatable (proposed keyword arguments):
    #  * title = models.CharField(max_length=255, translatable=True)
    #  * author = models.ForeignKey(User, untranslatable=True)
    last_modified = models.DateTimeField(auto_now=True)

    # or introduce a field in a Meta class
    class Meta:
        untranslatable_fields = ('last_modified','author',)
        #  or translatable_fields = ('title',)
        
    def __unicode__(self):
        return _(u"Blog entry '%s'") % self.title # works of course

# Relations:
# A foreign key may conceptually be the following:
#  1. it either references an object in a specific language
#  2. or it references a conceptual object in all of its languages

# 1.
class Image(models.Model):
    blog_entry_1 = models.ForeignKey(BlogEntry)
    # this ForeignKey can reference the PK of the BlogEntry object, no problem
    blog_entry_2 = models.ForeignKey(BlogEntry)
    # this ForeignKey may reference language and group_id as a key, more complicated

# 2.
class Image(models.Model):
    blog_entry_1 = models.ForeignKey(BlogEntry)
    # this ForeignKey can reference the PK of a BlogEntry object and get all other languages via group_id
    blog_entry_2 = models.MultiLanguageForeignKey(BlogEntry)
    # this ForeignKey can reference the group_id of a BlogEntry object and get objects in all languages
    

blog_entry = BlogEntry(title="Some title", author=some_user)
blog_entry.save() # saves in current language as default, new group

# explicitly save for a language; this creates a new group with a new group_id
blog_entry.save(language="en")
# explicit language save, same group_id as other_blog_entry:
blog_entry.save(language="en",connect_to=other_blog_entry)

# Manager API that does implicit current language retrieval:
blog_entry = BlogEntry.objects.get(group_id=UUID)
# returns the blog entry in the current language,
# i.e. behind the scenes: BlogEntry.objects.get(group_id=UUID, language=CURRENT_LANGUAGE)

# Updates
blog_entry.author = some_other_user
blog_entry.title = "Some other title"
blog_entry.save()
# author is marked as untranslatable, therefore all instances must be updated
#   this can be a single query: UPDATE blog_blogentry SET author_id=<some_other_user_id> WHERE group_id=<UUID>
#   but needs #4102 to be fixed
# title is marked translatable, therefore only this instance is updated.
# these are two queries, but they may be reduced to one when changes to fields are tracked after instance creation
# and only one kind of field (translatable xor untranslatable) is changed

# Deletes
blog_entry.delete() # should only delete this language
blog_entry.delete(all_languages=True) # deletes the translated objects in all languages

# Fallbacks
# When a language is requested for which an object does not exist
try:
    blog_entry = BlogEntry.objects.get(group_id=UUID) # object does not exist in current language
except BlogEntry.DoesNotExistInLanguage as e: # a subclass of BlogEntry.DoesNotExist
    e.available_languages # list of available languages or may even contain the respective objects
    # List can be shown to user as an option to view object in other language

Things to note

  • any time 'current language' is used it means django.utils.translation.get_language() which uses thread locals
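
What "uses thread locals" means in practice can be shown with a simplified stand-in built on threading.local. The functions below only mimic the names activate and get_language; they are not Django's implementation, and DEFAULT_LANGUAGE is an assumed placeholder for the project default.

```python
import threading

_active = threading.local()
DEFAULT_LANGUAGE = "en"  # assumption: stands in for the project default


def activate(language):
    """Set the current language for the calling thread only."""
    _active.language = language


def get_language():
    """Return the calling thread's language, or the default if unset."""
    return getattr(_active, "language", DEFAULT_LANGUAGE)


def language_seen_in_new_thread():
    """Run get_language() in a fresh thread and return what it saw."""
    seen = []
    worker = threading.Thread(target=lambda: seen.append(get_language()))
    worker.start()
    worker.join()
    return seen[0]
```

A language activated in one thread is invisible to others, so any implicit "current language" lookup in a manager depends on the thread that executes the query.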

Important Tickets

Aspects and design Questions

  • setting default locale in Managers?
  • setting default locale in related objects?
  • locale in URL?
  • Mark model fields as internationalized through attribute or through meta class (first approach is bad for subclassing)
  • getting the current language should be easy, as should access to other languages
  • Models are often subclassed; do the subclasses lose flexibility?
  • is __unicode__(self) locale-aware?
    • yes, by the design of (4)
  • what if somebody needs to use internationalized objects together with something that adds complexity, like django-reversion? Is it still possible, and which limitations result from a solution?
    • must work anyway if they only used the documented API
  • Language Fallbacks
    • you always want/need a fallback (flexible fallback strategy? who decides how the fallback works?)
    • issues with mixing ltr and rtl languages if multiple languages are used on one page as a result of fallback
    • Solution to all of that: it is someone else's problem (the application developer's)
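
One possible shape for an application-defined fallback strategy, as a plain-Python sketch. The function name first_available and the default chain ("en",) are assumptions; the point is that the application developer supplies the chain.

```python
def first_available(translations, requested, fallback_chain=("en",)):
    """Return (language, value) for the requested language or a fallback.

    `translations` maps language codes to values. The language actually
    used is returned too, so the caller can mark up the text direction
    (ltr/rtl) of content that came from a fallback.
    """
    for language in (requested, *fallback_chain):
        if language in translations:
            return language, translations[language]
    raise LookupError(f"no translation for {requested!r} or its fallbacks")
```

For example, with titles = {"en": "Hello", "de": "Hallo"}, a request for "ru" yields the English title under the default chain and the German one when called with fallback_chain=("de", "en").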

Indirect Issues, related problems

  • Language in the URI, URL namespaces aware of language, reverse should be language-aware
  • UI Multiforms for translations
  • only languages officially supported by Django work or can be selected in settings.py (not sure if this describes it exactly, but there is a limitation)
  • language selection
    • in URL as .../<locale>/...
    • in subdomain
    • as query string ?lang=en
    • as cookie (generally bad)
    • language database table similar to ContentTypes database table?
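
The URL-based selection options listed above could be combined roughly like this. It is a sketch only: the SUPPORTED tuple, the precedence order (path, then subdomain, then query string) and the function name language_from_url are all assumptions.

```python
from urllib.parse import parse_qs, urlsplit

SUPPORTED = ("en", "de", "ru")  # assumption: languages enabled for the site


def language_from_url(url, default="en"):
    """Pick a language from the path prefix, subdomain or ?lang= parameter."""
    parts = urlsplit(url)

    # 1. Path prefix, e.g. /de/blog/1/
    first_segment = parts.path.strip("/").split("/")[0]
    if first_segment in SUPPORTED:
        return first_segment

    # 2. Subdomain, e.g. ru.example.com
    subdomain = (parts.hostname or "").split(".")[0]
    if subdomain in SUPPORTED:
        return subdomain

    # 3. Query string, e.g. ?lang=en
    lang = parse_qs(parts.query).get("lang", [None])[0]
    if lang in SUPPORTED:
        return lang

    return default
```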

Approach overview

A brief overview of existing approaches with implementing projects.

  • a property column holds all translations (in a map)
    • transdb
  • a table per property holds all languages of that property
  • a table for the non-internationalized properties and an extra table for the localized versions
    • django-multilingual
      • language is saved as an integer, which doesn't work for multi-site projects where language ids in settings can differ
    • django-multilingual-model
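
Why integer language ids are fragile across sites can be illustrated with a small sketch. It assumes, purely for illustration, that an id is the 1-based position of the language code in a per-site settings tuple; this is not taken from django-multilingual's code.

```python
# Hypothetical language settings of two sites sharing one database.
SITE_A_LANGUAGES = ("en", "de", "ru")
SITE_B_LANGUAGES = ("de", "en")


def language_id(code, languages):
    """1-based position of the code in the settings tuple (assumed scheme)."""
    return languages.index(code) + 1


def language_code(stored_id, languages):
    """Inverse lookup: which language a stored id means for a given site."""
    return languages[stored_id - 1]
```

A row saved by site A with the id for "de" is read back as "en" by site B, whereas storing the language code itself stays unambiguous.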

Other resources

bucket

others
