Opened 2 months ago

Last modified 7 days ago

#28636 new New feature

Allow customizing the fallback language from the locale middleware

Reported by: Denis Anuschewski
Owned by: nobody
Component: Internationalization
Version: master
Severity: Normal
Keywords: translation, internationalization, request
Cc:
Triage Stage: Accepted
Has patch: yes
Needs documentation: yes
Needs tests: yes
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description (last modified by Denis Anuschewski)

Problem: The user's discovered language preference is only returned when found in settings.LANGUAGES. That's more than sufficient for the normal translation routine, but there seems to be no way (at least no DRY way) to find out a user's preference REGARDLESS of settings.LANGUAGES.

Suggestion: It would be nice to have an easy way to find out a language preference from the request, even if it's not listed in your supported languages. One option is an optional flag, so that the preference from get_language_from_request is checked against LANG_INFO in the translation module rather than against settings.LANGUAGES. This would return any of the languages known to the translation module. You could use that e.g. for comparing with settings.LANGUAGE_CODE and add a fallback for specific cases.
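The difference between the two checks can be illustrated with a minimal, framework-free sketch. It is not Django's get_language_from_request; the helper names, the simplified Accept-Language parsing, and the ALL_KNOWN set (standing in for LANG_INFO) are assumptions made for illustration only.

```python
def parse_accept_language(header):
    """Return language tags from an Accept-Language header, best first."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        tag, _, params = item.strip().partition(";")
        tag = tag.strip().lower()
        if not tag:
            continue
        quality = 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        # Sort by descending quality, then by original position.
        weighted.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(weighted)]


def language_from_header(header, candidates, default):
    """Return the first requested language found in `candidates`, else `default`."""
    for tag in parse_accept_language(header):
        if tag == "*":
            continue
        base = tag.split("-")[0]
        for code in (tag, base):
            if code in candidates:
                return code
    return default


SUPPORTED = {"en", "de", "es"}        # stands in for settings.LANGUAGES
ALL_KNOWN = SUPPORTED | {"pl", "fr"}  # stands in for LANG_INFO (assumed subset)

# Checked against the supported languages, a Polish preference is lost ...
current = language_from_header("pl", SUPPORTED, "de-DE")
# ... checked against all known languages, it is preserved.
proposed = language_from_header("pl", ALL_KNOWN, "de-DE")
```

In this model `current` is the default language code while `proposed` is "pl", which is the distinction the suggested flag is meant to expose.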

Change History (27)

comment:1 Changed 2 months ago by Denis Anuschewski

In my specific case I wrote a subclass of LocaleMiddleware in order to use add_fallback whenever a language gets detected that is not equal to settings.LANGUAGE_CODE. But that does not seem possible with current Django, because you always get the value of settings.LANGUAGE_CODE when the discovered language is not in settings.LANGUAGES.

Overriding the function get_languages didn't work for me because translation was then activated for every language. I would like to have translations only for specific languages when I explicitly put them into settings.LANGUAGES, but also a fallback for every language except settings.LANGUAGE_CODE.

Last edited 8 weeks ago by Denis Anuschewski

comment:2 Changed 8 weeks ago by Denis Anuschewski

Component: Utilities → Internationalization
Description: modified (diff)
Keywords: internationalization added; locale removed

comment:3 Changed 8 weeks ago by Denis Anuschewski

Needs documentation: set

comment:4 Changed 8 weeks ago by Denis Anuschewski

Description: modified (diff)

comment:5 Changed 8 weeks ago by Denis Anuschewski

Has patch: set

comment:6 Changed 8 weeks ago by Claude Paroz

You could use that e.g. for comparing with settings.LANGUAGE_CODE and add a fallback for specific cases.

Could you please elaborate a bit more about the use case for that request?

comment:7 in reply to:  6 Changed 8 weeks ago by Denis Anuschewski

Replying to Claude Paroz:

You could use that e.g. for comparing with settings.LANGUAGE_CODE and add a fallback for specific cases.

Could you please elaborate a bit more about the use case for that request?

Yes of course.

Let's say we have following settings:

LANGUAGE_CODE = 'de-DE'
LANGUAGES = [
    ('en', 'English'),
    ('de', 'German'),
    ('es', 'Spanish'),
]

So German is the default language, and the three languages in settings.LANGUAGES are the ones we want translations for. Now imagine we want a custom middleware that looks something like this:

from django.conf import settings
from django.middleware.locale import LocaleMiddleware
from django.utils import translation


class CustomLocaleMiddleware(LocaleMiddleware):
    def process_request(self, request):
        super(CustomLocaleMiddleware, self).process_request(request)
        lan = request.LANGUAGE_CODE
        # support_all_languages is the flag proposed in this ticket, not an existing argument
        preference = translation.get_language_from_request(request, support_all_languages=True)
        if not settings.LANGUAGE_CODE.startswith(preference):  # Preference differs from LANGUAGE_CODE
            translation.trans_real.translation(lan).add_fallback(
                translation.trans_real.DjangoTranslation('en'))

So every language preference that gets identified by get_language_from_request and differs from settings.LANGUAGE_CODE gets English as fallback language, even if it's not in settings.LANGUAGES. When you get e.g. pl as preference you get de as language, but with en as fallback (that's what we want). With current Django you would also get de as language, but no fallback because preference would not be pl but de-DE. It would only work with "en", "de", and "es".
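The prefix comparison used in the middleware above can be checked in isolation. The following is only a small sketch of that one condition; the function name is hypothetical, and it mirrors the `startswith` test without any case normalization.

```python
def differs_from_default(language_code, preference):
    """True when `preference` is not a prefix of the site's LANGUAGE_CODE."""
    return not language_code.startswith(preference)


# With LANGUAGE_CODE = 'de-DE':
german_user = differs_from_default("de-DE", "de")        # no fallback added
polish_user = differs_from_default("de-DE", "pl")        # English fallback added
austrian_user = differs_from_default("de-DE", "de-AT")   # also counts as differing
```

Note that a plain prefix test is coarse: a regional variant such as de-AT is treated as differing from de-DE, which may or may not be desired.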

Last edited 8 weeks ago by Denis Anuschewski

comment:8 Changed 8 weeks ago by Claude Paroz

In which language are the original strings in your scenario?

comment:9 Changed 8 weeks ago by Denis Anuschewski

You have de-DE (German) in LANGUAGE_CODE and pl (Polish) in HTTP_ACCEPT_LANGUAGE header of the user's request.

Currently, Django's output of get_language_from_request is pl if it is in settings.LANGUAGES and de-DE if it is not.

With my modification you get pl when you explicitly set support_all_languages to True, regardless of settings.LANGUAGES.

Check my PR for more clarification in the code base.

comment:10 Changed 8 weeks ago by Claude Paroz

Yes, I understand the difference of output produced by your code. However, I would like to completely understand your use case to be sure we are fixing it at the right level.

Returning to your example, why would you like to get a language code which is not in your site LANGUAGES? Why would the fallback language be different depending on the requested language?
Maybe you could present a table whose columns are: language of original strings, client language, LANGUAGE_CODE, LANGUAGES, resulting language/fallback (one line with current code, one line with your wanted situation).

comment:11 Changed 8 weeks ago by Denis Anuschewski

First of all, thanks a lot for the time and effort you put into my problem, it is very much appreciated. I did not get what you meant by language of original strings, sorry. I guess you mean the de facto language of the strings that are marked for translation. It is German, just like the value of LANGUAGE_CODE (our project's working language).

So here is the table as requested:

language of original strings | client language | LANGUAGE_CODE | LANGUAGES  | resulting language | fallback
de                           | pl              | de            | en, de, es | de                 | None
de                           | pl              | de            | en, de, es | pl                 | en

First row is current Django, second is my wanted behaviour.

Returning to your example, why would you like to get a language code which is not in your site LANGUAGES?

Because I want to detect a user with Polish language (or any other foreign language) regardless of LANGUAGES. I don't want to place pl in LANGUAGES because I don't have and I don't want a Polish translation, but I want to KNOW that I am dealing with a user with Polish language. With current Django (1st row) I just don't have the chance because I get the value of LANGUAGE_CODE as resulting language, so a Polish user gets treated just like a German one.

Why would the fallback language be different depending on the requested language?

Because I don't want a fallback for German users. They shall only get German texts no matter what because that's the de facto standard language in the project. For any other case English shall be the desired fallback so that we get the following order for strings marked as translated:

1) Try to get a translation for the client language (if it is in LANGUAGES, so either for en, de or es)
2) If the client language differs from the one in LANGUAGE_CODE (de=German), fall back to English
3) If there is no English fallback translation, just use the language of original strings (German)
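The three steps above can be sketched with plain dictionaries. This is only a toy model of the desired lookup order, not Django's translation machinery; the catalogs, the constant names, and the `translate` helper are assumptions made for illustration.

```python
# Toy catalogs keyed by language; the msgids are the German original strings.
CATALOGS = {
    "en": {"Hallo": "Hello", "Danke": "Thank you"},
    "es": {"Hallo": "Hola"},
    "de": {},  # German needs no catalog because the originals are German
}
SOURCE_LANGUAGE = "de"    # language of original strings (LANGUAGE_CODE)
FALLBACK_LANGUAGE = "en"  # desired fallback for everyone except German users


def translate(msgid, client_language):
    # 1) Translation for the client language, if that language is supported.
    catalog = CATALOGS.get(client_language, {})
    if msgid in catalog:
        return catalog[msgid]
    # 2) English fallback, but only when the client language is not German.
    if client_language != SOURCE_LANGUAGE:
        fallback = CATALOGS[FALLBACK_LANGUAGE]
        if msgid in fallback:
            return fallback[msgid]
    # 3) Otherwise the original (German) string.
    return msgid
```

In this model a Spanish user sees "Hola" for "Hallo" and "Thank you" for "Danke", a Polish user sees English where it exists, and a German user always sees the original strings.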

So to be clear: I don't want to alter the existing translation routine. I just want to be able to get the right client language and afterwards add a fallback language depending on the client language's value.

comment:12 Changed 8 weeks ago by Claude Paroz

I also appreciate your patience while explaining your use case to me :-)

Just one more question, wouldn't setting LANGUAGE_CODE = 'en' solve your issue?

comment:13 in reply to:  12 Changed 8 weeks ago by Denis Anuschewski

No problem :-)

Just one more question, wouldn't setting LANGUAGE_CODE = 'en' solve your issue?

Actually that was the first thing I did. But no, it has one giant drawback: because my language of original strings is German, setting LANGUAGE_CODE to 'en' leads to the necessity of having to maintain redundant German translations (or I get English translations for German users). Additionally when changing the central LANGUAGE_CODE setting you have to also change a lot of other stuff, e.g. in order for unit tests to go through.

comment:14 Changed 8 weeks ago by Claude Paroz

OK, I would suggest a slightly different approach. Could you have a look at this commit and tell me your opinion?
https://github.com/django/django/compare/master...claudep:28636?expand=1

The idea is to allow for custom fallback by subclassing the LocaleMiddleware and setting the fallback_language class variable (which would be 'en' in your case).
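The class-variable idea can be illustrated without Django. The sketch below is not the code from the linked branch; the resolver classes and their `resolve` method are hypothetical stand-ins that only show how a subclass could override a `fallback_language` attribute.

```python
class BaseLocaleResolver:
    # Hypothetical stand-in for a middleware class.
    # None means "use the site default language".
    fallback_language = None

    def __init__(self, supported, default_language):
        self.supported = set(supported)
        self.default_language = default_language

    def resolve(self, requested):
        """Return the requested language if supported, else the fallback."""
        if requested in self.supported:
            return requested
        return self.fallback_language or self.default_language


class EnglishFallbackResolver(BaseLocaleResolver):
    # Overriding the class variable is the only customization needed.
    fallback_language = "en"


base = BaseLocaleResolver({"en", "de", "es"}, "de-DE")
custom = EnglishFallbackResolver({"en", "de", "es"}, "de-DE")
```

Here `base.resolve("pl")` yields the site default, while `custom.resolve("pl")` yields "en"; supported languages are returned unchanged by both.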

comment:15 in reply to:  14 Changed 7 weeks ago by Denis Anuschewski

Thx again. I tested your code and it really does exactly what I want!

You were right that I don't really need to know the request language if it's not in LANGUAGES, because the resulting language is always 'en' for these cases. So my custom middleware's process_request would look like this:

def process_request(self, request):
    self.fallback_language = 'en'
    super(CustomMiddleWare, self).process_request(request)
    lan = request.LANGUAGE_CODE
    if not settings.LANGUAGE_CODE.startswith(lan):
        translation.trans_real.translation(lan).add_fallback(
            translation.trans_real.DjangoTranslation(self.fallback_language))

Therefore I get either LANGUAGE_CODE, English or a language from LANGUAGES with English translation fallback. Bingo!

Your approach seems cleaner and more readable for me, plus it affects less code. But there is one little thing I am missing: I would like to see an additional paragraph in the docstring of get_language_from_request describing what fallback does. I would suggest something like this:

With a given fallback, this value will be returned instead of settings.LANGUAGE_CODE if no user language could be found.

Otherwise I am not missing anything. If you make a pull request with the little addition I mentioned, I am more than happy to declare it as a fix for this bug and vote for it.

comment:16 Changed 7 weeks ago by Claude Paroz

Needs tests: set
Summary: Translation module: Check `LANG_INFO` against user's language preference as optional feature → Allow customizing the fallback language from the locale middleware
Triage Stage: Unreviewed → Accepted
Version: 1.11 → master

Thanks for testing, I'll make a pull request soon.

However, are you sure you need that special process_request method? What about simply:

class CustomLocaleMiddleWare(LocaleMiddleware):
    fallback_language = 'en'

comment:17 Changed 7 weeks ago by Denis Anuschewski

Yes, I'm sure. Because what you implemented as a fallback differs from what add_fallback is providing:

Your fallback determines what translation language to use when none is found. add_fallback on the other hand gives a fallback on string basis when no translation for the string is found in a translation file. I need both for my desired result.

So with your code a Spanish user, for example, gets Spanish as the language, but missing Spanish translations get displayed in German. The goal is to have an English fallback in this case, which add_fallback provides, so that when your fallback does not come into play, the discovered user language gets an additional translation fallback.
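The two kinds of fallback can be told apart with the standard library's gettext module, whose NullTranslations class provides add_fallback for per-string fallback. The sketch below is an assumption-laden illustration: DictTranslations and select_language are hypothetical helpers, and the dictionaries stand in for compiled translation files.

```python
import gettext


class DictTranslations(gettext.NullTranslations):
    """Hypothetical catalog backed by a dict instead of a .mo file."""

    def __init__(self, messages):
        super().__init__()
        self._messages = messages

    def gettext(self, message):
        if message in self._messages:
            return self._messages[message]
        # NullTranslations.gettext consults the fallback added via
        # add_fallback(), or returns the message unchanged.
        return super().gettext(message)


def select_language(requested, supported, fallback_language):
    """Language-level fallback: which catalog to activate at all."""
    return requested if requested in supported else fallback_language


spanish = DictTranslations({"Hallo": "Hola"})
english = DictTranslations({"Hallo": "Hello", "Danke": "Thank you"})

# String-level fallback: used when the active catalog lacks a message.
spanish.add_fallback(english)

language = select_language("es", {"en", "de", "es"}, "en")
greeting = spanish.gettext("Hallo")
thanks = spanish.gettext("Danke")
untranslated = spanish.gettext("Tschüss")
```

For the Spanish user the language-level fallback never applies, yet the string-level fallback still supplies English for the missing message; a message missing everywhere stays in the original German.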

Last edited 7 weeks ago by Denis Anuschewski

comment:18 Changed 7 weeks ago by Claude Paroz

I'm still not completely satisfied by my proposal. It only partially solves the issue, as shown by the hack you are forced to do in the process_request middleware method.

I think we have a problem in Django i18n in that we generally assume untranslated strings are in English (some problems you mentioned in comment:13). We can see that for example in the en special-casing in DjangoTranslation._add_fallback. We may miss a LANGUAGE_SOURCE setting (even if we know that non-English original strings are problematic with gettext when the language has more than 2 plurals). I don't know if that would completely solve the current issue.

comment:19 in reply to:  18 Changed 7 weeks ago by Denis Anuschewski

There are 3 things about this that come to my mind:

1)

I think we have a problem in Django i18n in that we generally assume untranslated strings are in English (some problems you mentioned in comment:13). We can see that for example in the en special-casing in DjangoTranslation._add_fallback.

I agree that having an exception for English in add_fallback just because of how Django's English translation files are or might be maintained seems awkward. But I would say that it doesn't really address the core of my problem – that is adding a specific fallback language – so it would be out of the scope of this ticket (maybe add a new ticket?).

2)

We may miss a LANGUAGE_SOURCE setting (even if we know that non-English original strings are problematic with gettext when the language has more than 2 plurals).

I don't think this would work out because a LANGUAGE_SOURCE (assuming that the language of original strings is meant) can be totally different in your project and for Django's core translations. So in #24413 you would have Polish as LANGUAGE_SOURCE I assume but it still wouldn't work for this particular issue which was solved with the English exception patch in add_fallback, because the language of original strings in Django's core is always English. In my case it is German for my project so it would also be in conflict with Django's language of original strings (so it either works for your custom translations or for Django's built-in translations).

I would say if you explicitly use a specific language or add a specific fallback you just have to make sure that translations for your language are provided! After all you have the possibility to add missing translations to your own .po files. Or for Django's English core translation the solution could really be just to make sure all translations in the .po files are provided, even if it's hellishly redundant (because msgid and msgstr are always the same).

I'm still not completely satisfied by my proposal. It only partially solves the issue, as shown by the hack you are forced to do in the process_request middleware method.

I wouldn't necessarily call it a 'hack' because process_request is not plainly overwritten, it calls process_request of the super class and adds additional stuff so it's nearly as maintainable as setting a flag. A custom middleware has to be used either way.

But it might be that I have found an even better way to accomplish what we want. I will test it tomorrow and then make a commit.

comment:20 in reply to:  14 Changed 7 weeks ago by Denis Anuschewski

OK, here is what I ended up with: could you maybe take a look and tell me what you think?
https://github.com/django/django/compare/master...denisiko:feature/translation-fallback

With this approach you can set your custom fallback exactly like with your changes by setting the middleware's class variable, but with an additional fallback for missing translations.

Last edited 7 weeks ago by Denis Anuschewski

comment:21 Changed 7 weeks ago by Denis Anuschewski

I think I have an even better approach:
https://github.com/denisiko/django/commit/faeb8a4db34121a9cfe61b281849b084dfbe6625

Of course there is missing documentation and unit tests at this point. But first I would like to get some feedback. What do you think Claude Paroz?

comment:22 Changed 7 weeks ago by Denis Anuschewski

Summary: Allow customizing the fallback language from the locale middleware → Allow customizing the translation fallback language

comment:23 Changed 7 weeks ago by Claude Paroz

I think that this makes sense, please make it a pull request after tests/docs have been added.
Adding new settings is not something we are doing lightly, though, but I don't really see how we could avoid that for your use case. Maybe other reviewers will suggest ideas.

comment:24 Changed 7 weeks ago by Denis Anuschewski

Thanks for your review! I also think a new setting makes sense and fits nicely with the other language settings. And not having to implement a custom middleware is a big advantage and therefore justifies a new setting in my opinion.

I am going to write the tests and the doc now and let you know when the PR is ready.

comment:25 Changed 5 weeks ago by Denis Anuschewski

I have a pull request now: https://github.com/django/django/pull/9248

Unfortunately, I could not find out why some tests are failing (has something to do with failing requests returning 404's where something is expected). Maybe you have an idea Claude Paroz?

comment:26 Changed 5 weeks ago by Denis Anuschewski

Never mind, I fixed it myself :-)

The problem was my default of LANGUAGE_FALLBACK (I used LANGUAGE_CODE for that). In normal cases it works fine, but in unit tests it gives unintended results when you mock LANGUAGE_CODE. The language code then changes, but the fallback stays the same, giving you English values as result.

I solved this by setting LANGUAGE_FALLBACK to None by default, because when it gets used in get_fallback() it will be overwritten with LANGUAGE_CODE anyway if not set properly.
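The effect described here can be reproduced with a small model. This is not the code from the pull request: the SimpleNamespace objects stand in for a settings module, and this `get_fallback` is a hypothetical helper that resolves the fallback at call time.

```python
from types import SimpleNamespace
from unittest import mock

# Eager default: the fallback was copied from LANGUAGE_CODE once, up front.
eager = SimpleNamespace(LANGUAGE_CODE="en-us", LANGUAGE_FALLBACK="en-us")
# Lazy default: None means "look at LANGUAGE_CODE when the value is needed".
lazy = SimpleNamespace(LANGUAGE_CODE="en-us", LANGUAGE_FALLBACK=None)


def get_fallback(settings):
    """Resolve the fallback language at call time (hypothetical helper)."""
    return settings.LANGUAGE_FALLBACK or settings.LANGUAGE_CODE


# A test that overrides LANGUAGE_CODE only changes that one attribute.
with mock.patch.object(eager, "LANGUAGE_CODE", "de"):
    eager_result = get_fallback(eager)

with mock.patch.object(lazy, "LANGUAGE_CODE", "de"):
    lazy_result = get_fallback(lazy)
```

With the eager default the fallback stays at the stale English value while LANGUAGE_CODE is overridden, whereas the None default follows the override, which matches the behaviour reported for the failing tests.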

comment:27 Changed 7 days ago by Denis Anuschewski

Summary: Allow customizing the translation fallback language → Allow customizing the fallback language from the locale middleware