Opened 10 years ago
Closed 9 years ago
#24257 closed Bug (fixed)
The trans template tag fails to get a message when there is a % character in the string
Reported by: | Alan Boudreault | Owned by: | nobody |
---|---|---|---|
Component: | Internationalization | Version: | 1.7 |
Severity: | Normal | Keywords: | trans templatetag i18n |
Cc: | Triage Stage: | Accepted | |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
I am using a nested template with translation in my django template file. This is my basic html code:
<script id="invoice-date-modal-template" type="text/html">
  <div id="modal" class="reveal-modal tiny " data-reveal>
    <h5><%= client %> - <%= date %></h5>
    <h6>{% trans "Modify the invoice state of <%= client %> dated of <%= date %>?" %}</h6>
    <p>{% trans "Select the session invoice date and confirm." %}</p>
    <div class="row">...</div>
  </div>
</script>
I then ran makemessages && compilemessages.
In django.po, I see that the % characters have been doubled:
msgid ""
"Modify the invoice state of <%%= client %%> of <%%= date %%>?"
msgstr ""
"Changer l'état de facturation de <%%= client %%> du <%%= date %%>?"
In my web application, I can see that my first trans is not translated, but the second one is ok.
I've found a workaround for that: using blocktrans instead of trans. Doing it this way works in my application:
<h6>{% blocktrans %}Modify the invoice state of <%= client %> of <%= date %>?{% endblocktrans %}</h6>
Looks like something is different or missing with the trans template tag.
Change History (8)
comment:1 by , 10 years ago
Triage Stage: | Unreviewed → Accepted |
---|
comment:2 by , 10 years ago
The problem with the current solution of doubling percents in trans string content (to prevent msgfmt check errors, as explained in #11240) is that it's almost impossible to perform the reverse operation of replacing double percents with single ones, because the result is a lazy translation (and should be kept lazy), hence post-processing the string is not possible. blocktrans does that implicitly during the placeholder substitution operation (result % data).
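As a rough illustration of the point above, here is a minimal plain-Python sketch (not Django's actual code). It assumes the catalog entry uses doubled percents as in the report, and the variable names are made up for the example. Only the interpolation step undoes the doubling:

```python
# Catalog string with doubled percents, taken from the msgid in the report.
catalog_string = "Modify the invoice state of <%%= client %%> of <%%= date %%>?"

# blocktrans-style path: the string goes through "result % data",
# which collapses every "%%" into a single "%" (data is empty here).
blocktrans_output = catalog_string % {}

# trans-style path: no interpolation happens, so nothing undoes the doubling.
trans_output = catalog_string

print(blocktrans_output)
print(trans_output)
```

Without an interpolation step, the only way to get single percents back would be to post-process the string, which is what the lazy translation rules out.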
The only solution I can see currently is to stop doubling percents in trans strings, and document that in case of translation problems, a translator comment should be added before the string: # Translators: xgettext:no-python-format (to prevent msgfmt from complaining).
comment:3 by , 10 years ago
See that commit where the above solution is applied: https://github.com/claudep/django/compare/24257
The downside is that the same string translated with trans or with blocktrans does not appear identically in the po file:
{% trans "50% of something" %} => msgid "50% of something"
{% blocktrans %}50% of something{% endblocktrans %} => msgid "50%% of something"
comment:4 by , 10 years ago
This was the ticket I decided to tackle at the PyCon sprints (claudep, wish you were there so I could thank you in person for your patience on my previous pull... and this one too).
This issue turned out to be quite difficult. I've tried multiple approaches.
Long story short
The simplest solution is to ensure that all translation messages that contain a percent sign get python formatting.
Long story long
In an ideal world we would be able to use the template string (with its curly brace variables) as our msgid. Since that is not possible, template copy is coerced into a python string format / sprintf string.
This is a source of much pain and confusion, especially with percent signs. Why?
xgettext is awkward.
xgettext will identify all python-format strings, example:
echo 'gettext("%s");' | xgettext --language=python --output=- -
#, python-format
msgid "%s"
msgstr ""
This is all good. It gets awkward when you pass it an invalid str fmt, say:
echo 'gettext("%s costs 10%");' | xgettext --language=python --output=- -
msgid "%s costs 10%"
msgstr ""
This is awkward because the single % is seen as invalid python format and so the message id is not marked as we would expect.
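To make "invalid python format" concrete, the following small sketch feeds both strings to Python's % operator. The helper name try_format is hypothetical and exists only for this illustration:

```python
def try_format(fmt, args):
    """Return the formatted string, or the exception raised while formatting."""
    try:
        return fmt % args
    except (ValueError, TypeError) as exc:
        return exc

# A literal percent must be written as "%%" to be a valid format string.
valid = try_format("%s costs 10%%", ("tea",))

# A trailing single "%" is an incomplete conversion specifier.
invalid = try_format("%s costs 10%", ("tea",))

print(repr(valid))
print(repr(invalid))
```

The second string is the kind of message that xgettext leaves without the python-format flag, so msgfmt has nothing to check it against.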
str fmt / sprintf is awkward.
Since humans are bad parsers, when I look at the format % %s I see a percent followed by a string specifier. What this actually means is a percent sign specifier, with a whitespace conversion flag, followed by an s character.
Example:
>>> "% %s" % ()
'%s'
>>> "% 10%s" % ()
' %s'
These two bits of awkwardness have caused some confusion as past developers have tried to shoehorn the template language into gettext & str fmt.
Example:
{% blocktrans with a=1 %}Blocktrans extraction shouldn't double escape this: %%, a={{ a }}{% endblocktrans %}
Currently django creates msgid:
Blocktrans extraction shouldn't double escape this: %%, a=%(a)s
This is wrong, the correct msgid needs to be:
Blocktrans extraction shouldn't double escape this: %%%%, a=%(a)s
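A quick way to see the difference is to interpolate both msgids in plain Python, which is roughly what blocktrans does at render time (the variable names here are only for illustration):

```python
data = {"a": 1}

# msgid as currently extracted: the template's literal "%%" is kept as "%%".
current_msgid = "Blocktrans extraction shouldn't double escape this: %%, a=%(a)s"

# msgid with every literal percent escaped: "%%" in the template becomes "%%%%".
proposed_msgid = "Blocktrans extraction shouldn't double escape this: %%%%, a=%(a)s"

current_output = current_msgid % data    # one percent sign is lost
proposed_output = proposed_msgid % data  # both percent signs survive

print(current_output)
print(proposed_output)
```

Interpolation halves every run of escaped percents, so only the second msgid renders the two percent signs the template author wrote.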
There is a series of negative side-effects.
- The string "1 percent sign %, 2 percent signs %%" is extracted as "1 percent sign %%, 2 percent signs %%".
- A weird string like "{{item}} costs 10%, three percent signs %%%" will not get marked #, python-format. Gettext will not complain when compiling messages, and django will blow up because string interpolation chokes on %%% (seeing it as a percent sign followed by a broken identifier).
The solution
First get some robust tests that capture all corner cases of awkwardness. I've added a sample app to the i18n tests to make it easier to write tests that evaluate extraction and translation using the same gettext catalogs. I then refactored tests addressing extraction and translation when a % is involved.
Second, knock out any special handling of percents and ensure valid python-formatting on strings.
Technically not all template strings with a percent should be python-formatted. If the template was only {% trans "10%" %} this could go into the gettext catalog with msgid 10%, but such a thing is not possible with how blocktrans is rendered.
When using trans and blocktrans with the same copy, the two should always extract the same msgid and render identically.
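The idea can be sketched in a few lines of plain Python. This is only an illustration under simplifying assumptions, not the code from the patch: to_msgid and render are hypothetical helpers, and the regular expression handles only bare {{ name }} variables.

```python
import re

def to_msgid(template_text):
    """Hypothetical extraction step: escape literal percents, then map variables."""
    # Escape first, so the "%(" introduced for variables is not doubled.
    escaped = template_text.replace("%", "%%")
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", r"%(\1)s", escaped)

def render(msgstr, data=None):
    """Hypothetical render step: always interpolate, even with no variables."""
    return msgstr % (data or {})

# The same copy yields the same msgid whichever tag it came from.
plain = "50% of something"
print(to_msgid(plain))          # 50%% of something
print(render(to_msgid(plain)))  # 50% of something

# The awkward string from the list above round-trips as well.
weird = "{{item}} costs 10%, three percent signs %%%"
print(render(to_msgid(weird), {"item": "Tea"}))
```

Because every msgid containing a percent is then a valid format string, xgettext can flag it as python-format and msgfmt can check the translation against it.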
comment:5 by , 10 years ago
Has patch: | set |
---|
comment:7 by , 9 years ago
Patch needs improvement: | unset |
---|
Sigh... will we get this one right one day :-/ (see #11240). I was able to reproduce it.