Opened 9 years ago

Closed 9 years ago

#24257 closed Bug (fixed)

The trans template tag fails to get a message when there is a % character in the string

Reported by: Alan Boudreault
Owned by: nobody
Component: Internationalization
Version: 1.7
Severity: Normal
Keywords: trans templatetag i18n
Cc:
Triage Stage: Accepted
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description

I am using a nested template with translation in my django template file. This is my basic html code:

 <script id="invoice-date-modal-template" type="text/html">
        <div id="modal" class="reveal-modal tiny " data-reveal>
            <h5><%= client %> - <%= date %></h5>
            <h6>{% trans "Modify the invoice state of <%= client %> dated of <%= date %>?" %}</h6>
            <p>{% trans "Select the session invoice date and confirm." %}</p>
            <div class="row">...</div>
      </div>
 </script>

I then ran makemessages && compilemessages.

In django.po, I see that the % characters have been doubled:

msgid ""
"Modify the invoice state of <%%= client %%> of <%%= date %%>?"
msgstr ""
"Changer l'état de facturation de <%%= client %%> du <%%= date %%>?"

In my web application, I can see that my first trans is not translated, but the second one is ok.

I've found a workaround for that: using blocktrans instead of trans. Doing it this way works in my application:

<h6>{% blocktrans %}Modify the invoice state of <%= client %> of <%= date %>?{% endblocktrans %}</h6>

Looks like something is different or missing with the trans template tag.

Change History (8)

comment:1 by Claude Paroz, 9 years ago

Triage Stage: Unreviewed → Accepted

Sigh... will we get this one right one day :-/ (see #11240). I was able to reproduce it.

comment:2 by Claude Paroz, 9 years ago

The problem with the current solution of doubling percents in trans string content (to prevent msgfmt check errors, as explained in #11240) is that it's almost impossible to reverse the operation by replacing double percents with single ones, because the result is a lazy translation (and should be kept lazy), so post-processing the string is not possible. blocktrans does that implicitly during the placeholder substitution operation (result % data).

The only solution I can see currently is to stop doubling percents in trans strings, and document that in case of translation problems, a translator comment should be added before the string: # Translators: xgettext:no-python-format (to prevent msgfmt from complaining).
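To make the mismatch concrete, here is a minimal Python sketch of the two code paths described above. It is a simplified model, not Django's implementation: a plain dict stands in for the compiled gettext catalog, and the names lookup, trans_result and blocktrans_result are hypothetical.

```python
# Simplified model: a plain dict stands in for the compiled gettext catalog.
DOUBLED_MSGID = "Modify the invoice state of <%%= client %%> of <%%= date %%>?"
TRANSLATION = "Changer l'état de facturation de <%%= client %%> du <%%= date %%>?"
catalog = {DOUBLED_MSGID: TRANSLATION}


def lookup(msgid):
    """Return the translation, or the msgid itself when there is no entry."""
    return catalog.get(msgid, msgid)


# trans-style (assumed): the lookup uses the literal template text with
# single percents, which does not match the doubled msgid in the catalog.
template_text = "Modify the invoice state of <%= client %> of <%= date %>?"
trans_result = lookup(template_text)

# blocktrans-style (assumed): the lookup uses the doubled form, and the
# placeholder substitution (result % data) collapses each "%%" back to "%".
blocktrans_result = lookup(template_text.replace("%", "%%")) % {}

print(trans_result)       # falls back to the untranslated English text
print(blocktrans_result)  # French text with single percent signs
```

In this model the trans path never finds the entry, while the blocktrans path both finds it and undoes the doubling as a side effect of interpolation.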

comment:3 by Claude Paroz, 9 years ago

See that commit where the above solution is applied: https://github.com/claudep/django/compare/24257

The downside is that the same string translated with trans or with blocktrans does not appear identically in the po file:

{% trans "50% of something" %} => msgid "50% of something"
{% blocktrans %}50% of something{% endblocktrans %} => msgid "50%% of something"

comment:4 by Doug Beck, 9 years ago

This was the ticket I decided to tackle at the pycon sprints (claudep wish you were there so I could thank you in person for your patience on my previous pull... and this one too).

This issue turned out to be quite difficult. I've tried multiple approaches.

Long story short

The simplest solution is to ensure that all translation messages that contain a percent sign get python formatting.

Long story long

In an ideal world we would be able to use the template string (with its curly brace variables) as our msgid. Since that is not possible, template copy is coerced into a python string format / sprintf string.

This is a source of much pain and confusion, especially with percent signs. Why?

xgettext is awkward.

xgettext will identify all python-format strings, example:

echo 'gettext("%s");' | xgettext --language=python --output=- -

#, python-format
msgid "%s"
msgstr ""

This is all good. It gets awkward when you pass it an invalid str fmt, say:

echo 'gettext("%s costs 10%");' | xgettext --language=python --output=- -

msgid "%s costs 10%"
msgstr ""

This is awkward because the single % is seen as invalid python format and so the message id is not marked as we would expect.

str fmt / sprintf is awkward.

Since humans are bad parsers, when I look at the format % %s, I see a percent followed by a string specifier. What this actually means is a percent sign specifier, with a whitespace conversion flag, followed by an s character.

Example:

>>> "% %s" % ()
'%s'
>>> "% 10%s" % ()
'         %s'
>>>

Side effects

These two bits of awkwardness have caused some confusion as past developers have tried to shoehorn the template language into gettext & str fmt.

Example:

{% blocktrans with a=1 %}Blocktrans extraction shouldn't double escape this: %%, a={{ a }}{% endblocktrans %}

Currently django creates msgid:

Blocktrans extraction shouldn't double escape this: %%, a=%(a)s

This is wrong, the correct msgid needs to be:

Blocktrans extraction shouldn't double escape this: %%%%, a=%(a)s

There is a series of negative side-effects.

  1. The string 1 percent sign %, 2 percent signs %% is extracted as 1 percent sign %%, 2 percent signs %%
  2. A weird string like {{item}} costs 10%, three percent signs %%% will not get marked #, python-format. Gettext will not complain when compiling messages, and Django will blow up because string interpolation chokes on %%% (seeing it as a percent sign followed by a broken identifier).
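The second side effect can be reproduced with plain Python string interpolation. The sketch below assumes the msgid ends up with an odd run of percent signs, as described above; the helper try_interpolate and the sample data are hypothetical and only illustrate the failure mode.

```python
def try_interpolate(fmt, data):
    """Apply printf-style interpolation and report the outcome instead of raising."""
    try:
        return ("ok", fmt % data)
    except (ValueError, TypeError, KeyError) as exc:
        return ("error", type(exc).__name__)


# An even run of percent signs collapses pairwise: "%%%%" becomes "%%".
even_run = try_interpolate(
    "%(item)s costs 10%%, four percent signs %%%%", {"item": "Tea"}
)

# An odd run leaves a dangling "%" with no conversion character after it.
odd_run = try_interpolate(
    "%(item)s costs 10%%, three percent signs %%%", {"item": "Tea"}
)

print(even_run)  # ('ok', 'Tea costs 10%, four percent signs %%')
print(odd_run)   # ('error', 'ValueError')
```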

The solution

First get some robust tests that capture all corner cases of awkwardness. I've added a sample app to the i18n tests to make it easier to write tests that evaluate extraction and translation using the same gettext catalogs. I then refactored tests addressing extraction and translation when a % is involved.

Second, knock out any special handling of percents and ensure valid python-formatting on strings.

Technically not all template strings with a percent should be python-formatted. If the template was only {% trans "10%" %}, this could go into the gettext catalog with msgid 10%, but such a thing is not possible with how blocktrans is rendered.

When using trans and blocktrans with the same copy, the two should always extract the same msgid and render identically.
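A rough Python sketch of that invariant follows, assuming every percent sign in template text is doubled on extraction and restored on rendering. The helper names escape_percents and restore_percents are hypothetical and are not taken from the pull request.

```python
def escape_percents(template_text):
    """Double every percent sign so gettext does not read it as a format flag."""
    return template_text.replace("%", "%%")


def restore_percents(translated):
    """Collapse each doubled percent sign back to a single one."""
    return translated.replace("%%", "%")


copy = "1 percent sign %, 2 percent signs %%"

# Both tags would extract the same msgid under this scheme.
msgid = escape_percents(copy)

# trans-style rendering: explicit restore step after the catalog lookup.
trans_rendered = restore_percents(msgid)

# blocktrans-style rendering: interpolation restores the percents implicitly.
blocktrans_rendered = msgid % {}

print(msgid)                                  # 1 percent sign %%, 2 percent signs %%%%
print(trans_rendered == blocktrans_rendered)  # True
```

Doubling unconditionally keeps the escape step reversible, which is what lets both rendering paths arrive at the original copy.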

The pull: https://github.com/django/django/pull/4549


comment:5 by Tim Graham, 9 years ago

Has patch: set

comment:6 by Tim Graham, 9 years ago

Patch needs improvement: set

Left comments for improvement on the PR.

comment:7 by Tim Graham, 9 years ago

Patch needs improvement: unset

comment:8 by Tim Graham <timograham@…>, 9 years ago

Resolution: fixed
Status: new → closed

In b7508896:

Fixed #24257 -- Corrected i18n handling of percent signs.

Refactored tests to use a sample project.

Updated extraction:

  • Removed special handling of single percent signs.
  • When extracting messages from template text, doubled all percent signs so they are not interpreted by gettext as string format flags. All strings extracted by gettext, if containing a percent sign, will now be labeled "#, python-format".

Updated translation:

  • Used "%%" for "%" in template text before calling gettext.
  • Updated {% trans %} rendering to restore "%" from "%%".