Opened 8 years ago

Closed 3 years ago

Last modified 2 years ago

#11240 closed Bug (fixed)

Compilemessages fails if a % character is at certain places in the .po file

Reported by: Till Backhaus
Owned by: Marc Garcia
Component: Internationalization
Version: 1.3
Severity: Normal
Keywords:
Cc: kikko, harm.verhagen+django@…, claude@…
Triage Stage: Accepted
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: yes
Easy pickings: no
UI/UX: no

Description (last modified by Ramiro Morales)

Reproduce:
Create a template:

{% load i18n %}
{% trans "findme 10% " %}

run ./manage.py makemessages -a

find the string in the .po file and translate it like this:

#: templates/test.html:2
#, python-format
msgid "findme 10% of their"
msgstr "findemich 10% an "

run ./manage.py compilemessages

get this error message:

/project/locale/de/LC_MESSAGES/django.po:925: 'msgstr' is not a valid Python format string, unlike 'msgid'. Reason: In the directive number 1, the character 'a' is not a valid conversion specifier.
msgfmt: found 1 fatal errors

Attachments (2)

11240-1.diff (9.4 KB) - added by Ramiro Morales 5 years ago.
11240-test.diff (1.8 KB) - added by Claude Paroz 4 years ago.
Test showing different behaviour between trans and blocktrans

Change History (34)

comment:1 Changed 7 years ago by Ramiro Morales

Description: modified (diff)

(formatted description)

comment:2 Changed 7 years ago by Ramiro Morales

Yes, for the translatable literals extraction process templates are (internally) converted to python code and then fed as such to the gettext tools.

This means that Python string formatting rules must be followed when using the i18n template tags with string literals containing interpolated variables or characters with special meaning, like %. In this case it means your literal would need to be {% trans "findme 10%% " %}, as per http://www.gnu.org/software/gettext/manual/gettext.html#python_002dformat and http://www.python.org/doc/2.2.1/lib/typesseq-strings.html (linked from the first).

I don't know if this means we should add a note to that effect to the docs.

comment:3 Changed 7 years ago by Marc Garcia

Owner: changed from nobody to Marc Garcia

comment:4 Changed 7 years ago by Alex Gaynor

Component: changed from Internationalization to Documentation
Triage Stage: changed from Unreviewed to Accepted

comment:5 Changed 6 years ago by leanmeandonothingmachine

For me that outputs "findme 10%%" and not "findme 10%" so that doesn't really seem to be a solution.

comment:6 Changed 6 years ago by Ramiro Morales

Resolution: set to worksforme
Status: changed from new to closed

Note the reporter claims he extracted the translatable literal from a template but it is prefixed by a #, python-format flag in the PO file, and that's not consistent.

I've tested both kinds of literals (with and without a python-format flag), using translations similar to the ones reported, and didn't get any compilemessages error. The MO file was generated without error.

Closing this old ticket as worksforme instead of fixed because I don't see any change committed to the compilemessages management command related to this.

comment:7 Changed 6 years ago by benjaoming

Component: changed from Documentation to Translations
Easy pickings: unset
Resolution: worksforme deleted
Severity: set to Normal
Status: changed from closed to reopened
Type: set to Uncategorized
Version: changed from 1.0 to 1.3

@ramiro:

Note the reporter claims he extracted the translatable literal from a template but it is prefixed by a #, python-format flag in the PO file, and that's not consistent.

That's very consistent: It is makemessages that generated this flag, because it saw a '%' in the string!

because I don't see any change committed to the compilemessages management command related to this.

Sorry, but that's not enough. You're right that the error message goes away. Nonetheless, two percent characters are output when using {% trans "findme 10%%" %}: "findme 10%%"

Reopening. I have reproduced it on Django 1.3 and 1.2. As of now, the fix is to use blocktrans instead, in which case you write:

{% blocktrans %}findme 10%{% endblocktrans %}

...producing correct .po grammar:

#, python-format
msgid "findme 10%%"
msgstr "findme 10%%"

...and it will wrongly output an untranslated "findme 10%" -- it doesn't translate!

Last edited 5 years ago by Ramiro Morales (previous) (diff)

comment:9 Changed 6 years ago by Julien Phalip

Type: changed from Uncategorized to Bug

comment:10 Changed 6 years ago by kikko

Cc: kikko added

comment:11 in reply to:  7 Changed 6 years ago by Ramiro Morales

Resolution: set to needsinfo
Status: changed from reopened to closed

Replying to ramiro:

Yes, for the translatable literals extraction process templates are (internally) converted to python code and then fed as such to the gettext tools.

This means that Python string formatting rules must be followed when using the i18n template tags with string literals containing interpolated variables or characters with special meaning, like %. In this case it means your literal would need to be {% trans "findme 10%% " %}

I've verified this is behaving correctly with trunk as of now and with 1.3. Ignore the advice I gave back then. I suspect r14459 fixed this in Django, so it works transparently for the app developer without the need to use %% with the trans i18n template tag.

To test things I created a template like this:

{% load i18n %}
{% trans "a literal with a percent symbol at the end %" %}<br>
{% trans "a literal with a percent symbol at the end 10%" %}<br>
{% trans "a literal with a percent % symbol in the middle" %}<br>
{% trans "a literal with a percent 20% symbol in the middle" %}<br>

makemessages -l de created a .po file like this from it (note I've already added dummy translations):

#: t11240/templates/a.html:2
msgid "a literal with a percent symbol at the end %"
msgstr "translation to German of a literal with a percent symbol at the end %"

#: t11240/templates/a.html:3
msgid "a literal with a percent symbol at the end 10%"
msgstr ""
"translation to German of a literal with a percent symbol at the end 10%"

#: t11240/templates/a.html:4
#, python-format
msgid "a literal with a percent % symbol in the middle"
msgstr ""
"translation to German of a literal with a percent % symbol in the middle"

#: t11240/templates/a.html:5
#, python-format
msgid "a literal with a percent 20% symbol in the middle"
msgstr ""
"translation to German a literal with a percent 20% symbol in the middle"

and compilemessages creates a corresponding .mo file without problems.

Setting LANGUAGE="de" and using the template in a view correctly shows:

translation to German of a literal with a percent symbol at the end %
translation to German of a literal with a percent symbol at the end 10%
translation to German of a literal with a percent % symbol in the middle
translation to German a literal with a percent 20% symbol in the middle

I'm closing this ticket and setting the reason needsinfo. If you reopen it please provide exact details of what you are seeing.

Last edited 6 years ago by Ramiro Morales (previous) (diff)

comment:12 Changed 6 years ago by Ramiro Morales

Component: changed from Translations to Internationalization

comment:13 Changed 5 years ago by harm

Resolution: needsinfo deleted
Status: changed from closed to reopened
UI/UX: unset

Reopened.

I see this issue too with django 1.3

template

{% trans '% of test' %}

po file

msgid "% of test"
msgstr "% van testtrans"

python manage.py compilemessages gives the following error

locale/nl/LC_MESSAGES/django.po:950: 'msgstr' is not a valid Python format string, unlike 'msgid'. Reason: In the directive number 1, the character 'v' is not a valid conversion specifier.
msgfmt: found 1 fatal error

Escaping with a double %% in the template doesn't help (that renders as a double percent sign).

comment:14 Changed 5 years ago by harm

Cc: harm.verhagen+django@… added

comment:15 Changed 5 years ago by Ramiro Morales

I think I understand now what this ticket has always been about.

The issue is with the '% o' fragment of the examples in the msgid's provided by both users that experienced problems.

'% o' is a valid interpolation specification: it means an octal integer preceded by a space for positive values (or a '-' sign for negative ones); this is what the space conversion flag does.

I suspect this isn't the only case when an unintended formatting specifier can sneak in translatable literals involving percent characters.
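
The effect can be reproduced in plain Python. This is only a sketch: it uses Python's own % operator as a stand-in for the python-format check that msgfmt performs (a separate implementation of similar rules), and the argument value 8 is arbitrary.

```python
# "% o" parses as: space flag + octal conversion, so it consumes one argument.
assert "% o" % 8 == " 10"
assert "% o" % -8 == "-10"

# The msgid from the original report therefore counts as a format string.
msgid = "findme 10% of their"
formatted = msgid % 8
print(repr(formatted))

# "% v" (from the Dutch msgstr in comment:13) is rejected instead:
# 'v' is not a valid conversion character.
msgstr = "% van testtrans"
error = None
try:
    msgstr % 8
except ValueError as exc:
    error = str(exc)
print(error)
```

The msgid is accepted and the msgstr is not, which is the same mismatch the msgfmt error message describes.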

Last edited 5 years ago by Ramiro Morales (previous) (diff)

comment:16 Changed 5 years ago by Ramiro Morales

Has patch: set

The patch attached implements a fix for this in the 'makemessages' step by implementing escaping of '%' symbols in literals passed to the {% trans %} template tag (it replaces them with '%%').

This means that starting with this change simple Python string interpolation isn't supported in literals passed to 'trans' anymore.

This also means that the msgids extracted from such literals will now contain '%%', and that translators should use the same sequence in the respective msgstrs. With this:

  • Unfortunately GNU gettext's xgettext still marks the msgid/msgstr entry with the #, python-format flag, even when it contains no Python string formatting specification (e.g. "A string with two %%" still gets marked so). There is no way for us to avoid the addition of the flag or to remove it afterward, because the execution of xgettext is an opaque step.
  • Even though the entry is marked with the #, python-format flag, the additional checks that GNU gettext's msgfmt command performs because of that mark don't reject it with the error message reported in this ticket.
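
As a rough illustration of the escaping idea (not the code from the attached patch; escape_percent is a hypothetical helper name), doubling every '%' leaves nothing that can be read as a format specifier, and formatting with no arguments restores the original text:

```python
def escape_percent(literal: str) -> str:
    """Double every '%' so it cannot start a format specifier (hypothetical helper)."""
    return literal.replace("%", "%%")


original = "findme 10% of their"

# What would end up in the msgid after escaping.
escaped = escape_percent(original)
print(escaped)

# "%%" is the only directive left, and it takes no argument,
# so formatting with an empty tuple collapses it back to "%".
rendered = escaped % ()
print(rendered)
```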

Changed 5 years ago by Ramiro Morales

Attachment: 11240-1.diff added

comment:17 Changed 5 years ago by Jannis Leidel

LGTM

comment:18 Changed 5 years ago by Claude Paroz

Triage Stage: changed from Accepted to Ready for checkin

comment:19 Changed 5 years ago by Claude Paroz

Cc: claude@… added

comment:20 Changed 5 years ago by Ramiro Morales

Resolution: set to fixed
Status: changed from reopened to closed

In [17190]:

Fixed #11240 -- Made makemessages i18n command escape % symbols in literals passed to the trans tag.

This avoids problems with unintended automatic detection, marking and
validation of Python string formatting specifiers performed by
xgettext(1)/msgfmt(1) that stem from the fact that under the hood makemessages
converts templates to Python code before passing them to xgettext.

This also makes it consistent with its behavior on literals passed to the
blocktrans tag.

Thanks Jannis and claude for reviewing and feedback.

comment:21 Changed 4 years ago by anonymous

I still see this with 1.4.3? All msgstr's with % fail as original bug explains.

comment:22 Changed 4 years ago by anonymous

Sorry ignore that, I messed up with versions.

comment:23 Changed 4 years ago by defaultwombat

Resolution: fixed deleted
Status: changed from closed to new

I have issues with trans tags containing % symbols using Django 1.5.1.

The trans tag doesn't handle a % the same way makemessages does. A {% trans "value in %" %} still looks for a msgid "value in %", but makemessages created a msgid "value in %%" for it.
The trans tag should implement the same string conversion as makemessages does.

The workaround of escaping the % in the trans tag up front, {% trans "value in %%" %}, returns the right translation.
But as the translation will most likely have a "%%" in it, the rendered content will have a "%%" too.
A workaround could be to let the TranslateNode return value % ().
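
The mismatch and the suggested `value % ()` step can be sketched with a plain dictionary standing in for the compiled catalog. Everything here is hypothetical (the catalog contents, the German msgstr, and both helper names); it is not how TranslateNode is implemented.

```python
# Hypothetical catalog: makemessages stored the msgid with the % doubled.
catalog = {"value in %%": "Wert in %%"}


def lookup(msgid: str) -> str:
    """gettext-like lookup: fall back to the msgid itself when it is missing."""
    return catalog.get(msgid, msgid)


def lookup_escaped(literal: str) -> str:
    """Escape like makemessages before the lookup, then collapse %% afterwards."""
    translated = lookup(literal.replace("%", "%%"))
    return translated % ()


# The unescaped literal misses the catalog and comes back untranslated.
plain = lookup("value in %")

# Escaping first finds the entry; % () turns "%%" back into "%".
fixed = lookup_escaped("value in %")
print(plain, "|", fixed)
```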

Currently the only way to avoid issues with % symbols in translations is using blocktrans tags.

It might be worth monitoring http://savannah.gnu.org/bugs/?func=detailitem&item_id=30854 for changes on gettext handling python format strings.

Changed 4 years ago by Claude Paroz

Attachment: 11240-test.diff added

Test showing different behaviour between trans and blocktrans

comment:24 Changed 4 years ago by Claude Paroz

Unfortunately yes, I also think there is still an issue with trans and percents. Just attached the test case as it should be (and which currently fails).

comment:25 Changed 4 years ago by anonymous

You could just use a custom wrapper for pgettext (for example)

wrapper.py:


# Wrap pgettext so the doubled "%%" left by makemessages is collapsed to "%".
from django.utils.translation import pgettext as pgt


def pgettext(context, msg):
    return pgt(context, msg).replace("%%", "%")

comment:26 Changed 4 years ago by skyjur

Another workaround is to use alternative syntax for translation:

    {{ _('Percent sign here % is not escaped.') }}

I guess that gettext might be looking for Python-style string formatting in template files and is misled because the text string is followed directly by a percent sign (%}).

comment:27 Changed 4 years ago by Tim Graham

Patch needs improvement: set
Triage Stage: changed from Ready for checkin to Accepted

comment:28 Changed 3 years ago by Ramiro Morales

It would have been much better if you had opened a new ticket instead of reopening this one.

I suspect the issue has to do with the fact that the xgettext tool unconditionally adds a "#, python-format" flag (see comment:16) but we (by design) don't support Python string format specifiers on 'trans' tag literals.

comment:29 Changed 3 years ago by Bouke Haarsma

Trying to replicate the bug, I've found that #, python-format compiles just fine with invalid string format specifiers.

#, python-format
msgid "My project has 20% success"
msgstr "No such thing as 20% success"
$ xgettext -V
xgettext (GNU gettext-tools) 0.18.2

Why would this be the case? Is this expected behaviour? If so, replacing % by %% makes no sense and that changeset should be reverted.

comment:30 in reply to:  29 Changed 3 years ago by Ramiro Morales

Replying to bouke:

Trying to replicate the bug, I've found that #, python-format compiles just fine with invalid string format specifiers.

#, python-format
msgid "My project has 20% success"
msgstr "No such thing as 20% success"

These aren't invalid Python format specifiers. "% s" is a valid one. Can you re-test with a %-prefixed sequence that is actually invalid?
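
A quick way to see the difference, again using Python's % operator only as a stand-in for msgfmt's checks (the "20% yield" string and the argument "X" are made up for the comparison):

```python
# "% s" is a space flag followed by the 's' conversion, so both strings
# from comment:29 are valid format strings that consume one argument.
msgid = "My project has 20% success"
msgstr = "No such thing as 20% success"
formatted_msgid = msgid % "X"
formatted_msgstr = msgstr % "X"
print(formatted_msgid)
print(formatted_msgstr)

# A %-prefixed sequence that really is invalid: 'y' is not a conversion character.
invalid = "My project has 20% yield"
rejected = False
try:
    invalid % "X"
except ValueError:
    rejected = True
print(rejected)
```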

Last edited 3 years ago by Ramiro Morales (previous) (diff)

comment:31 Changed 3 years ago by Ramiro Morales

I'm re-closing this ticket because the original issue was fixed two years ago.

If you intend to follow up, please open a new one with a clean slate. Make sure you take into account comment:28 and comment:30 in your analysis.

comment:32 Changed 3 years ago by Ramiro Morales

Resolution: set to fixed
Status: changed from new to closed

comment:33 Changed 2 years ago by kingsley

Or you could replace instances of "%" with the HTML entity "&#37;"
Works for me
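
A small sketch of why this sidesteps the problem, assuming the output is HTML and the entity reaches the browser unchanged: the literal no longer contains a '%' for gettext to interpret, and decoding the entity (done here with html.unescape as a stand-in for the browser) gives the percent sign back.

```python
import html

# The translatable literal contains no '%' character at all.
literal = "findme 10&#37; of their"
has_percent = "%" in literal

# Decoding the character reference yields the intended text.
decoded = html.unescape(literal)
print(decoded)
```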
