Opened 5 years ago

Closed 5 months ago

Last modified 5 months ago

#11240 closed Bug (fixed)

Compilemessages fails if a % character is at certain places in the .po file

Reported by: tback
Owned by: garcia_marc
Component: Internationalization
Version: 1.3
Severity: Normal
Keywords:
Cc: kikko, harm.verhagen+django@…, claude@…
Triage Stage: Accepted
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: yes
Easy pickings: no
UI/UX: no

Description (last modified by ramiro)

Reproduce:
Create a template:

{% load i18n %}
{% trans "findme 10% " %}

run ./manage.py makemessages -a

find the string in the .po file and translate it like this:

#: templates/test.html:2
#, python-format
msgid "findme 10% of their"
msgstr "findemich 10% an "

run ./manage.py compilemessages

get this error message:

/project/locale/de/LC_MESSAGES/django.po:925: 'msgstr' is not a valid Python format string, unlike 'msgid'. Reason: In the directive number 1, the character 'a' is not a valid conversion specifier.
msgfmt: found 1 fatal errors

Attachments (2)

11240-1.diff (9.4 KB) - added by ramiro 2 years ago.
11240-test.diff (1.8 KB) - added by claudep 12 months ago.
Test showing different behaviour between trans and blocktrans


Change History (33)

comment:1 Changed 5 years ago by ramiro

  • Description modified (diff)
  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset

(formatted description)

comment:2 Changed 5 years ago by ramiro

Yes, for the translatable literals extraction process templates are (internally) converted to python code and then fed as such to the gettext tools.

This means that Python string formatting specifier rules should be followed when using the i18n template tags with string literals containing interpolated variables or special-meaning characters like %. In this case it means your literal would need to be {% trans "findme 10%% " %} as per http://www.gnu.org/software/gettext/manual/gettext.html#python_002dformat and http://www.python.org/doc/2.2.1/lib/typesseq-strings.html (linked from the first).

I don't know if this means we should add a note to the effect to the docs.
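To make the %% rule concrete, here is a minimal Python sketch (plain string formatting, no Django involved) of how a doubled percent sign behaves. It only collapses to a single % when the string actually goes through the % operator. The variable names are illustrative.

```python
# A literal written with the escaped form suggested above.
literal = "findme 10%% "

# Passed through Python's % formatting, "%%" collapses to a single "%".
formatted = literal % ()

# Used verbatim (no formatting step), the doubled sign is left untouched.
verbatim = literal

print(repr(formatted))
print(repr(verbatim))
```

Whether the escaped form renders correctly therefore depends on whether a formatting step is applied somewhere between the catalog lookup and the output.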

comment:3 Changed 5 years ago by garcia_marc

  • Owner changed from nobody to garcia_marc

comment:4 Changed 5 years ago by Alex

  • Component changed from Internationalization to Documentation
  • Triage Stage changed from Unreviewed to Accepted

comment:5 Changed 4 years ago by leanmeandonothingmachine

For me that outputs "findme 10%%" and not "findme 10%" so that doesn't really seem to be a solution.

comment:6 Changed 3 years ago by ramiro

  • Resolution set to worksforme
  • Status changed from new to closed

Note the reporter claims he extracted the translatable literal from a template but it is prefixed by a #, python-format flag in the PO file, and that's not consistent.

I've tested both kinds of literals (with and without a python-format flag), with translations similar to the ones reported, and didn't get any compilemessages error. The MO file was generated without error.

Closing this old ticket as worksforme instead of fixed because I don't see any change committed to the compilemessages management command related to this.

comment:7 follow-up: Changed 3 years ago by benjaoming

  • Component changed from Documentation to Translations
  • Easy pickings unset
  • Resolution worksforme deleted
  • Severity set to Normal
  • Status changed from closed to reopened
  • Type set to Uncategorized
  • Version changed from 1.0 to 1.3

@ramiro:

Note the reporter claims he extracted the translatable literal from a template but it is prefixed by a #, python-format flag in the PO file, and that's not consistent.

That's very consistent: It is makemessages that generated this flag, because it saw a '%' in the string!

because I don't see any change committed to the compilemessages management command related to this.

Sorry, but that's not enough. You're right that the error message will go away. Nonetheless, two percentage characters are outputted when using {% trans "findme 10%%" %}: "findme 10%%"

Reopening. I have reproduced it on Django 1.3 and 1.2. As of now, the fix is to use blocktrans instead, in which case you write:

{% blocktrans %}findme 10%{% endblocktrans %}

...producing correct .po grammar:

#, python-format
msgid "findme 10%%"
msgstr "findme 10%%"

...and it will wrongly output an untranslated "findme 10%" -- it doesn't translate!

Last edited 2 years ago by ramiro (previous) (diff)

comment:9 Changed 3 years ago by julien

  • Type changed from Uncategorized to Bug

comment:10 Changed 3 years ago by kikko

  • Cc kikko added

comment:11 in reply to: ↑ 7 Changed 3 years ago by ramiro

  • Resolution set to needsinfo
  • Status changed from reopened to closed

Replying to ramiro:

Yes, for the translatable literals extraction process templates are (internally) converted to python code and then fed as such to the gettext tools.

This means that Python string formatting specifier rules should be followed when using the i18n template tags with string literals containing interpolated variables or special-meaning characters like %. In this case it means your literal would need to be {% trans "findme 10%% " %}

I've verified this is behaving correctly with trunk as of now and with 1.3. Ignore that advice I gave back then. I suspect r14459 fixed this in django so it works transparently for the app developer without the need to use %% with the trans i18n template tag.

To test things I created a template like this:

{% load i18n %}
{% trans "a literal with a percent symbol at the end %" %}</br>
{% trans "a literal with a percent symbol at the end 10%" %}</br>
{% trans "a literal with a percent % symbol in the middle" %}</br>
{% trans "a literal with a percent 20% symbol in the middle" %}</br>

makemessages -l de created a .po file like this from it (note I've already added dummy translations):

#: t11240/templates/a.html:2
msgid "a literal with a percent symbol at the end %"
msgstr "translation to German of a literal with a percent symbol at the end %"

#: t11240/templates/a.html:3
msgid "a literal with a percent symbol at the end 10%"
msgstr ""
"translation to German of a literal with a percent symbol at the end 10%"

#: t11240/templates/a.html:4
#, python-format
msgid "a literal with a percent % symbol in the middle"
msgstr ""
"translation to German of a literal with a percent % symbol in the middle"

#: t11240/templates/a.html:5
#, python-format
msgid "a literal with a percent 20% symbol in the middle"
msgstr ""
"translation to German a literal with a percent 20% symbol in the middle"

and compilemessages creates a corresponding .mo file without problems.

Setting LANGUAGE="de" and using the template in a view correctly shows:

translation to German of a literal with a percent symbol at the end %
translation to German of a literal with a percent symbol at the end 10%
translation to German of a literal with a percent % symbol in the middle
translation to German a literal with a percent 20% symbol in the middle

I'm closing this ticket and setting the reason needsinfo. If you reopen it please provide exact details of what you are seeing.

Last edited 3 years ago by ramiro (previous) (diff)

comment:12 Changed 3 years ago by ramiro

  • Component changed from Translations to Internationalization

comment:13 Changed 2 years ago by harm

  • Resolution needsinfo deleted
  • Status changed from closed to reopened
  • UI/UX unset

Reopened.

I see this issue too with django 1.3

template

{% trans '% of test' %}

po file

msgid "% of test"
msgstr "% van testtrans"

python manage.py compilemessages gives the following error

locale/nl/LC_MESSAGES/django.po:950: 'msgstr' is not a valid Python format string, unlike 'msgid'. Reason: In the directive number 1, the character 'v' is not a valid conversion specifier.
msgfmt: found 1 fatal error

Escaping with a double %% in the template doesn't help (that renders as a double percentage sign).

comment:14 Changed 2 years ago by harm

  • Cc harm.verhagen+django@… added

comment:15 Changed 2 years ago by ramiro

I think I understand now what this ticket has always been about.

The issue is with the '% o' fragment of the examples in the msgid's provided by both users that experienced problems.

'% o' is a valid interpolation specification: it means an unsigned octal preceded by a space for positive values (or by the '-' sign for negative ones), which is what the space conversion flag does.

I suspect this isn't the only case when an unintended formatting specifier can sneak in translatable literals involving percent characters.
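As a rough illustration of that diagnosis, the sketch below scans the strings from this ticket for anything that looks like a Python conversion specification. The regex and the helper name find_specs are assumptions made for this example. The conversion set is the Python 2-era one, so 'a' is left out on purpose, in line with the msgfmt error quoted in the description. It is not the grammar xgettext or msgfmt actually use.

```python
import re

# Rough approximation of a Python 2-era "%" conversion specification:
# optional mapping key, flags, width, precision, length modifier, conversion.
# Assumption: illustration only, not the grammar xgettext/msgfmt implement.
SPEC_RE = re.compile(
    r"%(?:\([^)]*\))?[#0\- +]*(?:\*|\d+)?(?:\.(?:\*|\d+))?[hlL]?[diouxXeEfFgGcrs%]"
)


def find_specs(text):
    """Return every substring of text that looks like a conversion specification."""
    return [m.group(0) for m in SPEC_RE.finditer(text)]


print(find_specs("findme 10% of their"))  # msgid from the description
print(find_specs("% of test"))            # msgid from comment:13
print(find_specs("findemich 10% an "))    # msgstr from the description
print(find_specs("% van testtrans"))      # msgstr from comment:13
```

Both msgids yield the fragment '% o', while neither msgstr contains a valid specification. That is the mismatch the msgfmt error complains about.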

Last edited 2 years ago by ramiro (previous) (diff)

comment:16 Changed 2 years ago by ramiro

  • Has patch set

The patch attached implements a fix for this in the 'makemessages' step by implementing escaping of '%' symbols in literals passed to the {% trans %} template tag (it replaces them with '%%').

This means that starting with this change simple Python string interpolation isn't supported in literals passed to 'trans' anymore.

This also means that the 'msgid''s extracted from such literals will have now '%%' and that translators should also use the same sequence in the respective 'msgstr''s. With this:

  • Unfortunately GNU gettext's xgettext still marks the msgid/msgstr entry with the #, python-format flag. Even when it contains no Python string formatting specification (e.g. "A string with two %%" still gets marked so). There is no way we can avoid addition of the flag or to remove it afterward because the execution of xgettext is an opaque step.
  • Even when the entry is marked with the #, python-format flag, the additional checks that GNU gettext's msgfmt command performs because of that mark don't reject it with the error message reported in this ticket.
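The escaping step described in this comment could look roughly like the following. This is a sketch, not the attached patch: the helper name escape_lone_percent and the choice to leave an existing %% pair alone are assumptions made for illustration.

```python
import re

# Matches a "%" that is not already part of a "%%" pair.
# Hypothetical name; the attached patch may do this differently.
LONE_PERCENT_RE = re.compile(r"(?<!%)%(?!%)")


def escape_lone_percent(literal):
    """Double every standalone percent sign in a trans literal."""
    return LONE_PERCENT_RE.sub("%%", literal)


print(escape_lone_percent("findme 10% of their"))
print(escape_lone_percent("A string with two %%"))
```

After escaping, the extracted msgid no longer contains a sequence such as '% o' that the gettext tools could read as a format directive.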

Changed 2 years ago by ramiro

comment:17 Changed 2 years ago by jezdez

LGTM

comment:18 Changed 2 years ago by claudep

  • Triage Stage changed from Accepted to Ready for checkin

comment:19 Changed 2 years ago by claudep

  • Cc claude@… added

comment:20 Changed 2 years ago by ramiro

  • Resolution set to fixed
  • Status changed from reopened to closed

In [17190]:

Fixed #11240 -- Made makemessages i18n command escape % symbols in literals passed to the trans tag.

This avoids problems with unintended automatic detection, marking and
validation of Python string formatting specifiers performed by
xgettext(1)/msgfmt(1) that stem from the fact that under the hood makemessages
converts templates to Python code before passing them to xgettext.

This also makes it consistent with its behavior on literals passed to the
blocktrans tag.

Thanks Jannis and claude for reviewing and feedback.

comment:21 Changed 15 months ago by anonymous

I still see this with 1.4.3? All msgstr's with % fail as original bug explains.

comment:22 Changed 15 months ago by anonymous

Sorry ignore that, I messed up with versions.

comment:23 Changed 12 months ago by defaultwombat

  • Resolution fixed deleted
  • Status changed from closed to new

I have issues with trans tags containing % Symbols using django 1.5.1

The trans tag doesn't handle a % the same way makemessages does. A {% trans "value in %" %} still looks for a msgid "value in %", but makemessages created a msgid "value in %%" for it.
The trans tag should implement the same string conversion as makemessages does.

The workaround of escaping the % in the trans tag, {% trans "value in %%" %}, returns the right translation.
But as the translation will most likely have a "%%" in it, the rendered content will have a "%%" too.
A workaround could be to let the TranslateNode return value % ().

Currently the only way to avoid issues with % Symbols in translations is using blocktrans tags.

It might be worth monitoring http://savannah.gnu.org/bugs/?func=detailitem&item_id=30854 for changes on gettext handling python format strings.
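The mismatch and the proposed workaround can be mimicked without Django. In this sketch a plain dict stands in for the compiled catalog, the German string is an invented placeholder, and lookup is a hypothetical helper, not the TranslateNode code.

```python
# Toy stand-in for a compiled catalog; keys are msgids as makemessages wrote them.
# The German string is an invented placeholder translation.
catalog = {"value in %%": "Wert in %%"}


def lookup(msgid):
    """Return the translation, or the msgid itself when there is no entry."""
    return catalog.get(msgid, msgid)


# The tag looks up the literal exactly as written in the template: no entry found.
missed = lookup("value in %")

# Escaping in the template finds the entry, but the doubled sign is rendered as-is.
doubled = lookup("value in %%")

# The suggested workaround: run the result through % formatting with no arguments.
collapsed = lookup("value in %%") % ()

print(missed, doubled, collapsed, sep="\n")
```

Note that the % () step only works if the translated string contains no stray single % sign, so it relies on translators keeping the doubled form.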

Changed 12 months ago by claudep

Test showing different behaviour between trans and blocktrans

comment:24 Changed 12 months ago by claudep

Unfortunately yes, I also think there is still an issue with trans and percents. Just attached the test case as it should be (and which currently fails).

comment:25 Changed 12 months ago by anonymous

You could just use a custom wrapper for pgettext (for example)

wrapper.py:

from django.utils.translation import pgettext as pgt

def pgettext(context, msg):
    # Translate, then collapse the escaped "%%" back to a single "%".
    return pgt(context, msg).replace("%%", "%")

comment:26 Changed 12 months ago by skyjur

Another workaround is to use alternative syntax for translation:

    {{ _('Percent sign here % is not escaped.') }}

I guess that gettext might be looking for Python-style string formatting in template files and is misled because the text string is followed directly by the percent sign in %}.

comment:27 Changed 11 months ago by timo

  • Patch needs improvement set
  • Triage Stage changed from Ready for checkin to Accepted

comment:28 Changed 10 months ago by ramiro

It would have been much better if you had opened a new ticket instead of reopening this one.

I suspect the issue has to do with the fact that the xgettext tool unconditionally adds a "#, python-format" flag (see comment:16) but we (by design) don't support Python string format specifiers on 'trans' tag literals.

comment:29 follow-up: Changed 6 months ago by bouke

Trying to replicate the bug, I've found that #, python-format compiles just fine with invalid string format specifiers.

#, python-format
msgid "My project has 20% success"
msgstr "No such thing as 20% success"
$ xgettext -V
xgettext (GNU gettext-tools) 0.18.2

Why would this be the case? Is this expected behaviour? If so, replacing % by %% makes no sense and that changeset should be reverted.

comment:30 in reply to: ↑ 29 Changed 5 months ago by ramiro

Replying to bouke:

Trying to replicate the bug, I've found that #, python-format compiles just fine with invalid string format specifiers.

#, python-format
msgid "My project has 20% success"
msgstr "No such thing as 20% success"

These aren't invalid Python format specifiers. "% s" is a valid one. Can you re-test with a %-prefixed sequence that is actually invalid?
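A quick way to see this in plain Python, using the two strings quoted above. The argument "X" is an arbitrary placeholder chosen for this example.

```python
# msgid and msgstr from the example being discussed.
msgid = "My project has 20% success"
msgstr = "No such thing as 20% success"

# "% s" parses as: "%", the space flag, then the "s" conversion,
# so each string consumes one argument and swallows the " s".
rendered_id = msgid % "X"
rendered_str = msgstr % "X"

print(rendered_id)
print(rendered_str)
```

Both strings format without an error, which is consistent with msgfmt accepting the pair.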

Last edited 5 months ago by ramiro (previous) (diff)

comment:31 Changed 5 months ago by ramiro

I'm re-closing this ticket because the original issue was fixed two years ago.

If you intend to follow up, please open a new one with a clean slate. Make sure you take into account comment:28 and comment:30 in your analysis.

comment:32 Changed 5 months ago by ramiro

  • Resolution set to fixed
  • Status changed from new to closed
