Opened 5 years ago

Closed 2 years ago

Last modified 5 months ago

#10852 closed Uncategorized (wontfix)

Add no-fuzzy-matching option to makemessages

Reported by: graham.carlyle@…
Owned by: nobody
Component: Internationalization
Version: master
Severity: Normal
Keywords:
Cc: chris@…
Triage Stage: Unreviewed
Has patch: no
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description

I'd like an option added to makemessages to invoke msgmerge without fuzzy matching. This is because sometimes I'd prefer that a translator not be given a "fuzzy" default, and because it would make the non-translated text stick out more in the web app (by being untranslated).

msgmerge with the "-N" option switches off fuzzy matching.
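
As a rough illustration of the request (not Django's actual implementation), the msgmerge call could take an optional flag along these lines. The function names and the `no_fuzzy` parameter are hypothetical; only the `-q` and `-N` switches come from msgmerge itself.

```python
import subprocess


def build_msgmerge_command(pofile, potfile, no_fuzzy=False):
    """Return the msgmerge argument list for merging potfile into pofile.

    Hypothetical helper: `no_fuzzy` is an assumed option name.
    """
    cmd = ["msgmerge", "-q"]
    if no_fuzzy:
        # -N is the short form of --no-fuzzy-matching.
        cmd.append("-N")
    cmd += [pofile, potfile]
    return cmd


def merge_catalog(pofile, potfile, no_fuzzy=False):
    """Run msgmerge and return the merged catalog text.

    Requires GNU gettext on the PATH; raises CalledProcessError on failure.
    """
    result = subprocess.run(
        build_msgmerge_command(pofile, potfile, no_fuzzy),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Passing an argument list instead of a formatted shell string also avoids quoting problems with unusual file names.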

Attachments (0)

Change History (11)

comment:1 Changed 5 years ago by ramiro

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset

AFAIK literals marked as 'fuzzy' aren't regarded by msgfmt(1) as translated, so the original untranslated msgid gets used. Your second reason (making the non-translated text stick out in the web app by leaving it untranslated) is therefore already covered.

Could you give us a bit more detail about the rationale behind the first reason ("because sometimes I'd prefer a translator not to have a fuzzy default")?

comment:2 follow-up: Changed 5 years ago by graham.carlyle@…

msgfmt? I don't know what you're referring to there. Sorry, I'm new to dealing with translations via po files, so I might well be misunderstanding things.

When I said "I'd prefer a translator not to have a "fuzzy" default", I meant that when running makemessages to generate the po file, the msgstr sometimes seems to be speculatively filled in using existing translations.

For example, say I have this in a German translation po file...

#: templates/randa/map.html:56
msgid "Country info"
msgstr "Länderinfo"

then I add some new text in a template

  {% trans 'Country' %}

and re-run the makemessages script, I get...

#: templates/randa/map.html:91
#, fuzzy
msgid "Country"
msgstr "Länderinfo"

which worried me as it seemed a bit speculative to provide a default for the translator.

However, it seems I was mistaken in thinking that the web app would show this; presumably the "#, fuzzy" line stops that happening (maybe that's what you are referring to by msgfmt?).

But if I hack Django:

Index: django/core/management/commands/makemessages.py
===================================================================
--- django/core/management/commands/makemessages.py	(revision 10575)
+++ django/core/management/commands/makemessages.py	(working copy)
@@ -185,7 +185,7 @@
                 raise CommandError("errors happened while running msguniq\n%s" % errors)
             open(potfile, 'w').write(msgs)
             if os.path.exists(pofile):
-                (stdin, stdout, stderr) = os.popen3('msgmerge -q "%s" "%s"' % (pofile, potfile), 't')
+                (stdin, stdout, stderr) = os.popen3('msgmerge -N -q "%s" "%s"' % (pofile, potfile), 't')
                 msgs = stdout.read()
                 errors = stderr.read()
                 if errors:

then it generates the po...

#: templates/randa/map.html:91
msgid "Country"
msgstr ""

Maybe I worry too much and the "#, fuzzy" stuff is clear to a translator :)

comment:3 in reply to: ↑ 2 ; follow-up: Changed 5 years ago by ramiro

Replying to graham.carlyle@maplecroft.com:

However, it seems I was mistaken in thinking that the web app would show this; presumably the "#, fuzzy" line stops that happening (maybe that's what you are referring to by msgfmt?).

Exactly. The entries marked as fuzzy by msgmerge are generated by a simple lexicographic comparison, and the probability of them not being totally accurate is high, so a) they require intervention from the translator (to review/correct them and remove the fuzzy flag) and b) they aren't used in the final translation.

msgfmt is the utility from the GNU gettext suite that compiles .po files to .mo files (the ones that finally get used by the i18n machinery), and it is executed by the compilemessages Django management command. See http://www.gnu.org/software/gettext/manual/gettext.html#msgfmt-Invocation (particularly the --use-fuzzy command line switch, which isn't used by compilemessages).
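
To check which entries would be skipped this way, a small script can list the msgids that carry the fuzzy flag. This is only a minimal sketch that assumes single-line msgids; `fuzzy_msgids` and `SAMPLE` are illustrative names, not part of Django or gettext.

```python
SAMPLE = '''#: templates/randa/map.html:56
msgid "Country info"
msgstr "Länderinfo"

#: templates/randa/map.html:91
#, fuzzy
msgid "Country"
msgstr "Länderinfo"
'''


def fuzzy_msgids(po_text):
    """Return the msgids of entries flagged '#, fuzzy'.

    Simplification: only the first line of each msgid is read, so
    multi-line msgids (and the header entry) show up as "".
    """
    fuzzy = []
    flagged = False
    for line in po_text.splitlines():
        line = line.strip()
        if line.startswith("#,"):
            # A flag comment looks like "#, fuzzy" or "#, fuzzy, python-format".
            flags = [f.strip() for f in line[2:].split(",")]
            flagged = "fuzzy" in flags
        elif line.startswith("msgid "):
            if flagged:
                fuzzy.append(line[len("msgid "):].strip('"'))
            flagged = False
    return fuzzy


print(fuzzy_msgids(SAMPLE))
```

Run against the example from comment 2, only "Country" is reported; "Country info" has no fuzzy flag and would be compiled as usual.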

comment:4 follow-up: Changed 5 years ago by mtredinnick

  • Resolution set to wontfix
  • Status changed from new to closed

Okay, this is a non-issue. As Ramiro points out, the fuzzy annotation prevents the string from being used as a translation. However, it should be kept in the PO file because it saves translators work, particularly in the update phase, as they can see what a likely match is going to be. Particularly in crowd-sourced translations, where multiple people are going to be working on the same file, that's a huge win. In single-sourced cases, it's never going to be a hindrance, either, once people understand what fuzzy means.

So, no, we're not going to provide an option not to include those. They're normal parts of GNU PO files and experienced translators expect them, are used to working with them and gain benefit from them. Less experienced translators become more experienced as time goes by.

comment:5 in reply to: ↑ 4 ; follow-up: Changed 2 years ago by EmilStenstrom

  • Easy pickings unset
  • Severity set to Normal
  • Type set to Uncategorized
  • UI/UX unset

Replying to mtredinnick:

"However, it should be kept in the PO file because it saves translators work, particularly in the update phase, as they can see what a likely match is going to be"

I've seen many instances where translators are confused by the "almost correct" versions of strings, and instead miss them entirely. We've had greater success when manually going through the translation files and removing all the fuzzy strings entirely. For this reason, I'm strongly in favour of a flag for disabling fuzzy strings.

comment:6 in reply to: ↑ 5 Changed 2 years ago by justinkhill@…

  • Resolution wontfix deleted
  • Status changed from closed to reopened

Replying to EmilStenstrom:

Replying to mtredinnick:

"However, it should be kept in the PO file because it saves translators work, particularly in the update phase, as they can see what a likely match is going to be"

I've seen many instances where translators are confused by the "almost correct" versions of strings, and instead miss them entirely. We've had greater success when manually going through the translation files and removing all the fuzzy strings entirely. For this reason, I'm strongly in favour of a flag for disabling fuzzy strings.

I second that. Our translator actually asked me to disable fuzzy, saying "django probably has a way to turn that off". He didn't elaborate, but said fuzzy has caused havoc for them in the past on other projects. Since this isn't an option, I'll be removing the fuzzy translations manually, before sending them off.
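
The manual cleanup described above could be scripted roughly as follows. This is a minimal sketch, assuming entries are separated by blank lines and have single-line msgstr values; `clear_fuzzy` and `SAMPLE` are hypothetical names, not a Django or gettext API. Entries with an empty first msgid line (the header and multi-line msgids) are deliberately left untouched.

```python
SAMPLE = '''#: templates/randa/map.html:56
msgid "Country info"
msgstr "Länderinfo"

#: templates/randa/map.html:91
#, fuzzy
msgid "Country"
msgstr "Länderinfo"
'''


def _flags(line):
    """Split a '#, a, b' flag comment into a list of flag names."""
    return [f.strip() for f in line[2:].split(",") if f.strip()]


def clear_fuzzy(po_text):
    """Drop the fuzzy flag and blank the msgstr of every fuzzy entry."""
    out = []
    for entry in po_text.strip().split("\n\n"):
        lines = entry.splitlines()
        is_fuzzy = any(
            l.startswith("#,") and "fuzzy" in _flags(l) for l in lines
        )
        if is_fuzzy and 'msgid ""' not in lines:
            cleaned = []
            for l in lines:
                if l.startswith("#,"):
                    # Keep any other flags, e.g. python-format.
                    rest = [f for f in _flags(l) if f != "fuzzy"]
                    if rest:
                        cleaned.append("#, " + ", ".join(rest))
                elif l.startswith("#|"):
                    continue  # drop previous-msgid hints
                elif l.startswith("msgstr "):
                    cleaned.append('msgstr ""')
                else:
                    cleaned.append(l)
            lines = cleaned
        out.append("\n".join(lines))
    return "\n\n".join(out) + "\n"


print(clear_fuzzy(SAMPLE))
```

On the example from comment 2, the "Country" entry comes out with an empty msgstr and no fuzzy flag, which is what the msgmerge -N run produced there.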

comment:7 Changed 2 years ago by claudep

  • Resolution set to wontfix
  • Status changed from reopened to closed

Please do not reopen a ticket closed by a core committer unless you have a good new argument (and "fuzzy has caused havoc in the past" is certainly not one) or have first discussed it on the django-dev mailing list.

comment:8 in reply to: ↑ 3 Changed 20 months ago by anonymous

Just wanted to chime in with a "me too".

Replying to ramiro:

Exactly. The entries marked as fuzzy by msgmerge are generated by a simple lexicographic comparison, and the probability of them not being totally accurate is high

At least in my case, this is just not true. It's filling things in that are similar but not the same, and it shouldn't be. I'd rather they be empty than speculatively filled in.

There are already many options available for makemessages. I really don't see the harm in adding another option.

For now, I'm patching Django locally, which I really hate to do.

comment:9 Changed 15 months ago by ramiro

  • Resolution changed from wontfix to duplicate

Duplicate of #18714.

comment:10 Changed 15 months ago by ramiro

  • Resolution changed from duplicate to wontfix

Restoring ticket status. Sorry for the noise.

comment:11 Changed 5 months ago by acdha

  • Cc chris@… added
