Opened 10 years ago

Closed 10 years ago

Last modified 7 years ago

#21415 closed Bug (fixed)

Unicode escapes appear verbatim in translated naturaltime strings

Reported by: 676c7473@…
Owned by: Claude Paroz
Component: Translations
Version: 1.6
Severity: Release blocker
Keywords: i18n l10n translation
Cc:
Triage Stage: Accepted
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description

In Django 1.6, the Unicode escape \u00a0 "non-breaking space" that was introduced in the translations is doubly-escaped and now appears verbatim in the output of django.contrib.humanize.naturaltime.

What used to be

vor 6 Sekunden

that is the German ("de") translation for "6 seconds ago", now appears as

vor 6\u00a0Sekunden

in templates that use naturaltime.

This affects all django/contrib/humanize/locale/<language>/LC_MESSAGES/django.po files. I am not sure whether this should be treated as a translation bug and, if so, how it should be fixed.

Change History (24)

comment:1 by Claude Paroz, 10 years ago

Severity: Normal → Release blocker
Triage Stage: Unreviewed → Accepted

I could reproduce it: ungettext doesn't resolve the double backslash (\\u00a0) it receives from the translation catalog. Crap!
Original ticket: #20246.
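The difference between the escape and its doubly-escaped form can be illustrated in plain Python. This is only a sketch of the symptom, not Django's actual lookup path; the variable names are made up for the example.

```python
# The escape as Python interprets it: a single character, U+00A0.
real_nbsp = "vor 6\u00a0Sekunden"

# What the catalog hands back when the backslash itself was escaped in the
# .po file: six literal characters, a backslash followed by "u00a0".
double_escaped = "vor 6\\u00a0Sekunden"

print(len(real_nbsp))              # 14
print(len(double_escaped))         # 19
print("\u00a0" in double_escaped)  # False
print(double_escaped)              # vor 6\u00a0Sekunden
```

The last line is exactly the text that ends up in rendered templates.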

comment:2 by Claude Paroz, 10 years ago

If \xa0 doesn't work with makemessages/xgettext and \u00a0 doesn't work when retrieving the string, I think the only remaining solution is to include a literal non-breaking space, tell translators in the comment to preserve it, and hope they do. Even if they replace it with a regular space, it's not a big deal anyway.
Other proposals?

comment:3 by 676c7473@…, 10 years ago

Yes, that works. No better ideas here. Then maybe the comment could be more explicit?

#. Translators: There should be a non-breaking space (U+00A0)
#. between number and time unit.
#: templatetags/humanize.py:199
#, python-format
msgid "a second ago"
msgid_plural "%(count)s\\u00a0seconds ago"
msgstr[0] "vor einer Sekunde"
msgstr[1] "vor %(count)s Sekunden"

(Trac seems to convert the non-breaking space to normal space ...)

comment:4 by David, 10 years ago

I have created a pull request https://github.com/django/django/pull/1915 with a proposal for an updated "Translators:" message. Is that ok for you?

I also ran makemessages on the master branch and could attach the result to the pull request, but I suspect that needs to be done in Transifex? Let me know if I can assist you with this ticket in any way. Thanks, David.

comment:5 by Claude Paroz, 10 years ago

Has patch: set
Patch needs improvement: set

If the \u00a0 sequence cannot be used by translators in the translated string, I'm not in favor of using it in the original string. I would then word the comment something like: "Translators: please keep a non-breaking space (U+00A0) between number and time unit."

comment:6 by David, 10 years ago

I kept the \u00a0 because PEP 8 talks about keeping string literals in ASCII in the Python standard library. If it's otherwise ok to put any UTF-8 chars (even whitespace) in source files, then that's fine for me too. I'll update the pull request.

comment:7 by David, 10 years ago

Actually, wouldn't "count" be a better word here, since "count" is used in the code and in the msgid? Something like "please keep a non-breaking space (U+00A0) between count and time unit"? I'll try that.

comment:8 by Claude Paroz, 10 years ago

My priority is to not confuse translators with cryptic char sequences, PEP 8 comes after... And yes, "count" is probably better.

comment:9 by David, 10 years ago

I have updated the pull request. Should I attach the makemessages changes, too?

comment:10 by Claude Paroz, 10 years ago

Owner: changed from nobody to Claude Paroz
Patch needs improvement: unset
Status: new → assigned

Thanks, I'll take care of it.

comment:11 by Claude Paroz <claude@…>, 10 years ago

Resolution: fixed
Status: assigned → closed

In 7e0ebd74c107e3267b1df438ed7f061f8be5cf05:

Fixed #21415 -- Replaced escape sequence by literal non-breaking space

Unfortunately, escape sequences (\x.. or \u....) do not fit well
with the gettext toolchain. Falling back to using literal char,
even if visibility is not ideal.

comment:12 by Claude Paroz <claude@…>, 10 years ago

In 1e2bbc3b712d53032a3cf25e77baf3a157de66f0:

[1.6.x] Fixed #21415 -- Replaced escape sequence by literal non-breaking space

Unfortunately, escape sequences (\x.. or \u....) do not fit well
with the gettext toolchain. Falling back to using literal char,
even if visibility is not ideal.

Backport of 7e0ebd74c from master.

comment:13 by Claude Paroz <claude@…>, 10 years ago

In 882ee16f68deeb1831b83a407911e720fbd5e9fd:

[1.6.x] Updated humanize translation catalog

Refs #21415

comment:14 by Claude Paroz, 10 years ago

Resolution: fixed → (unset)
Status: closed → new

Before the bug can really be considered fixed, we have to:

  • Wait for Transifex to update the pot file (max 24h)
  • Update translations (should be possible even for non-speakers of the target language)
  • Commit updated translations

comment:15 by Claude Paroz, 10 years ago

... and add 1.6.1 release note.

comment:16 by Claude Paroz <claude@…>, 10 years ago

In e85baa813f2a2c8e565fc68418ff91e84d7d5ec0:

Updated humanize translations and added release note.

Refs #21415.

comment:17 by Claude Paroz, 10 years ago

Resolution: fixed
Status: new → closed

Hopefully fixed now.

comment:18 by David, 10 years ago

Thanks for that.

I did a quick check, and it seems you have overlooked the last four instances in the German translation, django/contrib/humanize/locale/de/LC_MESSAGES/django.po, from line 268 downward. They should be non-breaking spaces (so it begins ... :P)

comment:19 by Claude Paroz, 10 years ago

Indeed, I did not replace any normal space with non-breaking space, that's the work of translators. What I did is replace any \u00a0 sequence by non-breaking space. Unfortunately, there is also an issue with Transifex which sometimes keeps a translation even when the original changes (but only spaces are changed). Sorry, I cannot fix all problems myself :-P
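The kind of replacement described here could be sketched as follows. This is an assumption about how such a cleanup might be scripted, not the procedure actually used for the commit; `unescape_nbsp` is a hypothetical helper name.

```python
import re

NBSP = "\u00a0"

def unescape_nbsp(po_text):
    """Replace escaped "u00a0" sequences in .po text with a literal U+00A0.

    One or more leading backslashes are matched, because a .po file stores
    a literal backslash as two backslashes. Ordinary spaces are left alone.
    """
    return re.sub(r"\\+u00a0", NBSP, po_text)

# A line as it might appear in a catalog (two backslashes in the file).
line = 'msgstr[1] "vor %(count)s\\\\u00a0Sekunden"'
print(unescape_nbsp(line) == 'msgstr[1] "vor %(count)s\u00a0Sekunden"')  # True
```

As in the comment above, strings that already contain a normal space are untouched; fixing those remains a translation task.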

comment:20 by David, 10 years ago

Just so there is no misunderstanding: at the time of writing, django/contrib/humanize/locale/de/LC_MESSAGES/django.po contains 6 messages that need a non-breaking space. Of these, the first 2 (vor ... Sekunden/Minuten) have the non-breaking space, but the last 4 (vor ... Stunden, nach Sekunden/Minuten/Stunden) have a normal space. \u00a0 was replaced with normal space.

That's why I said this looks like an oversight to me.
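A check like the one described above could be scripted roughly as follows. This is a heuristic sketch under stated assumptions: it only understands single-line msgid/msgstr entries, it compares msgstr and msgstr[0] against msgid and all other plural forms against msgid_plural, and the names `ENTRY_RE` and `find_missing_nbsp` are hypothetical.

```python
import re

NBSP = "\u00a0"

# Matches single-line entries such as: msgstr[1] "vor %(count)s Sekunden"
ENTRY_RE = re.compile(r'^(msgid_plural|msgid|msgstr)(?:\[(\d+)\])?\s+"(.*)"\s*$')

def find_missing_nbsp(po_text):
    """Return (source, translation) pairs where the source string contains
    U+00A0 but the translation does not."""
    problems = []
    msgid = plural = ""
    for line in po_text.splitlines():
        match = ENTRY_RE.match(line)
        if not match:
            continue  # comments, blank lines, continuation lines
        keyword, index, value = match.groups()
        if keyword == "msgid":
            msgid, plural = value, ""  # a new entry starts here
        elif keyword == "msgid_plural":
            plural = value
        else:
            # msgstr and msgstr[0] translate msgid; msgstr[1..] the plural.
            source = plural if index not in (None, "0") else msgid
            if value and NBSP in source and NBSP not in value:
                problems.append((source, value))
    return problems

sample = (
    'msgid "a second ago"\n'
    'msgid_plural "%(count)s\u00a0seconds ago"\n'
    'msgstr[0] "vor einer Sekunde"\n'
    'msgstr[1] "vor %(count)s Sekunden"\n'
)
print(len(find_missing_nbsp(sample)))  # 1
```

Untranslated (empty) msgstr values are skipped, and multi-line strings would need a real .po parser.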

comment:21 by Claude Paroz, 10 years ago

You are probably right, sorry. I will probably update translations once again before 1.6.1, can you fix them (de) on Transifex?

comment:22 by David, 10 years ago

No probs. Unfortunately I was not accepted in the Transifex de group, maybe I did something wrong, will have to check again later.

comment:23 by Claude Paroz <claude@…>, 7 years ago

In 72026fff:

[1.11.x] Refs #21415 -- Fixed contrib.humanize translations for es_AR

Thanks Blas Castro for the report and the patch.

comment:24 by Claude Paroz <claude@…>, 7 years ago

In 61fd2b49:

Refs #21415 -- Fixed contrib.humanize translations for es_AR

Thanks Blas Castro for the report and the patch.
Forward port of 72026fff3963973d607bf0b525a082d20f024406 from stable/1.11.x
