Opened 16 months ago

Closed 16 months ago

Last modified 16 months ago

#21415 closed Bug (fixed)

Unicode escapes appear verbatim in translated naturaltime strings

Reported by: 676c7473@…
Owned by: claudep
Component: Translations
Version: 1.6
Severity: Release blocker
Keywords: i18n l10n translation
Cc:
Triage Stage: Accepted
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description

In Django 1.6, the Unicode escape \u00a0 "non-breaking space" that was introduced in the translations is doubly-escaped and now appears verbatim in the output of django.contrib.humanize.naturaltime.

What used to be

vor 6 Sekunden

that is the German ("de") translation for "6 seconds ago", now appears as

vor 6\u00a0Sekunden

in templates that use naturaltime.

This affects all django/contrib/humanize/locale/<language>/LC_MESSAGES/django.po files. I'm not sure whether this should be treated as a translation bug, or how it should be fixed.

Change History (22)

comment:1 Changed 16 months ago by claudep

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Severity changed from Normal to Release blocker
  • Triage Stage changed from Unreviewed to Accepted

I could reproduce this: ungettext doesn't resolve the double backslash (\\u00a0) it receives from the translation catalog. Crap!
Original ticket: #20246.
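
To illustrate the symptom, here is a minimal Python sketch of the difference between the escaped and the literal form. The German strings are taken from the report; the variable names are made up for this example, and the raw string only mimics text read from a catalog, it does not go through gettext itself.

```python
# The translation as it should reach the template: one real U+00A0 character.
literal = "vor 6\u00a0Sekunden"

# What arrives instead: backslash, 'u', '0', '0', 'a', '0' as six plain characters.
# The r"" prefix stops Python from interpreting the escape, mimicking catalog data.
verbatim = r"vor 6\u00a0Sekunden"

print(len(literal), len(verbatim))  # 14 19
print(verbatim)

# Escapes are only interpreted in source literals, never in data, so
# %-formatting a string that already contains the sequence leaves it untouched.
template = r"vor %(count)s\u00a0Sekunden"
rendered = template % {"count": 6}
```

Here rendered is identical to verbatim, which matches the output described in the report.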

comment:2 Changed 16 months ago by claudep

If \xa0 doesn't work with makemessages/xgettext and \u00a0 doesn't work when retrieving the string, I think the only remaining solution is to include a literal non-breaking space, telling translators in the comment to preserve it and hoping they will. Even if they replace it with a regular space, it's not a big deal anyway.
Other proposals?

comment:3 Changed 16 months ago by 676c7473@…

Yes, that works. No better ideas here. Then maybe the comment could be more explicit?

#. Translators: There should be a non-breaking space (U+00A0)
#. between number and time unit.
#: templatetags/humanize.py:199
#, python-format
msgid "a second ago"
msgid_plural "%(count)s\\u00a0seconds ago"
msgstr[0] "vor einer Sekunde"
msgstr[1] "vor %(count)s Sekunden"

(Trac seems to convert the non-breaking space to normal space ...)

comment:4 Changed 16 months ago by glts

I have created a pull request https://github.com/django/django/pull/1915 with a proposal for an updated "Translators:" message. Is that ok for you?

I also ran makemessages on the master branch and could attach the result to the pull request. But I suspect that needs to be done in Transifex? Let me know if I can assist you with this ticket in any way. Thanks, David.

comment:5 Changed 16 months ago by claudep

  • Has patch set
  • Patch needs improvement set

If the \u00a0 sequence cannot be used by translators in the translated string, I'm not in favor of using it in the original string. Then I would formulate the comment to something like Translators: please keep a non-breaking space (U+00A0) between number and time unit.

comment:6 Changed 16 months ago by glts

I kept the \u00a0 because PEP 8 talks about keeping string literals in ASCII in the Python standard library. If it's otherwise ok to put any UTF-8 chars (even whitespace) in source files, then that's fine for me too. I'll update the pull request.

comment:7 Changed 16 months ago by glts

Actually, wouldn't "count" be a better word here, since "count" is used in the code and in the msgid? please keep a non-breaking space (U+00A0) between count and time unit? I'll try that.

comment:8 Changed 16 months ago by claudep

My priority is to not confuse translators with cryptic char sequences, PEP 8 comes after... And yes, "count" is probably better.

comment:9 Changed 16 months ago by glts

I have updated the pull request. Should I attach the makemessages changes, too?

comment:10 Changed 16 months ago by claudep

  • Owner changed from nobody to claudep
  • Patch needs improvement unset
  • Status changed from new to assigned

Thanks, I'll take care of it.

comment:11 Changed 16 months ago by Claude Paroz <claude@…>

  • Resolution set to fixed
  • Status changed from assigned to closed

In 7e0ebd74c107e3267b1df438ed7f061f8be5cf05:

Fixed #21415 -- Replaced escape sequence by literal non-breaking space

Unfortunately, escape sequences (\x.. or \u....) do not fit well
with the gettext toolchain. Falling back to using literal char,
even if visibility is not ideal.

comment:12 Changed 16 months ago by Claude Paroz <claude@…>

In 1e2bbc3b712d53032a3cf25e77baf3a157de66f0:

[1.6.x] Fixed #21415 -- Replaced escape sequence by literal non-breaking space

Unfortunately, escape sequences (\x.. or \u....) do not fit well
with the gettext toolchain. Falling back to using literal char,
even if visibility is not ideal.

Backport of 7e0ebd74c from master.

comment:13 Changed 16 months ago by Claude Paroz <claude@…>

In 882ee16f68deeb1831b83a407911e720fbd5e9fd:

[1.6.x] Updated humanize translation catalog

Refs #21415

comment:14 Changed 16 months ago by claudep

  • Resolution fixed deleted
  • Status changed from closed to new

Before the bug can really be considered fixed, we have to:

  • Wait for Transifex to update the pot file (max 24h)
  • Update translations (should be possible even for non-speakers of the target language)
  • Commit updated translations

comment:15 Changed 16 months ago by claudep

... and add 1.6.1 release note.

comment:16 Changed 16 months ago by Claude Paroz <claude@…>

In e85baa813f2a2c8e565fc68418ff91e84d7d5ec0:

Updated humanize translations and added release note.

Refs #21415.

comment:17 Changed 16 months ago by claudep

  • Resolution set to fixed
  • Status changed from new to closed

Hopefully fixed now.

comment:18 Changed 16 months ago by glts

Thanks for that.

I did a quick check, and it seems you have overlooked the last four instances in the German translation django/contrib/humanize/locale/de/LC_MESSAGES/django.po. From line 268 downward, they should be non-breaking spaces (so it begins ... :P)

comment:19 Changed 16 months ago by claudep

Indeed, I did not replace any normal space with a non-breaking space; that's the work of translators. What I did was replace every \u00a0 sequence with a non-breaking space. Unfortunately, there is also an issue with Transifex, which sometimes keeps a translation even when the original changes (if only spaces are changed). Sorry, I cannot fix all problems myself :-P
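
A replacement of this kind could be sketched roughly as follows. This is not the script actually used for the commit. The function name is hypothetical, and accepting either one or two backslashes before u00a0 is an assumption about how the sequence may appear in a .po file.

```python
import re

NBSP = "\u00a0"

def replace_nbsp_escapes(po_text):
    """Replace textual \\u00a0 or \u00a0 sequences with a real U+00A0.

    Regular spaces are left alone: deciding where a non-breaking space
    belongs is up to the translators.
    """
    # One or two literal backslashes followed by u00a0, in either letter case.
    return re.sub(r"\\{1,2}u00a0", NBSP, po_text, flags=re.IGNORECASE)
```

For example, a msgstr line containing `%(count)s\\u00a0Sekunden` would come back with a single U+00A0 character between the placeholder and the unit.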

comment:20 Changed 16 months ago by glts

Just so there is no misunderstanding: at the time of writing, django/contrib/humanize/locale/de/LC_MESSAGES/django.po contains 6 messages that need a non-breaking space. Of these, the first 2 (vor ... Sekunden/Minuten) have the non-breaking space, but the last 4 (vor ... Stunden, nach Sekunden/Minuten/Stunden) have a normal space. There, \u00a0 was replaced with a normal space.

That's why I said this looks like an oversight to me.
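
A check like the one described above could be automated along these lines. This is only a sketch: the helper name is hypothetical, and the rule it applies (a %(count)s placeholder directly followed by a regular space on a msgstr line) is an assumption that will not fit every language.

```python
import re

# Assumed rule: a %(count)s placeholder followed by a plain U+0020 space.
PLAIN_SPACE_AFTER_COUNT = re.compile(r"%\(count\)s ")

def lines_missing_nbsp(po_text):
    """Return (line_number, line) pairs for msgstr lines that follow
    %(count)s with a regular space instead of U+00A0."""
    hits = []
    for number, line in enumerate(po_text.splitlines(), start=1):
        if line.startswith("msgstr") and PLAIN_SPACE_AFTER_COUNT.search(line):
            hits.append((number, line))
    return hits
```

Running it over the text of a django.po file would list the lines a translator may still need to fix on Transifex.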

comment:21 Changed 16 months ago by claudep

You are probably right, sorry. I will probably update translations once again before 1.6.1, can you fix them (de) on Transifex?

comment:22 Changed 16 months ago by glts

No probs. Unfortunately I was not accepted in the Transifex de group, maybe I did something wrong, will have to check again later.
