#25677 closed Bug (fixed)
compilemessages throws an exception and does not report msgformat errors correctly
| Reported by: | Gavin Wahl | Owned by: | Ramiro Morales |
|---|---|---|---|
| Component: | Internationalization | Version: | dev |
| Severity: | Normal | Keywords: | 1.10 windows |
| Cc: | Ramiro Morales | Triage Stage: | Accepted |
| Has patch: | no | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description
I have a django.po with errors. When I run compilemessages, instead of getting a sensible error, I get a UnicodeDecodeError.
Traceback (most recent call last):
File "/usr/lib/python3.4/pdb.py", line 1661, in main
pdb._runscript(mainpyfile)
File "/usr/lib/python3.4/pdb.py", line 1542, in _runscript
self.run(statement)
File "/usr/lib/python3.4/bdb.py", line 431, in run
exec(cmd, globals, locals)
File "<string>", line 1, in <module>
File "manage.py", line 2, in <module>
import os
File "django/core/management/__init__.py", line 350, in execute_from_command_line
utility.execute()
File "django/core/management/__init__.py", line 342, in execute
self.fetch_command(subcommand).run_from_argv(self.argv)
File "django/core/management/base.py", line 348, in run_from_argv
self.execute(*args, **cmd_options)
File "django/core/management/base.py", line 399, in execute
output = self.handle(*args, **options)
File "django/core/management/commands/compilemessages.py", line 98, in handle
self.compile_messages(locations)
File "django/core/management/commands/compilemessages.py", line 122, in compile_messages
output, errors, status = popen_wrapper(args)
File "django/core/management/utils.py", line 27, in popen_wrapper
output, errors = p.communicate()
File "/usr/lib/python3.4/subprocess.py", line 962, in communicate
stdout, stderr = self._communicate(input, endtime, timeout)
File "/usr/lib/python3.4/subprocess.py", line 1664, in _communicate
self.stderr.encoding)
File "/usr/lib/python3.4/subprocess.py", line 888, in _translate_newlines
data = data.decode(encoding)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc5 in position 179: invalid continuation byte
I should get the error from msgfmt:
$ msgfmt --check-format -o locale/es/LC_MESSAGES/django.mo locale/es/LC_MESSAGES/django.po
locale/es/LC_MESSAGES/django.po:112: 'msgstr' is not a valid Python brace format string, unlike 'msgid'. Reason: In the directive number 0, '�' cannot start a field name.
msgfmt: found 1 fatal error
The problem is that the output of msgfmt is _not_ a UTF-8 string. It's raw bytes, and any attempt to decode it strictly as UTF-8 is futile.
The exact output of msgfmt is b"locale/es/LC_MESSAGES/django.po:112: 'msgstr' is not a valid Python brace format string, unlike 'msgid'. Reason: In the directive number 0, '\xc5' cannot start a field name.\nmsgfmt: found 1 fatal error\n".
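To make the failure concrete, here is a minimal sketch that feeds the exact bytes quoted above through a strict UTF-8 decode and then through a lenient one. The helper name `decode_tool_output` is hypothetical and is not claimed to be what Django does; it only illustrates that `errors="replace"` keeps the message readable instead of raising.

```python
# Exact msgfmt stderr bytes quoted in the report.
raw = (
    b"locale/es/LC_MESSAGES/django.po:112: 'msgstr' is not a valid Python "
    b"brace format string, unlike 'msgid'. Reason: In the directive number 0, "
    b"'\xc5' cannot start a field name.\nmsgfmt: found 1 fatal error\n"
)


def decode_tool_output(data, encoding="utf-8"):
    """Hypothetical helper: decode tool output without raising."""
    # errors="replace" substitutes U+FFFD for every undecodable byte.
    return data.decode(encoding, errors="replace")


# A strict decode fails, because 0xc5 starts a two-byte UTF-8 sequence
# but the next byte (an ASCII quote) is not a continuation byte.
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# The lenient decode succeeds and preserves the rest of the message.
text = decode_tool_output(raw)
```

With the lenient decode, the offending byte shows up as the replacement character, which matches the `'�'` that the terminal shows when msgfmt is run by hand.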
Change History (25)
comment:1 by , 10 years ago
| Triage Stage: | Unreviewed → Accepted |
|---|---|
| Type: | Uncategorized → Bug |
| Version: | 1.8 → master |
comment:2 by , 10 years ago
It's interesting to look at the traceback on Python 3:
...
File "/home/claude/virtualenvs/djangogit34/lib/python3.4/site-packages/django/core/management/commands/compilemessages.py", line 97, in handle
self.compile_messages(locations)
File "/home/claude/virtualenvs/djangogit34/lib/python3.4/site-packages/django/core/management/commands/compilemessages.py", line 121, in compile_messages
output, errors, status = popen_wrapper(args)
File "/home/claude/virtualenvs/djangogit34/lib/python3.4/site-packages/django/core/management/utils.py", line 27, in popen_wrapper
output, errors = p.communicate()
File "/usr/lib/python3.4/subprocess.py", line 960, in communicate
stdout, stderr = self._communicate(input, endtime, timeout)
File "/usr/lib/python3.4/subprocess.py", line 1662, in _communicate
self.stderr.encoding)
File "/usr/lib/python3.4/subprocess.py", line 888, in _translate_newlines
data = data.decode(encoding)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc5 in position 214: invalid continuation byte
We see here that decoding (and error) happens in the standard lib.
Should we blame gettext, which outputs characters that can't be decoded as UTF-8? Should we blame Python for not falling back to some human-readable representation of the offending byte? Can Django do something to prevent that?
comment:3 by , 10 years ago
The error is raised in the standard lib because Popen was passed universal_newlines=True, which makes it decode the output to unicode. Without that argument, Popen works in bytes.
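The difference between the two Popen modes can be reproduced without gettext. The sketch below uses a small Python child process as a stand-in for msgfmt; `CHILD_CODE`, `CHILD` and `run_child` are names invented for this illustration. One assumption to note: `encoding="utf-8"` (available from Python 3.6) is passed only to make the result independent of the machine's locale. The Python 3.4 in the traceback had no such argument and used the locale's preferred encoding. The behaviour shown is for POSIX systems.

```python
import subprocess
import sys

# Stand-in for msgfmt: a child process that writes a byte sequence that is
# not valid UTF-8 (0xc5 followed by an ASCII quote) to stderr.
CHILD_CODE = r"""import sys; sys.stderr.buffer.write(b"'\xc5' bad\n")"""
CHILD = [sys.executable, "-c", CHILD_CODE]


def run_child(text_mode):
    """Run the child and return (stdout, stderr) from communicate()."""
    kwargs = {}
    if text_mode:
        # Assumption: encoding is pinned so the sketch is deterministic.
        kwargs = {"universal_newlines": True, "encoding": "utf-8"}
    with subprocess.Popen(
        CHILD, stdout=subprocess.PIPE, stderr=subprocess.PIPE, **kwargs
    ) as proc:
        return proc.communicate()


# Bytes mode: nothing is decoded, so the raw stderr comes back untouched.
_, bytes_err = run_child(text_mode=False)

# Text mode: communicate() decodes stderr itself and fails on 0xc5.
try:
    run_child(text_mode=True)
    text_mode_failed = False
except UnicodeDecodeError:
    text_mode_failed = True
```

In bytes mode the caller decides how to decode, so it can choose a lenient error handler; in text mode the decode happens inside `communicate()` before the caller sees anything.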
comment:5 by , 10 years ago
| Owner: | changed from to |
|---|---|
| Status: | new → assigned |
comment:6 by , 10 years ago
| Owner: | removed |
|---|---|
| Status: | assigned → new |
comment:7 by , 10 years ago
| Triage Stage: | Accepted → Ready for checkin |
|---|
comment:10 by , 10 years ago
| Has patch: | unset |
|---|---|
| Resolution: | fixed |
| Status: | closed → new |
| Triage Stage: | Ready for checkin → Accepted |
Oops, I haven't installed gettext on the Windows CI machine so the tests aren't running there. Here's a sample error from my local machine which does have gettext installed:
======================================================================
ERROR: test_no_wrap_enabled (i18n.test_extraction.NoWrapExtractorTests)
----------------------------------------------------------------------
Traceback (most recent call last):
File "c:\Users\Tim\code\django\tests\i18n\test_extraction.py", line 597, in test_no_wrap_enabled
File "c:\users\tim\code\django\django\core\management\__init__.py", line 117, in call_command
return command.execute(*args, **defaults)
File "c:\users\tim\code\django\django\core\management\base.py", line 341, in execute
output = self.handle(*args, **options)
File "c:\users\tim\code\django\django\core\management\commands\makemessages.py", line 307, in handle
self.write_po_file(potfile, locale)
File "c:\users\tim\code\django\django\core\management\commands\makemessages.py", line 539, in write_po_file
"errors happened while running msgmerge\n%s" % errors)
django.core.management.base.CommandError: errors happened while running msgmerge
c:\Users\Tim\code\django\tests\i18n\commands\locale\de\LC_MESSAGES\django.po:2:47: syntax error
msgmerge: found 1 fatal error
comment:11 by , 10 years ago
Claude, any ideas? I can add gettext on the Windows CI machine if you want to try a pull request with some solution (of course, we'd see the test failures for other contributors too in the meantime).
comment:12 by , 10 years ago
I have no Windows system to test with, and don't like blind debugging. If Windows users care for Django, let them step in.
comment:13 by , 10 years ago
| Cc: | added |
|---|
Ramiro, maybe you could help with this issue at your convenience (before Django 1.10)?
comment:14 by , 10 years ago
Claude, the new test is also failing on Ubuntu 12.04 with Python 3 (possibly due to msgfmt 0.18.1 there vs. 0.18.3 on 14.04?). I've also seen this on my Windows setup (again, Python 3 only), which uses msgfmt 0.17.
Traceback (most recent call last):
File "/mnt/jenkinsdata/workspace/django-master/database/sqlite3/python/python3.4/tests/i18n/test_compilation.py", line 177, in test_msgfmt_error_including_non_ascii
call_command('compilemessages', locale=['ko'], verbosity=0)
File "/usr/lib/python3.4/contextlib.py", line 66, in __exit__
next(self.gen)
File "/mnt/jenkinsdata/workspace/django-master/database/sqlite3/python/python3.4/django/test/testcases.py", line 586, in _assert_raises_message_cm
yield cm
AssertionError: CommandError not raised
comment:15 by , 10 years ago
Are you able to install a more recent gettext on Windows to see if it's gettext-related?
comment:16 by , 10 years ago
I was able to reproduce that on Linux with gettext 0.18.1. I guess that python-brace-format is a "recent" gettext addition. A solution might be to simply skip test_msgfmt_error_including_non_ascii with older gettext versions. Not sure if that fixes all failures.
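A version-based skip could be sketched along these lines. Everything here is an assumption made for illustration: the helper names are invented, the sample strings follow the usual first line of `msgfmt --version`, and the 0.18.3 cutoff is chosen only because that is the version reported as passing in comment:14. The ticket does not state which gettext release introduced the python-brace-format check.

```python
import re


def parse_gettext_version(version_output):
    """Return (major, minor, patch) from `msgfmt --version` text, or None."""
    # The first dotted number in the output is taken as the version.
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", version_output)
    if match is None:
        return None
    major, minor, patch = match.groups()
    # A missing patch component (e.g. "0.17") is treated as 0.
    return (int(major), int(minor), int(patch or 0))


# Assumption: the real cutoff is not given in the ticket; 0.18.3 is merely
# the version on which the test was reported to pass.
ASSUMED_MIN_VERSION = (0, 18, 3)


def should_skip(version_output):
    """True when the reported gettext version is older than the cutoff."""
    version = parse_gettext_version(version_output)
    return version is not None and version < ASSUMED_MIN_VERSION
```

A test could then be decorated with `unittest.skipIf(should_skip(output), "gettext too old")`, where `output` is captured from running `msgfmt --version`.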
comment:17 by , 10 years ago
Skipping the test for older versions of gettext seems to work. This doesn't solve the other issues on Windows.
comment:19 by , 10 years ago
| Keywords: | 1.10 added |
|---|
comment:20 by , 10 years ago
| Keywords: | windows added |
|---|
comment:21 by , 9 years ago
| Owner: | changed from to |
|---|---|
| Status: | new → assigned |
comment:22 by , 9 years ago
| Owner: | removed |
|---|---|
| Status: | assigned → new |
comment:23 by , 9 years ago
| Owner: | set to |
|---|---|
| Status: | new → assigned |
comment:24 by , 9 years ago
| Resolution: | → fixed |
|---|---|
| Status: | assigned → closed |
I've opened #26645 to track the errors from comment:10. I suspect they aren't related to the original issue reported here, which was fixed by Claude in fa08d27fb714534670b431fde0cd04a17d637585.
Thus, I'm re-closing this one as fixed.
Could you please provide the faulty msgid/msgstr that caused the error?