#25677 closed Bug (fixed)
compilemessages throws an exception and does not report msgformat errors correctly
Reported by: | Gavin Wahl | Owned by: | Ramiro Morales |
---|---|---|---|
Component: | Internationalization | Version: | dev |
Severity: | Normal | Keywords: | 1.10 windows |
Cc: | Ramiro Morales | Triage Stage: | Accepted |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
I have a django.po with errors. When I run compilemessages, instead of getting a sensible error, I get a UnicodeDecodeError.
Traceback (most recent call last):
  File "/usr/lib/python3.4/pdb.py", line 1661, in main
    pdb._runscript(mainpyfile)
  File "/usr/lib/python3.4/pdb.py", line 1542, in _runscript
    self.run(statement)
  File "/usr/lib/python3.4/bdb.py", line 431, in run
    exec(cmd, globals, locals)
  File "<string>", line 1, in <module>
  File "manage.py", line 2, in <module>
    import os
  File "django/core/management/__init__.py", line 350, in execute_from_command_line
    utility.execute()
  File "django/core/management/__init__.py", line 342, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "django/core/management/base.py", line 348, in run_from_argv
    self.execute(*args, **cmd_options)
  File "django/core/management/base.py", line 399, in execute
    output = self.handle(*args, **options)
  File "django/core/management/commands/compilemessages.py", line 98, in handle
    self.compile_messages(locations)
  File "django/core/management/commands/compilemessages.py", line 122, in compile_messages
    output, errors, status = popen_wrapper(args)
  File "django/core/management/utils.py", line 27, in popen_wrapper
    output, errors = p.communicate()
  File "/usr/lib/python3.4/subprocess.py", line 962, in communicate
    stdout, stderr = self._communicate(input, endtime, timeout)
  File "/usr/lib/python3.4/subprocess.py", line 1664, in _communicate
    self.stderr.encoding)
  File "/usr/lib/python3.4/subprocess.py", line 888, in _translate_newlines
    data = data.decode(encoding)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc5 in position 179: invalid continuation byte
Instead, I should get the error from msgfmt:
$ msgfmt --check-format -o locale/es/LC_MESSAGES/django.mo locale/es/LC_MESSAGES/django.po
locale/es/LC_MESSAGES/django.po:112: 'msgstr' is not a valid Python brace format string, unlike 'msgid'. Reason: In the directive number 0, '�' cannot start a field name.
msgfmt: found 1 fatal error
The problem is that the output of msgfmt is _not_ a UTF-8 string. It's bytes. Any attempt to decode it as UTF-8 is futile.
The exact output of msgfmt is b"locale/es/LC_MESSAGES/django.po:112: 'msgstr' is not a valid Python brace format string, unlike 'msgid'. Reason: In the directive number 0, '\xc5' cannot start a field name.\nmsgfmt: found 1 fatal error\n".
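To make the failure concrete, here is a minimal sketch that decodes the exact bytes quoted above. It relies only on the built-in `bytes.decode` error handlers; the variable names are illustrative, and this is not meant to represent the code of any eventual fix in Django.

```python
# The exact stderr bytes reported above; 0xc5 is a lone byte that is not
# valid UTF-8 in this position.
raw = (
    b"locale/es/LC_MESSAGES/django.po:112: 'msgstr' is not a valid Python "
    b"brace format string, unlike 'msgid'. Reason: In the directive number 0, "
    b"'\xc5' cannot start a field name.\nmsgfmt: found 1 fatal error\n"
)

# Strict decoding fails, just as it does inside subprocess in text mode.
try:
    raw.decode("utf-8")
    strict_error = None
except UnicodeDecodeError as exc:
    strict_error = exc

# Lenient error handlers keep the rest of the message readable.
replaced = raw.decode("utf-8", errors="replace")  # bad byte becomes U+FFFD
escaped = raw.decode("utf-8", errors="backslashreplace")  # Python 3.5+: \xc5
```

Either lenient variant would let the msgfmt diagnostic reach the user, at the cost of not showing the offending character faithfully.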
Attachments (1)
Change History (25)
comment:1 by , 9 years ago
Triage Stage: | Unreviewed → Accepted |
---|---|
Type: | Uncategorized → Bug |
Version: | 1.8 → master |
comment:2 by , 9 years ago
It's interesting to look at the traceback on Python 3:
...
  File "/home/claude/virtualenvs/djangogit34/lib/python3.4/site-packages/django/core/management/commands/compilemessages.py", line 97, in handle
    self.compile_messages(locations)
  File "/home/claude/virtualenvs/djangogit34/lib/python3.4/site-packages/django/core/management/commands/compilemessages.py", line 121, in compile_messages
    output, errors, status = popen_wrapper(args)
  File "/home/claude/virtualenvs/djangogit34/lib/python3.4/site-packages/django/core/management/utils.py", line 27, in popen_wrapper
    output, errors = p.communicate()
  File "/usr/lib/python3.4/subprocess.py", line 960, in communicate
    stdout, stderr = self._communicate(input, endtime, timeout)
  File "/usr/lib/python3.4/subprocess.py", line 1662, in _communicate
    self.stderr.encoding)
  File "/usr/lib/python3.4/subprocess.py", line 888, in _translate_newlines
    data = data.decode(encoding)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc5 in position 214: invalid continuation byte
We see here that the decoding (and the error) happens in the standard library.
Should we blame gettext, which outputs characters that cannot be decoded as UTF-8? Should we blame Python for not falling back to some human-readable representation of the offending byte? Can Django do something to prevent that?
comment:3 by , 9 years ago
The error is raised in the standard library because Popen was passed universal_newlines=True, which makes it decode the output to unicode. Otherwise, Popen works in bytes.
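The difference can be sketched as follows. `run_and_decode` is a hypothetical helper written for illustration; it is not Django's `popen_wrapper` and not necessarily how this ticket was resolved. It leaves `universal_newlines` off so that `communicate()` returns bytes, then decodes them leniently. The child process only simulates a tool that writes a non-UTF-8 byte to stderr.

```python
import subprocess
import sys


def run_and_decode(args, encoding="utf-8"):
    """Run a command in bytes mode and decode its output leniently.

    Hypothetical helper, for illustration only.
    """
    # Without universal_newlines=True the pipes carry raw bytes, so
    # communicate() does no decoding of its own and cannot raise
    # UnicodeDecodeError.
    proc = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    output, errors = proc.communicate()
    return (
        output.decode(encoding, errors="replace"),
        errors.decode(encoding, errors="replace"),
        proc.returncode,
    )


# Simulate a tool that reports an error containing the stray byte 0xc5.
child = "import sys; sys.stderr.buffer.write(b'bad \\xc5 byte\\n'); sys.exit(1)"
output, errors, status = run_and_decode([sys.executable, "-c", child])
```

With this approach the caller still receives the exit status and a readable error message, so it can decide how to report the failure.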
comment:5 by , 9 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
comment:6 by , 9 years ago
Owner: | removed |
---|---|
Status: | assigned → new |
comment:7 by , 9 years ago
Triage Stage: | Accepted → Ready for checkin |
---|
comment:10 by , 9 years ago
Has patch: | unset |
---|---|
Resolution: | fixed |
Status: | closed → new |
Triage Stage: | Ready for checkin → Accepted |
Oops, I haven't installed gettext on the Windows CI machine so the tests aren't running there. Here's a sample error from my local machine which does have gettext installed:
======================================================================
ERROR: test_no_wrap_enabled (i18n.test_extraction.NoWrapExtractorTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "c:\Users\Tim\code\django\tests\i18n\test_extraction.py", line 597, in test_no_wrap_enabled
  File "c:\users\tim\code\django\django\core\management\__init__.py", line 117, in call_command
    return command.execute(*args, **defaults)
  File "c:\users\tim\code\django\django\core\management\base.py", line 341, in execute
    output = self.handle(*args, **options)
  File "c:\users\tim\code\django\django\core\management\commands\makemessages.py", line 307, in handle
    self.write_po_file(potfile, locale)
  File "c:\users\tim\code\django\django\core\management\commands\makemessages.py", line 539, in write_po_file
    "errors happened while running msgmerge\n%s" % errors)
django.core.management.base.CommandError: errors happened while running msgmerge
c:\Users\Tim\code\django\tests\i18n\commands\locale\de\LC_MESSAGES\django.po:2:47: syntax error
msgmerge: found 1 fatal error
comment:11 by , 9 years ago
Claude, any ideas? I can add gettext on the Windows CI machine if you want to try a pull request with some solution (of course, we'd see the test failures for other contributors too in the meantime).
comment:12 by , 9 years ago
I have no Windows system to test with, and don't like blind debugging. If Windows users care for Django, let them step in.
comment:13 by , 9 years ago
Cc: | added |
---|
Ramiro, maybe you could help with this issue at your convenience (before Django 1.10)?
comment:14 by , 9 years ago
Claude, the new test is also failing on Ubuntu 12.04 and Python 3 (possibly due to msgfmt 0.18.1 there vs 0.18.3 on 14.04?). I've also seen this on my Windows setup (again, Python 3 only), which uses msgfmt 0.17.
Traceback (most recent call last):
  File "/mnt/jenkinsdata/workspace/django-master/database/sqlite3/python/python3.4/tests/i18n/test_compilation.py", line 177, in test_msgfmt_error_including_non_ascii
    call_command('compilemessages', locale=['ko'], verbosity=0)
  File "/usr/lib/python3.4/contextlib.py", line 66, in __exit__
    next(self.gen)
  File "/mnt/jenkinsdata/workspace/django-master/database/sqlite3/python/python3.4/django/test/testcases.py", line 586, in _assert_raises_message_cm
    yield cm
AssertionError: CommandError not raised
comment:15 by , 9 years ago
Are you able to install a more recent gettext on Windows to see if it's gettext-related?
comment:16 by , 9 years ago
I was able to reproduce that on Linux with gettext 0.18.1. I guess that python-brace-format is a "recent" gettext addition. A solution might be to simply skip test_msgfmt_error_including_non_ascii with older gettext versions. Not sure if that fixes all failures.
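A version-based skip along these lines might look like the sketch below. All function names are hypothetical, and the `(0, 18, 3)` threshold is an assumption taken only from the versions mentioned in this ticket (0.18.1 fails, 0.18.3 works), not from the gettext changelog.

```python
import re
import subprocess

# Assumption: inferred from the versions reported in this ticket only.
MIN_BRACE_FORMAT_VERSION = (0, 18, 3)


def parse_gettext_version(version_output):
    """Extract (major, minor, patch) from `msgfmt --version` output."""
    # The first line looks like "msgfmt (GNU gettext-tools) 0.18.1".
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", version_output)
    if match is None:
        return None
    # A missing patch component (e.g. "0.17") is treated as 0.
    return tuple(int(part) for part in match.groups(default="0"))


def installed_gettext_version():
    """Return the installed msgfmt version, or None if msgfmt is unavailable."""
    try:
        raw = subprocess.check_output(["msgfmt", "--version"])
    except (OSError, subprocess.CalledProcessError):
        return None
    return parse_gettext_version(raw.decode("utf-8", errors="replace"))


def supports_brace_format(version):
    """Return True if this version is assumed to check python-brace-format."""
    return version is not None and version >= MIN_BRACE_FORMAT_VERSION
```

A test could then be decorated with `unittest.skipUnless(supports_brace_format(installed_gettext_version()), "gettext too old")`, so that it runs only where msgfmt is expected to report the brace-format error.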
comment:17 by , 9 years ago
Skipping the test for older versions of gettext seems to work. This doesn't solve the other issues on Windows.
comment:19 by , 9 years ago
Keywords: | 1.10 added |
---|
comment:20 by , 9 years ago
Keywords: | windows added |
---|
comment:21 by , 9 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
comment:22 by , 9 years ago
Owner: | removed |
---|---|
Status: | assigned → new |
comment:23 by , 9 years ago
Owner: | set to |
---|---|
Status: | new → assigned |
comment:24 by , 9 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
I've opened #26645 to track the errors from comment:10. I suspect they aren't related to the original issue reported here, which was fixed by Claude in fa08d27fb714534670b431fde0cd04a17d637585.
Thus, I'm re-closing this one as fixed.
Could you please provide the faulty msgid/msgstr that caused the error?