Opened 9 years ago
Last modified 3 months ago
#23321 new Cleanup/optimization
Remove .mo files from the Django Git repository
Reported by: | Claude Paroz | Owned by: | nobody |
---|---|---|---|
Component: | Internationalization | Version: | dev |
Severity: | Normal | Keywords: | |
Cc: | slav0nic@…, Maciej Olko, Calidae Developers, Ningú | Triage Stage: | Someday/Maybe |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | yes |
Easy pickings: | no | UI/UX: | no |
Description
Binary/generated files are not good candidates for inclusion in a Git repository. They unnecessarily bloat the repository without adding value.
It would be nice to compile those .mo files at package build time.
Change History (12)
comment:1 Changed 9 years ago by
Cc: | slav0nic@… added |
---|
comment:2 Changed 9 years ago by
comment:3 Changed 9 years ago by
I think it would be possible to check for the presence of .mo files in runserver and output an appropriate warning. I understand the convenience of having .mo files in the repo, but I don't think this justifies having generated binary files in a VCS.
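A check like the one suggested here could look roughly like the following. This is only a sketch, not code from the linked branch: the function names `find_stale_catalogs` and `warn_about_stale_catalogs` are hypothetical, and it assumes each `.mo` file sits next to its `.po` file with the same base name.

```python
import warnings
from pathlib import Path


def find_stale_catalogs(locale_root):
    """Return .po files whose compiled .mo sibling is missing or older."""
    stale = []
    for po_path in sorted(Path(locale_root).rglob("*.po")):
        mo_path = po_path.with_suffix(".mo")
        # A catalog needs compiling if the .mo is absent or predates the .po.
        if not mo_path.exists() or mo_path.stat().st_mtime < po_path.stat().st_mtime:
            stale.append(po_path)
    return stale


def warn_about_stale_catalogs(locale_root):
    """Emit one warning per catalog that needs compiling; return the count."""
    stale = find_stale_catalogs(locale_root)
    for po_path in stale:
        warnings.warn(f"{po_path} has no up-to-date .mo file; run compilemessages.")
    return len(stale)
```

Comparing modification times is a cheap heuristic; a fresh checkout may set all timestamps to the checkout time, so a stricter check would have to compare content instead.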
comment:4 Changed 9 years ago by
Here's a branch where I started working on this: https://github.com/claudep/django/tree/23321
comment:6 Changed 9 years ago by
Triage Stage: | Accepted → Ready for checkin |
---|
Code looks fine to me, but would be good to get an opinion from another person familiar with translations too.
comment:7 Changed 9 years ago by
Patch needs improvement: | set |
---|---|
Triage Stage: | Ready for checkin → Someday/Maybe |
I don't think we should go that route, as it would introduce a couple of issues that make things harder both for our users and from a maintenance standpoint:
- The most pressing issues IMO will show up for users that are using not-yet-released versions of Django, e.g. translators and contributors.
- there are differences in gettext versions that we would not be able to fix
- Windows users don't usually have gettext installed
- The test system would have to compile the po files on every test run to make sure to have a consistent set to base tests on
- Users on systems with a non-writable file system may have problems with the subprocess call as part of trans_real.py
- The Django release manager would have to have gettext installed and run an additional command to build the tarball, something that I think is better suited for the translation manager (who has to pull files from Transifex anyways)
I understand that having compiled files in a VCS isn't good, but the proposed plan doesn't convince me to drop the mo files.
If only we used Babel instead: it can compile po files to mo files without depending on gettext.
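To illustrate why no external toolchain is strictly needed for this step: the .mo format is simple enough to write from pure Python. The sketch below is not Babel's API (Babel offers this through `babel.messages.mofile.write_mo`); it is a stdlib-only illustration with a hypothetical `build_mo` helper that handles plain msgid/msgstr pairs only, with no plurals or contexts.

```python
import gettext
import io
import struct


def build_mo(catalog):
    """Serialize a {msgid: msgstr} dict into GNU .mo bytes (little-endian)."""
    # The empty msgid holds the catalog header; declare the charset there.
    entries = {"": "Content-Type: text/plain; charset=UTF-8\n", **catalog}
    # The format expects msgids sorted by their byte value.
    pairs = sorted((k.encode("utf-8"), v.encode("utf-8")) for k, v in entries.items())
    count = len(pairs)
    ids_table = 7 * 4                    # right after the 7-word header
    strs_table = ids_table + count * 8   # each table row is (length, offset)
    data_start = strs_table + count * 8

    blob = b""
    id_rows, str_rows = [], []
    for msgid, _ in pairs:
        id_rows.append((len(msgid), data_start + len(blob)))
        blob += msgid + b"\0"            # strings are NUL-terminated
    for _, msgstr in pairs:
        str_rows.append((len(msgstr), data_start + len(blob)))
        blob += msgstr + b"\0"

    # Magic number, format version 0, string count, table offsets, no hash table.
    header = struct.pack("<7I", 0x950412DE, 0, count, ids_table, strs_table, 0, 0)
    tables = b"".join(struct.pack("<2I", *row) for row in id_rows + str_rows)
    return header + tables + blob


# Round-trip through the stdlib reader to check the output is loadable.
translations = gettext.GNUTranslations(io.BytesIO(build_mo({"Hello": "Bonjour"})))
print(translations.gettext("Hello"))  # Bonjour
```

Parsing .po files correctly (escapes, plural forms, fuzzy flags) is the harder half of the job, which is where a library such as Babel would earn its keep.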
comment:8 Changed 4 years ago by
On the repo size issue, on some occasions I've taken to cloning with the depth option, which restricts the fetched history; e.g. --depth=1000 is more than enough for a lot of cases. Perhaps we could add that as an example to the docs, so that folks don't need to clone the whole history. (?)
comment:9 Changed 3 months ago by
Cc: | Maciej Olko added |
---|
comment:10 Changed 3 months ago by
Cc: | Calidae Developers added |
---|
comment:11 Changed 3 months ago by
Cc: | Ningú added |
---|
comment:12 Changed 3 months ago by
If one reasons about this as if we were speaking about a C extension, I think all those points made by Jannis Leidel do fall pretty short:
- Yes, people working on a repository checkout instead of a public release will need the compilation toolchain. Yes, there will be sharp edges on certain platforms because of this, and that is out of reach for the Django project.
- Yes, the test system ought to compile those binaries each time. If that ever had a significant impact on CI times, just engineer a cache for both those files and the toolchain setup.
- Yes, you need a writable filesystem to develop on a project. Whoever ships a Django checkout on a read-only FS should be responsible for compiling *.mo files before turning the FS read-only.
- Yes, the release manager also needs the compilation toolchain. If that is cumbersome, just produce the packages on a CI pipeline; the release manager can then download, verify, sign and publish those if your workflow requires that. Otherwise just publish them from the CI as well!
Replacing gettext with Babel might alleviate some of this, but IMHO that should exclusively be a build-time dependency and never a run-time dependency, just like gettext. A lot has been going on in the packaging scene since Claude's PR, but now I'd depict this as a build-system requirement:
```
[build-system]
requires = ['setuptools>=40.8.0', 'babel>=2']
build-backend = 'setuptools.build_meta'
```
and then tell the build backend (not necessarily setuptools) to produce *.mo files when building a wheel distribution. Either gettext or Babel would be a requirement to build from either a Django checkout or a source distribution. This would be a better fit for PEP 517 and would require less documentation than reminding people to run compilemessages before installing or packaging Django, while tox could be responsible for producing *.mo files in the CI. But maybe this is an over-engineered idea.
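The build-time step itself can be sketched independently of any particular backend. In the sketch below, `compile_catalogs` and its `compile_one` argument are hypothetical names: `compile_one` stands for whatever turns .po bytes into .mo bytes (a wrapper around gettext's msgfmt, Babel, or anything else). How this would be hooked into a given build backend is left open, since that differs per backend.

```python
from pathlib import Path


def compile_catalogs(source_root, build_root, compile_one):
    """Mirror every .po under source_root as a .mo under build_root.

    compile_one is a caller-supplied callable taking .po bytes and
    returning .mo bytes; the source tree itself is never written to.
    """
    source_root, build_root = Path(source_root), Path(build_root)
    written = []
    for po_path in sorted(source_root.rglob("*.po")):
        # Keep the same relative layout, e.g. locale/fr/LC_MESSAGES/django.mo.
        mo_path = (build_root / po_path.relative_to(source_root)).with_suffix(".mo")
        mo_path.parent.mkdir(parents=True, exist_ok=True)
        mo_path.write_bytes(compile_one(po_path.read_bytes()))
        written.append(mo_path)
    return written
```

Writing into a separate build directory keeps the checkout free of generated files, so only the built wheel would carry the *.mo files.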
I have a sense this is not being addressed because of certain FUD, while overlooking real, recurring "mo and po files out of sync" issues across the whole Django ecosystem (see https://code.djangoproject.com/ticket/8732). Yes, contributors will have a new build-time dependency pushed on them if they expect their non-wheel installs to be localized. As it should always have been! Translators should be familiar with gettext anyway, irrespective of their platform.
That change would make it a bit more error-prone to work on i18n'd projects with the development version of Django.
I'm not saying we can't remove the .mo files, but we need to think about the consequences. They add some value.