Opened 5 years ago

Last modified 5 years ago

#17096 new Cleanup/optimization

Strengthen the makemessages command's safe-guarding of po files

Reported by: Julien Phalip | Owned by: nobody
Component: Core (Management commands) | Version: 1.3
Severity: Normal | Keywords:
Cc: | Triage Stage: Accepted
Has patch: no | Needs documentation: no
Needs tests: no | Patch needs improvement: no
Easy pickings: no | UI/UX: no


It is common practice to set up a cron job to regularly run the makemessages command in order to update the 'django.po' file with the latest available translations. To limit the risk of data loss, the command first creates a temporary file, but then it simply copies the resulting content into the original file [1]. There still seems to be a risk of a race condition in which another process updates that file at the same time and overwrites the content written by the makemessages command, or vice versa.

I've discussed file-locking and other safe-guarding strategies with Alex Gaynor and Benjamin Peterson (Python core dev). Quoting Benjamin:

"On posix systems, you can simply use os.rename to replace translations
file with the temporary file; the operation is guaranteed to be
atomic, so any Django process accessing it will be guaranteed to see
either only the old translation file or the new one.

Windows is far more painful, since it has no guarantees of atomic
renaming. On Vista and after, there is a MoveFileTransacted. This has
the problem of not supporting older versions of Windows and not being
exposed directly by Python (though it is in pywin32). Usually then,
you have to do something in the writing process like creating a lock
file, writing to the main file, then deleting the lock file. The
reader then has to check to make sure there is no lock file before
commencing reading.

Unittesting is not so fun either. You can take the approach of
creating a bunch of readers and a bunch of writers and running them
against each other for a few seconds. You can also mock out the file
system calls and ensure that the writer and the reader are performing
operations in the correct sequence."
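
For illustration, here is a minimal sketch of the lock-file protocol Benjamin describes for platforms without atomic renames. The function names, the ".lock" suffix and the timeout values are all hypothetical and not part of Django. Note that this only narrows the window: a writer could still take the lock between the reader's check and its read.

```python
import os
import time


def write_with_lock(path, content, lock_suffix=".lock"):
    """Writer side: hold a lock file while the main file is rewritten."""
    lock_path = path + lock_suffix
    # O_CREAT | O_EXCL fails with FileExistsError if the lock file already
    # exists, so only one writer can hold the lock at a time.
    fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    try:
        with open(path, "w", encoding="utf-8") as f:
            f.write(content)
    finally:
        # Close before removing: Windows refuses to delete an open file.
        os.close(fd)
        os.remove(lock_path)


def read_when_unlocked(path, lock_suffix=".lock", timeout=5.0, poll=0.05):
    """Reader side: wait until no lock file is present, then read."""
    lock_path = path + lock_suffix
    deadline = time.monotonic() + timeout
    while os.path.exists(lock_path):
        if time.monotonic() >= deadline:
            raise TimeoutError("lock file still present: " + lock_path)
        time.sleep(poll)
    with open(path, encoding="utf-8") as f:
        return f.read()
```

A crashed writer would leave a stale lock file behind, which is one reason the rename-based approach is simpler where it is available.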

So, to strengthen the safe-guarding of po files, we could perhaps combine two measures: a) comparing the file's modification time before and after processing, in case another process has modified it in the meantime (if the times differ, the makemessages process should stop with a warning); and b) using os.rename to replace the destination file with the temporary file instead of writing directly into the destination file.
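
A rough sketch of how a) and b) could fit together, not a patch against the actual makemessages code. The names safe_update, build_content and ConcurrentModification are hypothetical; build_content stands in for whatever produces the merged po content. The sketch uses os.replace (Python 3.3+) rather than os.rename, because os.rename refuses to overwrite an existing destination on Windows. A small window remains between the timestamp check and the rename.

```python
import os
import tempfile


class ConcurrentModification(Exception):
    """Raised when the target changed while the new content was being built."""


def safe_update(path, build_content):
    """Rebuild `path` through a temporary file and swap it in with a rename.

    `build_content` is a hypothetical callable that takes the current text
    and returns the new text.
    """
    mtime_before = os.stat(path).st_mtime_ns
    with open(path, encoding="utf-8") as f:
        new_text = build_content(f.read())

    # Keep the temporary file in the same directory: a rename is only
    # atomic when source and destination are on the same filesystem.
    target_dir = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=target_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(new_text)
        # a) stop if another process modified the file in the meantime
        if os.stat(path).st_mtime_ns != mtime_before:
            raise ConcurrentModification(path)
        # b) swap the temporary file into place
        os.replace(tmp_path, path)
    except BaseException:
        # Do not leave the temporary file behind on failure.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

A caller would catch ConcurrentModification and print a warning instead of overwriting the file. One thing to watch: tempfile.mkstemp creates the file readable and writable only by its owner, so a real implementation would probably need to copy the original file's permissions before the rename.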

If implementing the same support for Windows appears to be too hard, then maybe we could leave it out and keep the current behavior on that platform, since the issue described here is most likely to occur in production environments and I've personally never heard of live Django sites deployed on Windows-based platforms.


Change History (1)

comment:1 Changed 5 years ago by Aymeric Augustin

Triage Stage: Unreviewed → Accepted

I agree that checking the timestamps and using os.rename are quick wins.

I don't think it's worth engineering an enterprise class test harness for this -- the code should be trivial.
