Opened 9 months ago

Last modified 8 weeks ago

#27685 new Cleanup/optimization

Allow autoreloader to use watchman

Reported by: Aymeric Augustin Owned by: nobody
Component: Utilities Version: master
Severity: Normal Keywords:
Cc: emorley@…, Tom Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by Aymeric Augustin)

I believe we should stop maintaining an autoreloader as soon as possible. Django’s autoreloader is annoyingly slow, highly inefficient, moderately well designed, and a gigantic pain to maintain. I’m more scared of django.utils.autoreload than of django.db.models.related before it was cleaned up.

I wish one day someone will take the time to write a good autoreloading dev server based on watchman. This would solve the problem discussed here because watchman watches all files in the current directory. The correct way to do this is to throw away the current design and start from scratch.

Watchman is smart enough to wait until you’ve finished a git operation to trigger a reload. Once such polished tech has become available, trying to compete with a thread that checks the mtime of all known files every second isn’t funny anymore. In fact it’s just sad.

from django-developers thread

The Future of the development server's auto-reloading thread has some ideas about this.

EDIT -- That description was initially enclosed in a <rant> tag on the mailing list. While I stand by the facts, the tone is more ironic that what I'd write in a bug report. Different context.

Change History (7)

comment:1 Changed 9 months ago by Aymeric Augustin

Description: modified (diff)

comment:2 Changed 9 months ago by Ed Morley

Cc: emorley@… added

comment:3 Changed 2 months ago by Tom

For reference, flask has quite a nice reloading implementation that uses Watchdog: https://github.com/pallets/werkzeug/blob/master/werkzeug/_reloader.py

Django obviously cannot bundle watchman as a Watchdog, so it will have to use a 'dumb' stat based implementation as it's base. While this is the lowest common denominator it is OK for tiny projects, we could detect when the number of watched files exceeds X and display a helpful message, like 'install the watchdog package for fun and profit'. This seems to be the easiest way forward to improve the reloader in general, the current implementation is in great need of some improvements.

I really like Watchman though (https://facebook.github.io/watchman/), a frontend frameworks that I use utilizes it. It has a socket API and runs as a service, so if it's available (there is an official API package) we could add this as a supported reloader.

The reloading code in general is a bit of a black box, but it shouldn't be. If a re-implementation is on the cards then maybe it could be pluggable by third party apps that want to do something when files change during development. One interesting use case could be triggering a page reload when template files change.

comment:4 Changed 2 months ago by Tom

Cc: Tom added

comment:5 Changed 2 months ago by Tom

Has patch: set

I have a potential start of a patch here: https://github.com/django/django/pull/8819

I think the first step is to refactor/re-write the autoreloader, and the next step is to add support for interesting things like watchman. The MR linked above is the start of the refactor.

comment:6 Changed 8 weeks ago by Aymeric Augustin

Tom: thanks for picking up the flag! I'm super happy to see this move forwards.

I read the description of your PR; it makes a lot of sense. Having a base class and several implementation (possibly depending on additional software) is an excellent starting point. It would be good to merge regardless of what comes next. I'll try to take a closer look in the coming days. Ping me if you haven't heard of me by mid-August.

My three questions are:

  • is the directory / glob pattern configuration method sufficiently generic to work with all backends we envision?
  • how does the signal-based design work? (I'm usually not a fan of signals, but surprisingly, I have rather positive feelings about this use case. I just want to make sure that's really the best solution.)
  • how much backwards incompatibility does this patch incur? (Larger backwards incompatibility need larger improvements to justify them.)

PS - In addition to the mailing list threads I linked above, I think the best summary I wrote is on the GSOC Ideas page.

comment:7 Changed 8 weeks ago by Tom

Hey Aymeric, thanks for the response!

  1. I believe so. Currently I'm envisioning the watchdog backend first, as it seems pretty easy to add. That supports watching individual directories without a glob, so we'd need to do some manual work to find out if any file changes match out glob expressions (which doesn't sound too difficult?). Services like watchman take a directory and a glob pattern so it does that for us. I can't think of any other more generic method than a (direcrtory, glob) tuple?
  1. When the autoreloader is ready to watch for events a autoreload_started signal is fired, with the autoreloader instance as it's sender. It knows when to send this signal by waiting for apps.ready in a while loop, which is a bit iffy IMO. Any handlers (like the one in i18n) are triggered and can use sender.watch(...) to subscribe to anything they want - in i18n's case it's .mo files in various places. When any file is changed a file_changed signal is sent with the path, and each handler can do whatever it pleases with that path. If any handler returns True then the server will not restart due to this change. I like this approach because it's just simple signals, and you don't care about any settings while registering them. If the debug server is not used (or autoreload is turned off) then no signals will be sent. Simple!
  1. I believe the autoreload module is private, but there are a few usages on GitHub: https://github.com/search?l=Python&q=django.utils.autoreload&type=Code&utf8=%E2%9C%93

A lot of these are similar to these:

https://github.com/JamesTing/ConnectNodes/blob/ba4f6a82416a81d31e51c00d5472ee48098388d8/后台/reload.py
https://github.com/ch3ll0v3k/Doc-Browser/blob/ea31f3c3ce82621f77311bcf536455cb16f137a3/pyinstaller/build/lib.linux-i686-2.7/PyInstaller/loader/rthooks/pyi_rth_django.py
https://github.com/tyru/homedir/blob/98166ffee204db6083bed644d6cb438d274e2635/bin.d/google_appengine/google/appengine/_internal/django/core/management/commands/runserver.py#L82

So to keep some semblance of compatibility we should:
Implement a code_changed function that just runs the current implementation (for people using uwsgi in development?)
Add a restart_with_reloader method (seems to be for pyinstaller compat)
Add a main() method.

None of these seem insurmountable IMO.

Note: See TracTickets for help on using tickets.
Back to Top