Opened 8 years ago

Closed 6 years ago

Last modified 5 years ago

#27685 closed Cleanup/optimization (fixed)

Allow autoreloader to use watchman

Reported by: Aymeric Augustin Owned by: Tom Forbes
Component: Utilities Version: dev
Severity: Normal Keywords:
Cc: emorley@…, Tom Forbes Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by Aymeric Augustin)

I believe we should stop maintaining an autoreloader as soon as possible. Django’s autoreloader is annoyingly slow, highly inefficient, moderately well designed, and a gigantic pain to maintain. I’m more scared of django.utils.autoreload than of django.db.models.related before it was cleaned up.

I wish one day someone will take the time to write a good autoreloading dev server based on watchman. This would solve the problem discussed here because watchman watches all files in the current directory. The correct way to do this is to throw away the current design and start from scratch.

Watchman is smart enough to wait until you’ve finished a git operation to trigger a reload. Once such polished tech has become available, trying to compete with a thread that checks the mtime of all known files every second isn’t funny anymore. In fact it’s just sad.

from django-developers thread

The Future of the development server's auto-reloading thread has some ideas about this.

EDIT -- That description was initially enclosed in a <rant> tag on the mailing list. While I stand by the facts, the tone is more ironic that what I'd write in a bug report. Different context.

Change History (20)

comment:1 by Aymeric Augustin, 8 years ago

Description: modified (diff)

comment:2 by Ed Morley, 8 years ago

Cc: emorley@… added

comment:3 by Tom Forbes, 7 years ago

For reference, flask has quite a nice reloading implementation that uses Watchdog: https://github.com/pallets/werkzeug/blob/master/werkzeug/_reloader.py

Django obviously cannot bundle watchman as a Watchdog, so it will have to use a 'dumb' stat based implementation as it's base. While this is the lowest common denominator it is OK for tiny projects, we could detect when the number of watched files exceeds X and display a helpful message, like 'install the watchdog package for fun and profit'. This seems to be the easiest way forward to improve the reloader in general, the current implementation is in great need of some improvements.

I really like Watchman though (https://facebook.github.io/watchman/), a frontend frameworks that I use utilizes it. It has a socket API and runs as a service, so if it's available (there is an official API package) we could add this as a supported reloader.

The reloading code in general is a bit of a black box, but it shouldn't be. If a re-implementation is on the cards then maybe it could be pluggable by third party apps that want to do something when files change during development. One interesting use case could be triggering a page reload when template files change.

comment:4 by Tom Forbes, 7 years ago

Cc: Tom Forbes added

comment:5 by Tom Forbes, 7 years ago

Has patch: set

I have a potential start of a patch here: https://github.com/django/django/pull/8819

I think the first step is to refactor/re-write the autoreloader, and the next step is to add support for interesting things like watchman. The MR linked above is the start of the refactor.

comment:6 by Aymeric Augustin, 7 years ago

Tom: thanks for picking up the flag! I'm super happy to see this move forwards.

I read the description of your PR; it makes a lot of sense. Having a base class and several implementation (possibly depending on additional software) is an excellent starting point. It would be good to merge regardless of what comes next. I'll try to take a closer look in the coming days. Ping me if you haven't heard of me by mid-August.

My three questions are:

  • is the directory / glob pattern configuration method sufficiently generic to work with all backends we envision?
  • how does the signal-based design work? (I'm usually not a fan of signals, but surprisingly, I have rather positive feelings about this use case. I just want to make sure that's really the best solution.)
  • how much backwards incompatibility does this patch incur? (Larger backwards incompatibility need larger improvements to justify them.)

PS - In addition to the mailing list threads I linked above, I think the best summary I wrote is on the GSOC Ideas page.

comment:7 by Tom Forbes, 7 years ago

Hey Aymeric, thanks for the response!

  1. I believe so. Currently I'm envisioning the watchdog backend first, as it seems pretty easy to add. That supports watching individual directories without a glob, so we'd need to do some manual work to find out if any file changes match out glob expressions (which doesn't sound too difficult?). Services like watchman take a directory and a glob pattern so it does that for us. I can't think of any other more generic method than a (direcrtory, glob) tuple?
  1. When the autoreloader is ready to watch for events a autoreload_started signal is fired, with the autoreloader instance as it's sender. It knows when to send this signal by waiting for apps.ready in a while loop, which is a bit iffy IMO. Any handlers (like the one in i18n) are triggered and can use sender.watch(...) to subscribe to anything they want - in i18n's case it's .mo files in various places. When any file is changed a file_changed signal is sent with the path, and each handler can do whatever it pleases with that path. If any handler returns True then the server will not restart due to this change. I like this approach because it's just simple signals, and you don't care about any settings while registering them. If the debug server is not used (or autoreload is turned off) then no signals will be sent. Simple!
  1. I believe the autoreload module is private, but there are a few usages on GitHub: https://github.com/search?l=Python&q=django.utils.autoreload&type=Code&utf8=%E2%9C%93

A lot of these are similar to these:

https://github.com/JamesTing/ConnectNodes/blob/ba4f6a82416a81d31e51c00d5472ee48098388d8/后台/reload.py
https://github.com/ch3ll0v3k/Doc-Browser/blob/ea31f3c3ce82621f77311bcf536455cb16f137a3/pyinstaller/build/lib.linux-i686-2.7/PyInstaller/loader/rthooks/pyi_rth_django.py
https://github.com/tyru/homedir/blob/98166ffee204db6083bed644d6cb438d274e2635/bin.d/google_appengine/google/appengine/_internal/django/core/management/commands/runserver.py#L82

So to keep some semblance of compatibility we should:
Implement a code_changed function that just runs the current implementation (for people using uwsgi in development?)
Add a restart_with_reloader method (seems to be for pyinstaller compat)
Add a main() method.

None of these seem insurmountable IMO.

comment:8 by Tim Martin, 7 years ago

Patch needs improvement: set

comment:9 by Tom Forbes, 7 years ago

Owner: changed from nobody to Tom Forbes
Status: newassigned

comment:10 by Tom Forbes, 6 years ago

Patch needs improvement: unset

I've completed what I think is the bulk of the work for the first stage of this, refactoring the autoreloader to allow us to add support for Watchman in the future. After making the autoreloader more modular I've expanded the tests (including ones for the inotify reloader) and fleshed out the API for triggering arbitrary watches.

I think with this we can easily close #25624, #25791 and #28602 in the near future.

comment:11 by Tim Graham, 6 years ago

Patch needs improvement: set

comment:12 by Tom Forbes, 6 years ago

Patch needs improvement: unset

I've updated the PR considerably since last time. I've removed the entire pyinotify mess and replaced it with a concrete, working (at least on my machine!) watchman implementation.

I've also worked out a pretty decent way of testing the reloaders. If we structure the inner loop as a generator we can run individual 'ticks' in tests using next(). The first tick will set up the files to be watched. Then we modify them in some way and then trigger another tick. This should then pick up the modifications we have made and do the right thing.

I've used a metaclass to share 8 integration tests across the watchman and stat reloaders. Individually each class passes but when run as a whole module the watchman ones fail (perhaps due to the metaclass?), so I need to look into that.

The watchman service is pretty good at cleaning itself up, if we can get it installed on a CI worker I think it would be safe to just run in the background indefinitely?

comment:13 by Tim Graham, 6 years ago

Patch needs improvement: set

comment:14 by Tom Forbes, 6 years ago

Patch needs improvement: unset

comment:15 by Tim Graham, 6 years ago

Triage Stage: AcceptedReady for checkin

comment:16 by Tim Graham <timograham@…>, 6 years ago

Resolution: fixed
Status: assignedclosed

In c8720e76:

Fixed #27685 -- Added watchman support to the autoreloader.

Removed support for pyinotify (refs #9722).

comment:17 by Tim Graham <timograham@…>, 6 years ago

In 7331dd8a:

[2.2.x] Refs #27685 -- Removed "watchman unavailable" message.

Backport of 65ef5f467ba84c26392a157de1622d805401ec7d from master

comment:18 by Tim Graham <timograham@…>, 6 years ago

In 65ef5f46:

Refs #27685 -- Removed "watchman unavailable" message.

comment:19 by Ülgen Sarıkavak, 6 years ago

Is it possible to enable "watchman unavailable" errors back? I understand that watchman is not the default option and printing watchman related errors as default not a good experience but when i try to install/use watchman and something went wrong, there is no way to see what is happening. Maybe something like --force-watchman or --reloader=watchman can do the work.

Last edited 6 years ago by Ülgen Sarıkavak (previous) (diff)

comment:20 by Mariusz Felisiak <felisiak.mariusz@…>, 5 years ago

In 43f54e1:

Refs #27685 -- Logged unexpected Watchman autoreloader errors.

Note: See TracTickets for help on using tickets.
Back to Top