Add generic pre/post_modify signal
|Reported by:||akaariai||Owned by:||nobody|
|Component:||Database layer (models, ORM)||Version:||1.3|
|Cc:||anssi.kaariainen@…||Triage Stage:||Design decision needed|
|Has patch:||no||Needs documentation:||no|
|Needs tests:||no||Patch needs improvement:||no|
There might be some value in a new model signal: pre/post_modify. The signal would have six arguments:
- sender: the model class under DML.
- objs: an possibly empty iterable containing all the objects going to be modified. Accessing the iterable can trigger a query.
- action: one of "save","delete","update","bulk_create" or "raw_sql".
- action_args: this is the args passed to the save/delete/update/bulk_create command.
- raw: True if the signal fired because of fixture loading.
- db: The database alias under modification.
- Currently it is somewhat hard to do cache invalidation, Haystack search index rebuilding etc. especially if you need to also check related model changes. This would give you one point where you can do the invalidation/rebuilding.
- The current signals framework is only halfway complete: you can catch some modifications done by Django, but not all. The above would allow you to block some operations categorically, or actually do proper auditing in Python code.
- You could make a DB read-only in one simple step: a pre_modify signal raising "ReadOnlyDBError" for all senders.
There is one use-case which is problematic: bulk_insert. Here the objs are not going to have PK set, so this will cause problems for Haystack indexing just for an example. I think it would be possible to implement the signal in such a way that the caller could raise a "IDidItForYouException", or to modify the given objs iterator in a way that it removes some objects from the iterator. Both of these would make it possible to convert the bulk_insert into regular .save() operations if the caller absolutely needs that for one reason or another.
The biggest reason this signal would be nice is that this allows total data modifying control without overhead: for example .update() can't fire pre/post_save signals as that would be really costly for large batches. Now, if you pass the queryset being updated into the signal, and in addition the kwargs for .update() you could possibly schedule reindexing or do cache invalidtation by just fetching all the PK's from the queryset which isn't that expensive. Or you could do auditing by running another .update() from the signal. If there is nobody interested in the operation, the signal costs you almost nothing.
This obviously needs discussion at django-developers, but just now isn't the right time for that.
Change History (3)
comment:1 Changed 4 years ago by akaariai
- Cc anssi.kaariainen@… added
- Needs documentation unset
- Needs tests unset
- Patch needs improvement unset
comment:2 Changed 4 years ago by aaugustin
- Triage Stage changed from Unreviewed to Design decision needed