Model._do_update not always consistent with the precise meaning of returned value
|Reported by:||knaperek||Owned by:||knaperek|
|Component:||Database layer (models, ORM)||Version:||master|
|Severity:||Normal||Keywords:||select_on_save concurrency race-condition|
|Cc:||knaperek||Triage Stage:||Ready for checkin|
|Has patch:||yes||Needs documentation:||no|
|Needs tests:||no||Patch needs improvement:||no|
This is a bit obscure corner-case bug which might seem unimportant, but the solution is easy and actually only makes the code simpler (removes one nested if) and more consistent.
When select_on_save is set, Model._do_update() ignores the result of Model._update() if it learnt that the object exists in a previous DB call. That last thing is crucial, as the object could have already been deleted by concurrent worker just in-between these two ORM calls (exists and _update). In such case, the _update() call will fail, resulting in nothing being actually updated. However, _do_update() method will currently return True regardless of _update() result, which is inconsistent with the common (and expected) behavior, and also in contrast with the method's docs (saying: If the model was updated (in the sense that an update query was done and a matching row was found from the DB) the method will return True.)
As a consequence, object will not be saved into the database, because _save_table() won't call _do_insert(), being misled by _do_update(). Now you may oppose that this does not really matter that much, since it has been caused by concurrent modifications of the same data, which is considered non-deterministic anyway (so noone has the right to complain that the object is not there at the end). However, here's one example that vouches for this bug to be fixed: if you happen to have a trigger in your DB counting the number of times an object was changed (updated/inserted), you'd be missing one (which is no longer justifiable by concurrent save() and delete() operations).
As it sometimes happens, it is IMHO impossible to create a unit test for this bug, as it is too low level thing, exploitable only in a concurrent environment, and with some (bad) luck. I only found the bug when studying the code. So that's why I tried to provide a solid explanation, believing the fix I'm about to make should be obvious and mergable without worries.
Change History (8)
comment:1 Changed 15 months ago by knaperek
- Needs documentation unset
- Needs tests unset
- Owner changed from nobody to knaperek
- Patch needs improvement unset
- Status changed from new to assigned
comment:5 Changed 15 months ago by knaperek
- Keywords select_on_save concurrency race-condition added
- Version changed from 1.6 to master
comment:7 Changed 11 months ago by timgraham
- Triage Stage changed from Accepted to Ready for checkin