#21169 closed Bug (fixed)
Deletion in custom reverse managers
Reported by: | Sebastian Goll | Owned by: | loic84 |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | dev |
Severity: | Release blocker | Keywords: | |
Cc: | Triage Stage: | Accepted | |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
The newly implemented custom reverse managers (cf. #3871) expose potentially serious behavior when remove()
and clear()
are called: instead of limiting the effects of the removal to the objects considered related by the chosen custom manager, the methods fall back to the unrestricted behavior of the default manager, removing more objects from the relation than is desired.
I'm attaching a test which illustrates the problem for reverse foreign keys with custom managers. I suppose the same behavior is present when using generic foreign keys as well as for the forward and backward sides of many-to-many relations.
Attachments (1)
Change History (17)
by , 11 years ago
Attachment: | test_related_manager_fk_removal.diff added |
---|
comment:1 by , 11 years ago
comment:2 by , 11 years ago
I think I am going to revert #3871, we can then improve the patch and recommit. Of course, if we can solve this in agreed upon way really soon there is no need to revert + repatch.
comment:3 by , 11 years ago
Unfortunately, I am not entirely sure how to proceed here. When I wrote my patch proposal for #3871, I remember it wasn't possible to make remove()
and many-to-many's clear()
work as expected without any additional overhead (necessary to make sure so that we do not remove more objects from the relation as intended).
That is the reason why I decided to entirely forbid remove()
and many-to-many's clear()
when their outcome would be unexpected. Is this a reasonable approach here, as well? Ideally, providing these methods would be great but are they even possible without issuing another query (i.e., to make sure that the object removed is actually part of the collection in case of remove()
)?
Due to the way it is implemented, foreign-key's clear()
already works as expected. So the problem lies only in foreign-key's and many-to-many's remove()
and many-to-many's clear()
.
comment:4 by , 11 years ago
Triage Stage: | Unreviewed → Accepted |
---|
Hmmh, lets just go for "forbid operations" approach. These methods can be added back later on if we can fix them in clean way.
Actually, lets go for "no related manager methods" approach. You can also do set which could clear existing instances outside the manager's viewable set, and .add() could add something that isn't viewable by the manager.
comment:5 by , 11 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
The problem is not limited to the __call__
syntax, the same would happen to default managers.
I'm working on a fix.
comment:6 by , 11 years ago
I've reworked M2M RelatedManager.clear()
to account for the default filtering through a subquery.
https://github.com/django/django/pull/1685
As previously noted, GFK and FK RelatedManager.clear()
weren't affected by the issue.
Also as discussed on IRC, remove()
seems less critical since you specify the objects manually, so we might get away with a warning in the docs.
comment:7 by , 11 years ago
Just a note to those who try to fix release blockers for 1.6: the release blocker is for master, not 1.6.
comment:8 by , 11 years ago
I did some work to make sure queries used by m2m clear() and remove() are as simple as possible. Now the clearing queries do not use subqueries when the manager doesn't have filters. The subqueries in these cases didn't actually filter anything out. In addition, when subqueries are used, the subquery has one less join than before. End result is that the queries are similar to current master when the base manager doesn't have filters, and a bit more efficient that current HEAD of pull 1685 when there are filters. The tradeoff is even more complexity in m2m manager implementation.
To me it seems important that as simple as possible queries are used. This ensures as little DB resources are used as possible. In addition I believe old versions of MySQL will simply panic when they see a query with both a subquery and joins...
Patch at https://github.com/akaariai/django/commit/9dd75213050e6f85a0b93ddd59243f8d3809b06c - this doesn't have any tests, and regressions in query efficiency seem likely. I wonder if CaptureQueriesContext could be used to inspect the used queries.
BTW I spotted one possible data-corruption issue in current clear() code for symmetric m2m. Currently the clear is implemented as two separate delete queries which delete all items from the relation, first in one direction, then in another. But, if a concurrent add() commits in between then the added relation is deleted in one direction, but not in anohter. The sequence is like this for transactions T1 and T2:
T1: add relation for instances m1 <-> m2. Two rows are added so that the relation is symmetrical (rows m1:m2 and m2:m1). T2: m1.clear() starts. First, delete all rows where m1:*. m1:m2 isn't deleted as that isn't yet visible. T1: commits, m1:m2 and m2:m1 become visible to T2. T2: Second part of clear is executed, that is *:m1 is deleted. Now m2:m1 is visible and it is deleted. T2: commits. m1:m2 relation exists, but m2:m1 doesn't. The symmetry of m1<->m2 is broken.
Luckily this seems rare in practice. pull 1685 does the delete in one go, so this issue is fixed in PR1685. This seems to be impossible to test in a way that the regression test actually picks up regressions. So, no tests needed for this case.
comment:9 by , 11 years ago
Has patch: | set |
---|---|
Needs documentation: | set |
For the record I'm going to list all the changes and issues addressed by this ticket and #21174 since they are related.
The culprits: the clear()
and remove()
methods of the M2M, FK, and GFK related managers, so six methods in total.
Issues:
- All methods except
FK.clear()
completely bypassed default filtering (i.e. customManager.get_queryset()
with filters). - All methods except
FK.clear()
caused multiple consecutive db operations run without transaction. (see #21174) - Inconsistency between the implementation of
FK.remove()
andFK.clear()
, in particular wrt emitted signals. M2M.clean()
andM2M.remove()
were prone to race conditions for symmetrical relations. (see #comment:8)
List of changes:
- All methods now respect default filtering, which means you can't delete or change something that is filtered out.
- All methods now run in a single query, which addresses all transaction concerns.
FK.remove()
changed from successivesave()
calls, to a singleupdate()
.
Gotcha:
pre_save
andpost_save
signals aren't fired anymore.
M2M.remove()
andM2M.clear()
rely on subqueries when filtering is involved.
- Symmetrical
M2M.remove()
andM2M.clear()
now run in a single query.
GFK.remove()
andGFK.clear()
now perform bulk delete.
Gotcha:
Model.delete()
is not called anymore.
comment:10 by , 11 years ago
Needs documentation: | unset |
---|
comment:11 by , 11 years ago
The plan is to introduce pre/post update signals, and after that merge the changes from this ticket in.
comment:12 by , 11 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
In my patch proposal patch
r17204-custom-reverse-managers.diff
in #3871, I considered the problem and then took the easy way out by simply disallowing theremove()
andclear()
methods on custom reverse managers. Obviously, this solution is far from perfect.