Opened 11 years ago

Closed 11 years ago

Last modified 11 years ago

#21169 closed Bug (fixed)

Deletion in custom reverse managers

Reported by: Sebastian Goll Owned by: loic84
Component: Database layer (models, ORM) Version: dev
Severity: Release blocker Keywords:
Cc: Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

The newly implemented custom reverse managers (cf. #3871) expose potentially serious behavior when remove() and clear() are called: instead of limiting the effects of the removal to the objects considered related by the chosen custom manager, the methods fall back to the unrestricted behavior of the default manager, removing more objects from the relation than is desired.

I'm attaching a test which illustrates the problem for reverse foreign keys with custom managers. I suppose the same behavior is present when using generic foreign keys as well as for the forward and backward sides of many-to-many relations.

Attachments (1)

test_related_manager_fk_removal.diff (1.9 KB ) - added by Sebastian Goll 11 years ago.

Download all attachments as: .zip

Change History (17)

by Sebastian Goll, 11 years ago

comment:1 by Sebastian Goll, 11 years ago

In my patch proposal patch r17204-custom-reverse-managers.diff in #3871, I considered the problem and then took the easy way out by simply disallowing the remove() and clear() methods on custom reverse managers. Obviously, this solution is far from perfect.

comment:2 by Anssi Kääriäinen, 11 years ago

I think I am going to revert #3871, we can then improve the patch and recommit. Of course, if we can solve this in agreed upon way really soon there is no need to revert + repatch.

comment:3 by Sebastian Goll, 11 years ago

Unfortunately, I am not entirely sure how to proceed here. When I wrote my patch proposal for #3871, I remember it wasn't possible to make remove() and many-to-many's clear() work as expected without any additional overhead (necessary to make sure so that we do not remove more objects from the relation as intended).

That is the reason why I decided to entirely forbid remove() and many-to-many's clear() when their outcome would be unexpected. Is this a reasonable approach here, as well? Ideally, providing these methods would be great but are they even possible without issuing another query (i.e., to make sure that the object removed is actually part of the collection in case of remove())?

Due to the way it is implemented, foreign-key's clear() already works as expected. So the problem lies only in foreign-key's and many-to-many's remove() and many-to-many's clear().

comment:4 by Anssi Kääriäinen, 11 years ago

Triage Stage: UnreviewedAccepted

Hmmh, lets just go for "forbid operations" approach. These methods can be added back later on if we can fix them in clean way.

Actually, lets go for "no related manager methods" approach. You can also do set which could clear existing instances outside the manager's viewable set, and .add() could add something that isn't viewable by the manager.

comment:5 by loic84, 11 years ago

Owner: changed from nobody to loic84
Status: newassigned

The problem is not limited to the __call__ syntax, the same would happen to default managers.

I'm working on a fix.

comment:6 by loic84, 11 years ago

I've reworked M2M RelatedManager.clear() to account for the default filtering through a subquery.

https://github.com/django/django/pull/1685

As previously noted, GFK and FK RelatedManager.clear() weren't affected by the issue.

Also as discussed on IRC, remove() seems less critical since you specify the objects manually, so we might get away with a warning in the docs.

comment:7 by Anssi Kääriäinen, 11 years ago

Just a note to those who try to fix release blockers for 1.6: the release blocker is for master, not 1.6.

comment:8 by Anssi Kääriäinen, 11 years ago

I did some work to make sure queries used by m2m clear() and remove() are as simple as possible. Now the clearing queries do not use subqueries when the manager doesn't have filters. The subqueries in these cases didn't actually filter anything out. In addition, when subqueries are used, the subquery has one less join than before. End result is that the queries are similar to current master when the base manager doesn't have filters, and a bit more efficient that current HEAD of pull 1685 when there are filters. The tradeoff is even more complexity in m2m manager implementation.

To me it seems important that as simple as possible queries are used. This ensures as little DB resources are used as possible. In addition I believe old versions of MySQL will simply panic when they see a query with both a subquery and joins...

Patch at https://github.com/akaariai/django/commit/9dd75213050e6f85a0b93ddd59243f8d3809b06c - this doesn't have any tests, and regressions in query efficiency seem likely. I wonder if CaptureQueriesContext could be used to inspect the used queries.

BTW I spotted one possible data-corruption issue in current clear() code for symmetric m2m. Currently the clear is implemented as two separate delete queries which delete all items from the relation, first in one direction, then in another. But, if a concurrent add() commits in between then the added relation is deleted in one direction, but not in anohter. The sequence is like this for transactions T1 and T2:

T1: add relation for instances m1 <-> m2. Two rows are added so that the
    relation is symmetrical (rows m1:m2 and m2:m1).
T2: m1.clear() starts. First, delete all rows where m1:*. m1:m2 isn't
    deleted as that isn't yet visible.
T1: commits, m1:m2 and m2:m1 become visible to T2.
T2: Second part of clear is executed, that is *:m1 is deleted. Now m2:m1
    is visible and it is deleted.
T2: commits. m1:m2 relation exists, but m2:m1 doesn't. The symmetry of
    m1<->m2 is broken.

Luckily this seems rare in practice. pull 1685 does the delete in one go, so this issue is fixed in PR1685. This seems to be impossible to test in a way that the regression test actually picks up regressions. So, no tests needed for this case.

comment:9 by loic84, 11 years ago

Has patch: set
Needs documentation: set

For the record I'm going to list all the changes and issues addressed by this ticket and #21174 since they are related.

The culprits: the clear() and remove() methods of the M2M, FK, and GFK related managers, so six methods in total.

Issues:

  • All methods except FK.clear() completely bypassed default filtering (i.e. custom Manager.get_queryset() with filters).
  • All methods except FK.clear() caused multiple consecutive db operations run without transaction. (see #21174)
  • Inconsistency between the implementation of FK.remove() and FK.clear(), in particular wrt emitted signals.
  • M2M.clean() and M2M.remove() were prone to race conditions for symmetrical relations. (see #comment:8)

List of changes:

  • All methods now respect default filtering, which means you can't delete or change something that is filtered out.
  • All methods now run in a single query, which addresses all transaction concerns.
  • FK.remove() changed from successive save() calls, to a single update().

Gotcha: pre_save and post_save signals aren't fired anymore.

  • M2M.remove() and M2M.clear() rely on subqueries when filtering is involved.
  • Symmetrical M2M.remove() and M2M.clear() now run in a single query.
  • GFK.remove() and GFK.clear() now perform bulk delete.

Gotcha: Model.delete() is not called anymore.

comment:10 by loic84, 11 years ago

Needs documentation: unset

comment:11 by Anssi Kääriäinen, 11 years ago

The plan is to introduce pre/post update signals, and after that merge the changes from this ticket in.

comment:12 by Anssi Kääriäinen <akaariai@…>, 11 years ago

Resolution: fixed
Status: assignedclosed

In 17c3997f6828e88e4646071a8187c1318b65597d:

Fixed #21169 -- Reworked RelatedManager methods use default filtering

The remove() and clear() methods of the related managers created by
ForeignKey, GenericForeignKey, and ManyToManyField suffered from a
number of issues. Some operations ran multiple data modifying queries without
wrapping them in a transaction, and some operations didn't respect default
filtering when it was present (i.e. when the default manager on the related
model implemented a custom get_queryset()).

Fixing the issues introduced some backward incompatible changes:

  • The implementation of remove() for ForeignKey related managers changed from a series of Model.save() calls to a single QuerySet.update() call. The change means that pre_save and post_save signals aren't called anymore.
  • The remove() and clear() methods for GenericForeignKey related managers now perform bulk delete so Model.delete() isn't called anymore.
  • The remove() and clear() methods for ManyToManyField related managers perform nested queries when filtering is involved, which may or may not be an issue depending on the database and the data itself.

Refs. #3871, #21174.

Thanks Anssi Kääriäinen and Tim Graham for the reviews.

comment:13 by Anssi Kääriäinen <akaariai@…>, 11 years ago

In 52015b963d263642957aec880b52ad4063b484cd:

Used simpler queries for m2m clearing when possible.

Refs #21169

comment:14 by Anssi Kääriäinen <akaariai@…>, 11 years ago

In f450bc9f44bc1270eae911daa157c155c29d1d9d:

Added a bulk option to RelatedManager remove() and clear() methods

Refs #21169

comment:15 by Anssi Kääriäinen <akaariai@…>, 11 years ago

comment:16 by Loic Bistuer <loic.bistuer@…>, 11 years ago

In 91fce675a453764cb233e53c0460826600c828fa:

Use 'update_fields' in RelatedManager.clear() when bulk=False.

Thanks Simon Charette for the suggestion.

Refs #21169.

Note: See TracTickets for help on using tickets.
Back to Top