Opened 4 years ago

Closed 4 years ago

Last modified 2 years ago

#14373 closed (duplicate)

annotate() will gladly delete your data

Reported by: patrys Owned by: carljm
Component: Database layer (models, ORM) Version: 1.2
Severity: Keywords:
Cc: Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: UI/UX:

Description

Consider the following example:

class Foo(models.Model):
    name = models.CharField(max_length=100)

class Bar(models.Model):
    name = models.CharField(max_length=100)
    foos = models.ManyToManyField(Foo, related_name='bars')

Create your database, fill it with important data, then do the following:

bars = Bar.objects.all().annotate(foos=Sum('foos'))

Now all your data connections are gone. Yay.

What happens is that annotate gladly accepts "foos" even if that attribute name is already taken. It then retrieves all the objects from the result set and proceeds to destroy your data by assigning aggregated values to your precious related manager. The manager then happily swallows the integer it receives and goes to delete all the relations.

Possible fixes:

  • Make annotate call delattr before calling setattr
  • Make annotate raise an exception when passed an already existing attribute

Also:

  • Make the many-to-many manager think for a while before it accepts 5 or any other integer as the new relation set

Attachments (1)

14373_r13978_initial_tests.diff (2.7 KB) - added by carljm 4 years ago.

Download all attachments as: .zip

Change History (13)

comment:1 Changed 4 years ago by patrys

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset

In case it was not obvious, the example was supposed to say Count, not Sum.

comment:2 Changed 4 years ago by lukeplant

  • milestone set to 1.3
  • Triage Stage changed from Unreviewed to Accepted

comment:3 Changed 4 years ago by carljm

  • Owner set to carljm

Putting together a patch for this. My initial take is that both fixes should be made:

  1. m2m attributes should be pickier about what can be assigned to them, and complain rather than silently clear if they get something nonsensical (need to dig in more to define what exactly nonsensical means, initially it looks like "iterables only, either of the appropriate model instance or a valid PK");
  1. annotate/aggregate should complain if you tell them to overwrite the value of an existing field. At least, I don't think this should be allowed; it's theoretically possible that someone might have a usecase for it and be depending on it. The other option would be to only disallow it if it conflicts with an m2m name, since that's the case where it can immediately destroy data. But overwriting any field's value with the result of annotate/aggregate could inadvertently destroy data if the returned instance is later saved, so I'm at least initially in favor of just making that impossible.

I'm thinking I'll do two separate patches here for the above two issues, since they aren't dependent on each other. (Perhaps that means this should be two tickets? I'll leave it as one for now.)

Changed 4 years ago by carljm

comment:4 follow-up: Changed 4 years ago by carljm

Ok, out of time for this at the moment; uploaded my initial tests. So far I haven't been able to reproduce the reported issue on trunk: the new aggregation_regress test passes. Any further detail from patrys or Luke (did you reproduce this on trunk when you accepted it?) on what I might be doing wrong in that test, or what other factor might be involved, would be welcome.

The other m2m_regress test that I attached does fail. The test verifies that a TypeError is in fact raised when you assign a non-sequence to an m2m attribute; but the existing data is nonetheless cleared. Ideally I would say the argument should be checked before existing data is cleared, but that really belongs in its own ticket.

comment:6 Changed 4 years ago by Alex

Seems related to #11256.

comment:7 in reply to: ↑ 4 Changed 4 years ago by lukeplant

Replying to carljm:

Any further detail from patrys or Luke (did you reproduce this on trunk when you accepted it?) on what I might be doing wrong in that test, or what other factor might be involved, would be welcome.

Yes, I did reproduce it with an appropriate model I had to hand. I look further at this tomorrow.

comment:8 Changed 4 years ago by carljm

One possible difference is that I only ran this test on Postgres and SQLite; I noticed that PG at least was returning no results on the query, which would eliminate any possibility of data-overwriting. It may be that this same test would fail on MySQL? Haven't gotten back to it to test that yet.

comment:9 Changed 4 years ago by patrys

I only test with sqlite and pg and could reproduce with both in 1.2 (although I used slightly more complex queries when I discovered it).

comment:10 Changed 4 years ago by carljm

  • Resolution set to duplicate
  • Status changed from new to closed

Closing this as duplicate of #11256. The possible data-loss aspect is new here, but the core issue and the fix are the same, and that ticket already has a patch.

There's also the question of m2m-assignment behavior, but that's not aggregation-related; filed a separate ticket #14394.

comment:11 Changed 3 years ago by jacob

  • milestone 1.3 deleted

Milestone 1.3 deleted

comment:12 Changed 2 years ago by akaariai

  • Component changed from ORM aggregation to Database layer (models, ORM)
Note: See TracTickets for help on using tickets.
Back to Top