Opened 5 years ago

Closed 10 months ago

Last modified 10 months ago

#14394 closed Bug (fixed)

Assigning bad data to an m2m attribute should not clear existing data

Reported by: carljm Owned by: nobody
Component: Database layer (models, ORM) Version: master
Severity: Normal Keywords:
Cc: krzysiumed@… Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no


If you assign something nonsensical to a many-to-many attribute on a model instance, you'll get a TypeError; but not until after Django has cleared all the existing relationship data. Ideally Django would verify that you've passed in something reasonable before it clears the existing data.

(This issue was originally half of #14373).

Attached patch with test demonstrating the issue.

Attachments (3)

14394_r13984_testonly.diff (1.9 KB) - added by carljm 5 years ago.
validate_bad_data_assignment_to_m2m.patch (1.7 KB) - added by igors 5 years ago.
validate_bad_data_assignment_to_m2m_strictly.patch (6.3 KB) - added by igors 5 years ago.

Download all attachments as: .zip

Change History (17)

Changed 5 years ago by carljm

comment:1 Changed 5 years ago by PaulM

  • milestone set to 1.3
  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Triage Stage changed from Unreviewed to Accepted

This seems like reasonable behavior.

Changed 5 years ago by igors

comment:2 Changed 5 years ago by igors

  • Has patch set

comment:3 Changed 5 years ago by igors

  • Triage Stage changed from Accepted to Unreviewed

comment:4 Changed 5 years ago by PaulM

  • Triage Stage changed from Unreviewed to Accepted

Please don't change accepted tickets to unreviewed. These statuses have nothing to do with your patch.

comment:5 Changed 5 years ago by john_scott

The if not isinstance(value, (list, tuple)) check in the patch is not quite right. Any iterable (QuerySet, set, etc.) is also valid so long as it yields valid related model instances or primary keys.

comment:6 Changed 5 years ago by igors

Following jonh_scott's suggestion, I've attached a patch that validates the value more carefully. It has to be an iterable with valid models instances or primary keys.

comment:7 Changed 5 years ago by john_scott

That is certainly much more strict validation. I will leave it to people much smarter than I to decide whether it is worth the queries to test if each int is a valid pk. There is validation much further down the stack that tests this before saving to the db, just not before calling clear().

Since the add() method can handle an iterable of pks, perhaps just provide this rather than model instances since _add_items() just grabs these anyway:

    instance = manager.model.objects.get(pk=obj)
except (manager.model.DoesNotExist, ValueError):
    raise TypeError(u"Cannot set value to this ManyToManyField. Make sure you pass valid models or primary keys.")

comment:8 Changed 5 years ago by Alex

[14602] removed ManyToManyField.isValidIDList which was unused and did something a little different from what's proposed here, but I'm not it for posterity (and because Carl asked me to).

comment:9 Changed 4 years ago by julien

  • Severity set to Normal
  • Type set to Bug

comment:10 Changed 4 years ago by jacob

  • milestone 1.3 deleted

Milestone 1.3 deleted

comment:11 Changed 4 years ago by krzysiumed

  • Cc krzysiumed@… added
  • Easy pickings unset
  • Patch needs improvement set
  • UI/UX unset

The last patch seems to be almost ready-for-checkin, but I think we should not check if primary keys are valid (lines 743-746 in last patch), because it requires large amount of sql queries and it could be slow. Instead, we should check if primary key is int. Maybe we should add note in documentation that django does not check it?

In my opinion tests need small improvements:

  • In test_assigning_any_iterable_with_valid_models_to_m2m_works we don't need to check all possible iterables: sets, lists, tuples.
  • As I wrote, we should not validate primary keys, so we don't need test_assigning_iterable_with_invalid_pks_to_m2m_doesnt_clear_existing_relations.
  • We can join test_assigning_any_iterable_with_valid_models_to_m2m_works and test_assigning_any_iterable_with_valid_primary_keys_to_m2m_works by testing c1.tags = [t1,].
  • There are redundant lines of code (for example t1 = Tag.objects.create(name='t1')). We can move them to setUp.

comment:12 Changed 4 years ago by akaariai

The "lets iterate through the objects and fetch them one at a time" is a clear no-no. However, you can't assume a primary key is an int.

I really don't know what there is to do here. You are at a point where data validation should have prevented the errors. If you get an error, rollback your transaction. If you don't want to do that, use a savepoint.

This is not a unique situation in Django. For example, if you do a .save() on multitable inherited model, you can get it half saved if there is a data type mismatch in the topmost object in the inheritance chain. Something like:

class A:
class B(A):
    f = models.IntegerField()
B(f='not an integer').save()

do that and you got a save into A's table, but not into B's table.

Now, this is not a reason to avoid all sanity checks. But, aiming for anything like perfection (guaranteeing that the insert will succeed after the delete) isn't worth the effort in my opinion. Even the query for all the objects one by one will not be guaranteed to work under normal transactional rules. You would need something like select for update. Lets not go there.

comment:13 Changed 10 months ago by claudep

  • Resolution set to fixed
  • Status changed from new to closed

I think this issue has been addressed in current code, notably by bc9be72bdc9bb4dfc7f967ac3856115f0a6166b8. I'll still commit a test for confirmation.

comment:14 Changed 10 months ago by Claude Paroz <claude@…>

In 7ce9644d9347455ae6f9bd383788a65e4bcadda3:

Added a test to ensure bad assignation to M2M doesn't clear data

Refs #14394.

Note: See TracTickets for help on using tickets.
Back to Top