Opened 8 years ago

Closed 3 years ago

Last modified 3 years ago

#14394 closed Bug (fixed)

Assigning bad data to an m2m attribute should not clear existing data

Reported by: Carl Meyer Owned by: nobody
Component: Database layer (models, ORM) Version: master
Severity: Normal Keywords:
Cc: krzysiumed@… Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no


If you assign something nonsensical to a many-to-many attribute on a model instance, you'll get a TypeError; but not until after Django has cleared all the existing relationship data. Ideally Django would verify that you've passed in something reasonable before it clears the existing data.

(This issue was originally half of #14373).

Attached patch with test demonstrating the issue.

Attachments (3)

14394_r13984_testonly.diff (1.9 KB) - added by Carl Meyer 8 years ago.
validate_bad_data_assignment_to_m2m.patch (1.7 KB) - added by Igor Sobreira 8 years ago.
validate_bad_data_assignment_to_m2m_strictly.patch (6.3 KB) - added by Igor Sobreira 8 years ago.

Download all attachments as: .zip

Change History (17)

Changed 8 years ago by Carl Meyer

Attachment: 14394_r13984_testonly.diff added

comment:1 Changed 8 years ago by Paul McMillan

milestone: 1.3
Triage Stage: UnreviewedAccepted

This seems like reasonable behavior.

Changed 8 years ago by Igor Sobreira

comment:2 Changed 8 years ago by Igor Sobreira

Has patch: set

comment:3 Changed 8 years ago by Igor Sobreira

Triage Stage: AcceptedUnreviewed

comment:4 Changed 8 years ago by Paul McMillan

Triage Stage: UnreviewedAccepted

Please don't change accepted tickets to unreviewed. These statuses have nothing to do with your patch.

comment:5 Changed 8 years ago by John-Scott Atlakson

The if not isinstance(value, (list, tuple)) check in the patch is not quite right. Any iterable (QuerySet, set, etc.) is also valid so long as it yields valid related model instances or primary keys.

Changed 8 years ago by Igor Sobreira

comment:6 Changed 8 years ago by Igor Sobreira

Following jonh_scott's suggestion, I've attached a patch that validates the value more carefully. It has to be an iterable with valid models instances or primary keys.

comment:7 Changed 8 years ago by John-Scott Atlakson

That is certainly much more strict validation. I will leave it to people much smarter than I to decide whether it is worth the queries to test if each int is a valid pk. There is validation much further down the stack that tests this before saving to the db, just not before calling clear().

Since the add() method can handle an iterable of pks, perhaps just provide this rather than model instances since _add_items() just grabs these anyway:

    instance = manager.model.objects.get(pk=obj)
except (manager.model.DoesNotExist, ValueError):
    raise TypeError(u"Cannot set value to this ManyToManyField. Make sure you pass valid models or primary keys.")

comment:8 Changed 7 years ago by Alex Gaynor

[14602] removed ManyToManyField.isValidIDList which was unused and did something a little different from what's proposed here, but I'm not it for posterity (and because Carl asked me to).

comment:9 Changed 7 years ago by Julien Phalip

Severity: Normal
Type: Bug

comment:10 Changed 7 years ago by Jacob

milestone: 1.3

Milestone 1.3 deleted

comment:11 Changed 6 years ago by Christopher Medrela

Cc: krzysiumed@… added
Easy pickings: unset
Patch needs improvement: set
UI/UX: unset

The last patch seems to be almost ready-for-checkin, but I think we should not check if primary keys are valid (lines 743-746 in last patch), because it requires large amount of sql queries and it could be slow. Instead, we should check if primary key is int. Maybe we should add note in documentation that django does not check it?

In my opinion tests need small improvements:

  • In test_assigning_any_iterable_with_valid_models_to_m2m_works we don't need to check all possible iterables: sets, lists, tuples.
  • As I wrote, we should not validate primary keys, so we don't need test_assigning_iterable_with_invalid_pks_to_m2m_doesnt_clear_existing_relations.
  • We can join test_assigning_any_iterable_with_valid_models_to_m2m_works and test_assigning_any_iterable_with_valid_primary_keys_to_m2m_works by testing c1.tags = [t1,].
  • There are redundant lines of code (for example t1 = Tag.objects.create(name='t1')). We can move them to setUp.

comment:12 Changed 6 years ago by Anssi Kääriäinen

The "lets iterate through the objects and fetch them one at a time" is a clear no-no. However, you can't assume a primary key is an int.

I really don't know what there is to do here. You are at a point where data validation should have prevented the errors. If you get an error, rollback your transaction. If you don't want to do that, use a savepoint.

This is not a unique situation in Django. For example, if you do a .save() on multitable inherited model, you can get it half saved if there is a data type mismatch in the topmost object in the inheritance chain. Something like:

class A:
class B(A):
    f = models.IntegerField()
B(f='not an integer').save()

do that and you got a save into A's table, but not into B's table.

Now, this is not a reason to avoid all sanity checks. But, aiming for anything like perfection (guaranteeing that the insert will succeed after the delete) isn't worth the effort in my opinion. Even the query for all the objects one by one will not be guaranteed to work under normal transactional rules. You would need something like select for update. Lets not go there.

comment:13 Changed 3 years ago by Claude Paroz

Resolution: fixed
Status: newclosed

I think this issue has been addressed in current code, notably by bc9be72bdc9bb4dfc7f967ac3856115f0a6166b8. I'll still commit a test for confirmation.

comment:14 Changed 3 years ago by Claude Paroz <claude@…>

In 7ce9644d9347455ae6f9bd383788a65e4bcadda3:

Added a test to ensure bad assignation to M2M doesn't clear data

Refs #14394.

Note: See TracTickets for help on using tickets.
Back to Top