Django models __eq__ and __hash__ treat all unsaved model instances as identical
|Reported by:||fengb||Owned by:||nobody|
|Component:||Database layer (models, ORM)||Version:||1.4|
|Cc:||jdunck@…, Ben Davis||Triage Stage:||Accepted|
|Has patch:||no||Needs documentation:||no|
|Needs tests:||no||Patch needs improvement:||no|
When a model instance is still new and unsaved, it has no primary key.
Both __eq__ and __hash__ rely solely on _get_pk_val(), which means all unsaved model instances are treated as though they are identical.
b1 = models.Base()
b2 = models.Base()
b1 == b2  # True, although these are two distinct unsaved instances
hash(b1) == hash(b2)  # True, both reduce to hash(None)
For integrity, __eq__ should fall back to a Python object identity check when the primary key is None: if both instances are later saved, they will end up with different PKs, so they should not compare equal beforehand.
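A minimal sketch of the identity fallback described above, written against a plain Python class rather than Django itself. The class name Record and its pk attribute are hypothetical stand-ins for a model instance and its primary key; this is not the actual Django implementation.

```python
class Record:
    """Hypothetical stand-in for a model instance; pk stays None until saved."""

    def __init__(self, pk=None):
        self.pk = pk

    def __eq__(self, other):
        if not isinstance(other, Record):
            return NotImplemented
        if self.pk is None:
            # An unsaved instance is equal only to itself.
            return self is other
        # Saved instances compare by primary key, as before.
        return self.pk == other.pk

    # Note: defining __eq__ without __hash__ leaves instances unhashable
    # in Python 3; hashing is a separate concern and is not covered here.


a = Record()
b = Record()
print(a == a)                   # same object
print(a == b)                   # two distinct unsaved objects
print(Record(1) == Record(1))   # same primary key
```

With this check, two fresh instances no longer compare equal, while saved instances with the same primary key still do.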
For hashing, we can use the default Python (identity-based) hash for unsaved instances so that they do not all collide on hash(None). We would also need to cache the hash so that hash(instance) is the same before and after save, such that:
b = models.Base()
s = set()
s.add(b)
b.save()
b in s  # should still be True
This last case has an edge case that I'm not sure is solvable: the same row fetched again as a new instance.
b = models.Base.objects.get(id=b.id)
b in s
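The cached-hash idea and its edge case can be sketched in plain Python as follows. The class CachedHashRecord, its save(pk) method, and the _cached_hash attribute are hypothetical and only simulate a model being assigned a primary key; nothing here is Django's actual code.

```python
class CachedHashRecord:
    """Hypothetical stand-in for a model whose hash is cached on first use."""

    def __init__(self, pk=None):
        self.pk = pk
        self._cached_hash = None  # filled in lazily by __hash__

    def __eq__(self, other):
        if not isinstance(other, CachedHashRecord):
            return NotImplemented
        if self.pk is None:
            return self is other
        return self.pk == other.pk

    def __hash__(self):
        if self._cached_hash is None:
            if self.pk is None:
                # Unsaved: use the default identity-based hash.
                self._cached_hash = object.__hash__(self)
            else:
                self._cached_hash = hash(self.pk)
        return self._cached_hash

    def save(self, pk):
        # Simulates the database assigning a primary key.
        self.pk = pk


b = CachedHashRecord()
s = set()
s.add(b)              # hash is computed and cached while pk is None
hash_before = hash(b)
b.save(pk=1)
print(b in s)         # the cached hash keeps the lookup stable

# Edge case: a second object for the same row hashes by pk instead.
refetched = CachedHashRecord(pk=1)
print(refetched == b)
print(hash(refetched) == hash(1))
```

The re-fetched object compares equal to the original but hashes by its primary key, while the original keeps its cached identity hash. The two hashes will generally differ, so a set lookup with the re-fetched object is expected to miss, which is the edge case raised above.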