Opened 3 years ago

Closed 3 years ago

Last modified 3 years ago

#19634 closed Bug (fixed)

Broken __hash__ methods

Reported by: akaariai Owned by: aaugustin
Component: Python 3 Version: master
Severity: Normal Keywords: sprint2013
Cc: Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

We have some classes which defined __eq__ but not __hash__ before the py3 support. Python 3 doesn't allow these objects to be used in dicts, so the fix was to introduce:

__hash__ = object.__hash__

However, this is doing the wrong thing. The Python docs say that the only requirement is that objects which compare equal have the same hash value. However, if we define __eq__ so that it makes object with different id() values equal, then object.__hash__ is failing this requirement.

This isn't a regression - we have done the same thing implicitly in python 2 for ages. However, py3 introduced the requirement of defining __hash__ exactly to avoid this situation, so we should consider fixing the __hash__ functions somehow.

I am not sure what the implications of abusing object.__hash__ are.

The problematic cases seem to be:

> git grep '__hash__ = obj'
django/db/backends/__init__.py:    __hash__ = object.__hash__
django/db/models/fields/__init__.py:    __hash__ = object.__hash__
django/dispatch/saferef.py:    __hash__ = object.__hash__
django/test/html.py:    __hash__ = object.__hash__
django/utils/functional.py:        __hash__ = object.__hash__

Attachments (1)

#19634-broken_hash_methods.diff (2.3 KB) - added by regebro 3 years ago.

Download all attachments as: .zip

Change History (10)

comment:1 Changed 3 years ago by aaugustin

  • Component changed from Uncategorized to Python 3
  • Triage Stage changed from Unreviewed to Accepted

comment:2 Changed 3 years ago by regebro

  • Keywords sprint2013 added
  • Owner changed from nobody to regebro
  • Status changed from new to assigned

When "cheating" with the hash in this way values that should evaluate as equal will not do so when checking in dictionaries and sets:

>>> class FakeHash(object):
...    def __init__(self, value):
...        self.value = value
...    def __eq__(self, o):
...        return self.value == o.value
...    __hash__ = object.__hash__
... 
>>> f1 = FakeHash(5)
>>> f2 = FakeHash(5)
>>> f1 == f2
True
>>> d = {f1: 'value'}
>>> f2 in d
False
>>> s = {f1}
>>> f2 in s
False

It will work in tuples and lists though.

>>> l = [f1]
>>> f2 in l
True
>>> t = f1,
>>> f2 in t
True

Implementing a real hash in most cases listed above is easy:

>>> class RealHash(object):
...     def __init__(self,value):
...        self.value = value
...     def __eq__(self,o):
...        return self.value == o.value
...     def __hash__(self):
...        return hash(self.value)
... 
>>> r1 = RealHash(5)
>>> r2 = RealHash(5)
>>> r1 == r2
True
>>> dr = {r1: 'value'}
>>> r2 in dr
True
>>> s2 = {r1}
>>> r2 in s2
True

This is usually what you expect. I'll take a stab at this.

Changed 3 years ago by regebro

comment:3 follow-up: Changed 3 years ago by regebro

A patch that fixes most of these attached.

When it comes to BaseDatabaseWrapper, it's clear that you can have several DatabaseWrappers that has the same parameters but are different connections, so you can't implement hash in a way that won't break tests. That opens the question if there really should be a eq, but that's a different question.

comment:4 Changed 3 years ago by aaugustin

  • Owner changed from regebro to aaugustin

This looks interesting. I'll review that post-sprint.

comment:5 in reply to: ↑ 3 Changed 3 years ago by aaugustin

Replying to regebro:

When it comes to BaseDatabaseWrapper, it's clear that you can have several DatabaseWrappers that has the same parameters but are different connections, so you can't implement hash in a way that won't break tests. That opens the question if there really should be a eq, but that's a different question.

Its __eq__ method compares the alias attribute, which is the key in the DATABASES dictionary. Parameters aren't involved at all. What's wrong with returning hash(self.alias)?

comment:6 Changed 3 years ago by aaugustin

I found the answer to my question after running the tests:

======================================================================
FAIL: test_connections_thread_local (regressiontests.backends.tests.ThreadTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/myk/Documents/dev/django/tests/regressiontests/backends/tests.py", line 626, in test_connections_thread_local
    self.assertEqual(len(connections_set), 6)
AssertionError: 2 != 6

======================================================================
FAIL: test_default_connection_thread_local (regressiontests.backends.tests.ThreadTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/myk/Documents/dev/django/tests/regressiontests/backends/tests.py", line 599, in test_default_connection_thread_local
    3)
AssertionError: 2 != 3

I think these tests are poorly written and should be updated.

comment:7 Changed 3 years ago by akaariai

Yeah, rewriting those tests would be a good idea, they do cause problems somewhat often.

comment:8 Changed 3 years ago by Aymeric Augustin <aymeric.augustin@…>

  • Resolution set to fixed
  • Status changed from assigned to closed

In e76147a83a7306d6d83057b982207ddab37eb942:

Fixed #19634 -- Added proper hash methods.

Classes overriding eq need a hash such that equal objects have
the same hash.

Thanks akaariai for the report and regebro for the patch.

comment:9 Changed 3 years ago by Aymeric Augustin <aymeric.augustin@…>

In 702d39921cb95883d3b7590eadd65b03f76d0f92:

[1.5.x] Fixed #19634 -- Added proper hash methods.

Classes overriding eq need a hash such that equal objects have
the same hash.

Thanks akaariai for the report and regebro for the patch.

Backport of e76147a from master.

Note: See TracTickets for help on using tickets.
Back to Top