3

Couldn't find the details of this anywhere online, when comparing two frozensets does Python iterate through the elements in one of the sets or does it check the hash values of the frozensets since frozensets are hashable?

SW Williams
  • 427
  • 1
  • 4
  • 14

2 Answers2

3

Since the reference docs don't say anything about this, it's implementation-dependent, so there is no answer short of looking at the source code for the version of Python you're using (in your CPython distribution's Objects/setobject.c). Looking at the source for Python 3.7.0, the answer is "maybe" ;-)

Equality first checks whether the frozensets have the same size (len()). If not, they can't be equal, so False is returned at once.

Else the hash codes are compared if they've already been computed. If they have already been computed, then False is returned at once if the hash codes aren't equal. Else element-by-element code is invoked to check whether one is a subset of the other.

A hash code for a frozenset isn't computed just for the heck of it - that would be an expense that may not pay off. So something has to force it. The primary use case for frozensets at the start was to allow sets of sets, and in that context hash codes will be computed as a normal part of adding a frozenset to a containing set. The C-level set implementation contains a slot to record the hash if and when it's computed, which is initialized to -1 (a reserved value that means "no hash code known" internally).

Tim Peters
  • 55,793
  • 10
  • 105
  • 118
1

hash(x) == hash(y) does not imply that x == y:

>>> help(hash)
hash(...)
    hash(object) -> integer

    Return a hash value for the object.  Two objects with the same value have
    the same hash value.  The reverse is not necessarily true, but likely.

so to compare two frozenset values for equality, you still need to check that both sets have the same size, then check if every element in one is also in the other.

I leave it as an exercise for the reader with lots of spare time to find two different frozensets with the same hash value.

chepner
  • 389,128
  • 51
  • 403
  • 529
  • 1
    FYI, you can find a whole bunch with the same hash code by building singleton frozensets with the integers `2**i` for, say, `i in range(300)` :-) – Tim Peters Aug 10 '18 at 04:26