93

If I have multiple threads accessing a HashMap, but guarantee that they'll never be accessing the same key at the same time, could that still lead to a race condition?

agentofuser

4 Answers

107

In @dotsid's answer he says this:

If you change a HashMap in any way then your code is simply broken.

He is correct. A HashMap that is updated without synchronization will break even if the threads are using disjoint sets of keys. Here are some of the things that can go wrong.

  • If one thread does a put, then another thread may see a stale value for the hashmap's size.

  • When a thread does a put that triggers a rebuild of the table, another thread may see transient or stale versions of the hashtable array reference, its size, its contents or the hash chains. Chaos may ensue.

  • When a thread does a put for a key that collides with some key used by some other thread, and the latter thread does a put for its key, then the latter might see a stale copy of the hash chain reference. Chaos may ensue.

  • When one thread probes the table with a key that collides with one of some other thread's keys, it may encounter that key on the chain. It will call equals on that key, and if the threads are not synchronized, the equals method may encounter stale state in that key.

And if you have two threads simultaneously doing put or remove requests, there are numerous opportunities for race conditions.

I can think of three solutions:

  1. Use a ConcurrentHashMap.
  2. Use a regular HashMap but synchronize on the outside; e.g. using primitive mutexes, Lock objects, etcetera.
  3. Use a different HashMap for each thread. If the threads really have a disjoint set of keys, then there should be no need (from an algorithmic perspective) for them to share a single Map. Indeed, if your algorithms involve the threads iterating the keys, values or entries of the map at some point, splitting the single map into multiple maps could give a significant speedup for that part of the processing.
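As a rough illustration of the three options, here is a minimal sketch. The class name, thread count and key ranges are made up for the example; each writer thread only touches its own disjoint range of keys, which is the situation described in the question. Option 2 is shown with a plain synchronized block, though a Lock object would work the same way.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiConsumer;

// Hypothetical demo class; names and sizes are illustrative only.
final class DisjointKeyWriters {
    static final int THREADS = 4;
    static final int KEYS_PER_THREAD = 10_000;

    // Waits for every worker. join() also makes the workers' writes visible to the caller.
    static void joinAll(List<Thread> workers) {
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
    }

    // Starts THREADS writers; writer t only ever puts keys from its own disjoint range.
    static void fillShared(BiConsumer<Integer, Integer> put) {
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < THREADS; t++) {
            final int base = t * KEYS_PER_THREAD;
            Thread worker = new Thread(() -> {
                for (int i = 0; i < KEYS_PER_THREAD; i++) {
                    put.accept(base + i, i);
                }
            });
            workers.add(worker);
            worker.start();
        }
        joinAll(workers);
    }

    // Option 3: every thread fills a private HashMap; the maps are merged after all joins.
    static Map<Integer, Integer> fillPerThread() {
        List<Map<Integer, Integer>> parts = new ArrayList<>();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < THREADS; t++) {
            final int base = t * KEYS_PER_THREAD;
            final Map<Integer, Integer> part = new HashMap<>();
            parts.add(part);
            Thread worker = new Thread(() -> {
                for (int i = 0; i < KEYS_PER_THREAD; i++) {
                    part.put(base + i, i);
                }
            });
            workers.add(worker);
            worker.start();
        }
        joinAll(workers);
        Map<Integer, Integer> merged = new HashMap<>();
        for (Map<Integer, Integer> part : parts) {
            merged.putAll(part);
        }
        return merged;
    }

    public static void main(String[] args) {
        // Option 1: a ConcurrentHashMap shared by all writers.
        Map<Integer, Integer> concurrent = new ConcurrentHashMap<>();
        fillShared(concurrent::put);

        // Option 2: a plain HashMap, with every access guarded by the same lock.
        Map<Integer, Integer> plain = new HashMap<>();
        Object lock = new Object();
        fillShared((k, v) -> {
            synchronized (lock) {
                plain.put(k, v);
            }
        });

        Map<Integer, Integer> merged = fillPerThread();
        System.out.println(concurrent.size() + " " + plain.size() + " " + merged.size());
    }
}
```

All three maps end up with 40000 entries. Passing an unguarded `plain::put` to `fillShared` is exactly the broken case discussed above: it may appear to work on some runs, which is what makes it dangerous.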
Stephen C
  • Can you elaborate on the type of chaos? Infinite loop? Exceptions? – piepi May 22 '21 at 11:34
  • Either of those may be possible, depending on the HashMap implementation, etc. **HOWEVER** - it is neither possible nor necessary to enumerate all the possible things that could go wrong. All the reader needs to know is that any code that does this is unreliable ... because it is relying on properties that are not guaranteed by the JLS or the `HashMap` spec. – Stephen C May 23 '21 at 02:08
32

Just use a ConcurrentHashMap. In Java 7 and earlier, ConcurrentHashMap uses multiple locks, each covering a range of hash buckets, to reduce the chances of a lock being contended; since Java 8 it locks individual buckets instead. Acquiring an uncontended lock has only a marginal performance impact.

To answer your original question: According to the javadoc, as long as the structure of the map doesn't change, you are fine. This means not removing any elements and not adding keys that are not already in the map. Replacing the value associated with an existing key is fine.

If multiple threads access a hash map concurrently, and at least one of the threads modifies the map structurally, it must be synchronized externally. (A structural modification is any operation that adds or deletes one or more mappings; merely changing the value associated with a key that an instance already contains is not a structural modification.)

However, the javadoc makes no guarantees about visibility, so you have to be willing to accept occasionally retrieving stale associations.

Tim Bender
6

It depends on what you mean by "accessing". If you are only reading, you can even read the same keys, as long as visibility of the data is guaranteed under the "happens-before" rules. This means that the HashMap must not change, and all changes (the initial construction) must be completed before any reader starts accessing the HashMap.

If you change a HashMap in any way then your code is simply broken. @Stephen C provides a very good explanation of why.

EDIT: If the first case is your actual situation, I recommend using Collections.unmodifiableMap() to be sure that your HashMap is never changed. The objects referenced by the HashMap should not change either, so aggressive use of the final keyword can help.
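A minimal sketch of that read-only setup, with made-up names (ReadOnlyLookup, ports, portOf): the map is fully built before any reader thread is started, a private copy is wrapped with Collections.unmodifiableMap(), and the wrapper is held in a final field. The copy is my own addition, so that nobody holding the original map can still change what the readers see.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical example class; the names are illustrative only.
final class ReadOnlyLookup {
    // final field: the fully built map is visible to any thread that sees this object.
    private final Map<String, Integer> ports;

    ReadOnlyLookup(Map<String, Integer> source) {
        // Copy first, so later changes to `source` cannot reach the readers,
        // then wrap, so callers of view() cannot modify the copy either.
        this.ports = Collections.unmodifiableMap(new HashMap<>(source));
    }

    Integer portOf(String service) {
        return ports.get(service);
    }

    Map<String, Integer> view() {
        return ports;
    }

    public static void main(String[] args) {
        Map<String, Integer> source = new HashMap<>();
        source.put("http", 80);
        source.put("https", 443);

        // All construction finishes here, before any reader thread exists.
        ReadOnlyLookup lookup = new ReadOnlyLookup(source);

        // Thread.start() happens-before everything the started thread does,
        // so both readers see the completed map, even for the same key.
        for (int i = 0; i < 2; i++) {
            new Thread(() -> System.out.println(lookup.portOf("https"))).start();
        }
    }
}
```

Both reader threads print 443. The values here are immutable Integer objects; with mutable values, the values themselves would also have to stay unchanged after publication.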

And as @Lars Andren says, ConcurrentHashMap is the best choice in most cases.

Denis Bazhenov
  • 2
    ConcurrentHashMap is the best choice in my opinion. The only reason I didn't recommend it is that the author didn't ask about it :) It has lower throughput because of CAS operations, but as the golden rule of concurrent programming says: "Make it right, and only then make it fast" :) – Denis Bazhenov Apr 22 '10 at 06:43
  • `unmodifiableMap` ensures the client cannot change the map. It does nothing to ensure that the underlying map is not changed. – Pete Kirkham Apr 22 '10 at 07:42
  • As I already pointed out: "Objects which are pointed by HashMap should not change also" – Denis Bazhenov Apr 22 '10 at 07:52
4

Modifying a HashMap without proper synchronization from two threads may easily lead to a race condition.

  • When a put() leads to a resize of the internal table, this takes some time and the other thread continues to write to the old table.
  • Two put() for different keys lead to an update of the same bucket if the keys' hashcodes are equal modulo the table size. (Actually, the relation between hashcode and bucket index is more complicated, but collisions may still occur.)
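To make the second point concrete, here is a small sketch of how two different keys can land in the same bucket. The spread() step mirrors what OpenJDK 8 and later do inside HashMap; that is an implementation detail rather than part of the documented contract, so treat it as an assumption. The class and method names are made up for the example.

```java
// Hypothetical helper class illustrating bucket selection; not part of the JDK API.
final class BucketCollision {
    // Assumption: mirrors the hash spreading used by OpenJDK 8+ HashMap,
    // which folds the high 16 bits of hashCode() into the low 16 bits.
    static int spread(Object key) {
        int h = key.hashCode();
        return h ^ (h >>> 16);
    }

    // The table length is always a power of two, so masking picks the bucket.
    static int bucketIndex(Object key, int tableLength) {
        return spread(key) & (tableLength - 1);
    }

    public static void main(String[] args) {
        // 16 is the default initial table length of a HashMap.
        System.out.println(bucketIndex(1, 16));   // prints 1
        System.out.println(bucketIndex(17, 16));  // prints 1: different key, same bucket
        // Distinct strings can even share a full hash code, so they collide in any table.
        System.out.println("Aa".hashCode() == "BB".hashCode()); // prints true
    }
}
```

So two threads that put the keys 1 and 17 into a default-sized HashMap are modifying the same bucket, even though they never touch the same key.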
Christian Semrau
  • 1
    It is worse than just race conditions. Depending on the internals of the `HashMap` implementation you are using, you can get corruption of the `HashMap` data structures, etc., caused by memory anomalies. – Stephen C Mar 10 '20 at 13:12