2

I'm curious about thread safety for hashes in Ruby. Running the following from the console (Ruby 2.0.0-p247):

h = {}
10.times { Thread.start { 100000.times {h[0] ||= 0; h[0] += 1;} } }

returns

{0=>1000000}

which is the correct expected value.

Why does it work? Can I rely on hashes being thread-safe with this version of Ruby?

Edit: Testing 100 times:

counter = 0
100.times do
  h={}
  threads = Array.new(10) { Thread.new { 10000.times { h[0] ||= 0; h[0] += 1 } } }
  threads.map { |thread| thread.join }
  counter += 1 if h[0] != 100000
end
puts counter

Counter is still 0 at the end. I tried up to 10K times and never had a single thread-safety issue with this code.

Nicolas M.

2 Answers

4

No, you cannot rely on Hashes being thread safe, because they aren't built to be thread safe, most probably for performance reasons. To overcome this limitation of the standard library, gems have been created that provide thread-safe (thread_safe) or immutable (hamster) data structures. These make access to the data structure itself thread safe, but your code has a different problem in addition to that:

Your output will not be deterministic; in fact, I ran your code a few times and once got 544988 as the result. A classic race condition can occur because reading and writing are separate steps (i.e. the operation is not atomic). Consider the expression h[0] ||= 0, which basically translates to h[0] || (h[0] = 0). It is easy to construct a case where a race condition occurs:

  • thread 1 reads h[0] and finds it is nil
  • thread 2 reads h[0] and finds it is nil
  • thread 1 sets h[0] = 0 and increments h[0] += 1
  • thread 2 sets h[0] = 0 and increments h[0] += 1
  • the resulting hash is {0=>1} although the correct result would be {0=>2}
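The interleaving listed above can be replayed deterministically in a single thread. This is only an illustrative sketch: the variable names `t1_saw_nil` and `t2_saw_nil` are made up here, and each line stands in for a step that a real thread would take.

```ruby
# Single-threaded replay of the interleaving above (no real threads involved).
h = {}

t1_saw_nil = h[0].nil?   # thread 1 reads h[0] and finds nil
t2_saw_nil = h[0].nil?   # thread 2 reads h[0] and finds nil

h[0] = 0 if t1_saw_nil   # thread 1 initializes the counter ...
h[0] += 1                # ... and increments it: h[0] is now 1

h[0] = 0 if t2_saw_nil   # thread 2 acts on its stale read and resets to 0
h[0] += 1                # thread 2 increments: h[0] is 1 again

puts h[0]                # prints 1, although two increments ran
```

With real threads the scheduler decides whether this ordering ever happens, which is why the bug may not show up in a given run.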

If you want to make sure that your data will not be corrupted, you can lock the operation with a mutex:

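require 'thread'
semaphore = Mutex.new

h = {}

threads = Array.new(10) do
  Thread.start do
    # Hold the lock for the whole loop so that no other thread can
    # interleave with the read-modify-write steps.
    semaphore.synchronize do
      100000.times {h[0] ||= 0; h[0] += 1;}
    end
  end
end

# Wait for all threads to finish before reading h.
threads.each { |thread| thread.join }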
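As a variation on the code above, the lock can be taken around each individual increment instead of around the whole loop, so threads take turns per update rather than running strictly one after another. This is a sketch under a few assumptions that are not in the original: `Hash.new(0)` replaces the `||= 0` step, the loop count is reduced to keep the run short, and the threads are joined before the result is read.

```ruby
mutex = Mutex.new
h = Hash.new(0)   # assumption: a default value of 0 replaces `h[0] ||= 0`

threads = Array.new(10) do
  Thread.new do
    1_000.times do
      # Only the read-modify-write itself is inside the critical section.
      mutex.synchronize { h[0] += 1 }
    end
  end
end

# Wait for every thread before reading the result.
threads.each(&:join)

puts h[0]   # prints 10000
```

Locking per increment costs more lock operations, but it lets threads do other work between updates. Which granularity is appropriate depends on what else each thread is doing.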
Patrick Oscity
  • Another solution is immutable data structures. https://github.com/hamstergem/hamster – Reactormonk Mar 27 '14 at 00:05
  • @Reactormonk I am pretty sure that thread safe data structures will not deal with the problem of race conditions though. Made the same mistake by referencing the thread_safe gem ;-) These gems only provide thread-safe access to the data structure, but the problem here is in the separate read/write steps which has nothing to do with the data structure. – Patrick Oscity Mar 27 '14 at 00:13
  • Thanks! I am aware of both those gems and thread safety issues that can occur with this type of code. I edited my question with a sample experiment and not a single time could I reproduce a thread safety issue with the code. I was curious about why this code is not breaking and if this is just an exception or it would always work this way. – Nicolas M. Mar 27 '14 at 00:22
  • I think it's simply working by chance, and the chances are very good, but you cannot rely on it to always be like that. – Patrick Oscity Mar 27 '14 at 00:33
  • Here's another thread explaining why even `+=` and `||=` for themselves are not thread-safe: http://stackoverflow.com/questions/15184338/how-to-know-what-is-not-thread-safe-in-ruby – Patrick Oscity Mar 27 '14 at 00:36
1

It is more accurate to say that the thread safety of Ruby hashes depends more on the runtime than on the code. I wasn't able to observe a race condition with any of the examples in MRI 2.6.2. I suspect this is because MRI does not interrupt a thread while it is executing a native operation, and MRI's Hash is implemented natively in C. In JRuby 9.2.8.0, however, I did see the race condition.

Here is my example:

loops = 100
round = 0
while true do
  round += 1
  h={}

  work = lambda do
    h[0] = 0 if h[0].nil?
    val = h[0]
    val += 1

    # Calling Thread.pass in MRI will reliably exhibit the classic race
    # condition described in https://en.wikipedia.org/wiki/Race_condition .
    # Otherwise MRI doesn't exhibit the race condition, as it won't interrupt the
    # small amount of work taking place in this lambda.
    #
    # In JRuby the race condition is exhibited quickly.

    # Thread.pass if val > 10

    h[0] = val
  end

  threads = Array.new(10) { Thread.new { loops.times { work.call } } }
  threads.map { |thread| thread.join }

  expected = loops * threads.size
  if h[0] != expected
    puts "#{h[0]} != #{expected}"
    break
  end
  puts "round #{round}" if round % 10000 == 0
end

Under jruby I get this result:

% jruby counter.rb
597 != 1000

Under MRI the script runs for a long time without exhibiting the race condition, until I eventually have to kill it:

% ruby counter.rb
round 10000
round 20000
round 30000
round 40000
round 50000
round 60000
...
round (very large number)
^CTraceback (most recent call last):
        3: from counter.rb:25:in `<main>'
        2: from counter.rb:25:in `map'
        1: from counter.rb:25:in `block in <main>'
counter.rb:25:in `join': Interrupt

If I uncomment the `Thread.pass if val > 10` line, MRI exhibits the race condition immediately:

% ruby counter.rb
112 != 1000

% ruby counter.rb
110 != 1000
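For comparison, here is a sketch of the same `work` lambda with the read-modify-write wrapped in a Mutex and `Thread.pass` left enabled. The `lock` variable is an addition for illustration and is not part of the script above. Because a thread still holds the lock when it yields, the other threads cannot read the stale value, so the count should come out right on either runtime.

```ruby
loops = 100
h = {}
lock = Mutex.new   # hypothetical addition, not in the original script

work = lambda do
  lock.synchronize do
    h[0] = 0 if h[0].nil?
    val = h[0]
    val += 1

    # Yielding here is harmless now: other threads block on the lock
    # until this thread has written val back.
    Thread.pass if val > 10

    h[0] = val
  end
end

threads = Array.new(10) { Thread.new { loops.times { work.call } } }
threads.each(&:join)

puts h[0]   # prints 1000
```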
monde