
The following is my Celluloid code.

  1. client1.rb: the first of the two clients (I call it client 1).

  2. client2.rb: the second of the two clients (I call it client 2).

Note:

The only difference between the two clients is the text that is passed to the server ('client-1' and 'client-2' respectively).

On testing these two clients (by running them side by side) against the following two servers, one at a time, I found very strange results.

  1. server1.rb (a basic example taken from the README.md of celluloid-zmq)

    Using this as the server for the two clients above resulted in parallel execution of tasks.

OUTPUT

ruby server1.rb

Received at 04:59:39 PM and message is client-1
Going to sleep now
Received at 04:59:52 PM and message is client-2

Note:

The client2.rb message was processed while the client1.rb request was sleeping (a sign of parallelism).

  2. server2.rb

    Using this as the server for the two clients above did not result in parallel execution of tasks.

OUTPUT

ruby server2.rb

Received at 04:55:52 PM and message is client-1
Going to sleep now
Received at 04:56:52 PM and message is client-2

Note:

client-2 was made to wait 60 seconds because client-1 was sleeping (a 60-second sleep).

I ran the above test multiple times, and every run showed the same behaviour.

Can anyone explain these results?

Question: Why does Celluloid wait 60 seconds before it processes the other request, as seen in the server2.rb case?

Ruby version

ruby -v

ruby 2.1.2p95 (2014-05-08 revision 45877) [x86_64-darwin13.0]

Viren

2 Answers


Using your gists, I verified this issue can be reproduced in MRI 2.2.1 as well as jRuby 1.7.21 and Rubinius 2.5.8 ... The difference between server1.rb and server2.rb is the use of the DisplayMessage and message class method in the latter.


Use of sleep in DisplayMessage is out of Celluloid scope.

When sleep is used in server1.rb, it is actually Celluloid.sleep, but in server2.rb it is Kernel.sleep, which locks up the mailbox for Server until the 60 seconds have passed. This prevents further method calls on that actor from being processed until the mailbox is processing messages (method calls on the actor) again.
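The difference between the two sleep calls comes down to ordinary Ruby method lookup. Here is a minimal sketch using only the standard library; FakeActor, InsideActor and PlainHelper are hypothetical names made up for illustration, and FakeActor merely stands in for a module that defines its own sleep. It is not Celluloid's actual implementation.

```ruby
# Hypothetical stand-in for a module that ships its own #sleep.
module FakeActor
  def sleep(seconds)
    # A real actor library would suspend only the current task here.
    "actor-aware sleep for #{seconds}s"
  end
end

# Like Server: the module is included, so a bare `sleep` finds
# FakeActor#sleep before it ever reaches Kernel#sleep.
class InsideActor
  include FakeActor

  def nap
    sleep 0.01
  end
end

# Like DisplayMessage: nothing is included, so a bare `sleep`
# is plain Kernel#sleep and blocks the calling thread.
class PlainHelper
  def self.nap
    sleep 0.01
  end
end

p InsideActor.new.nap
p InsideActor.instance_method(:sleep).owner
p PlainHelper.method(:sleep).owner
```

The last two lines print which module owns the sleep that each class would call, which is a quick way to check the same thing in a real project.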

There are three ways to resolve this:

  • Use a defer {} or future {} block.

  • Explicitly invoke Celluloid.sleep rather than sleep (a bare sleep inside DisplayMessage ends up calling Kernel.sleep, since DisplayMessage does not include Celluloid the way Server does).

  • Bring the contents of DisplayMessage.message into handle_message as in server1.rb; or at least into Server, which is in Celluloid scope, and will use the correct sleep.


The defer {} approach:

def handle_message(message)
  defer {
    DisplayMessage.message(message)
  }
end

The Celluloid.sleep approach:

class DisplayMessage
  def self.message(message)
    #de ...
    Celluloid.sleep 60
  end
end

Not truly a scope issue; it's about asynchrony.

To reiterate, the deeper issue is not the scope of sleep ... that's why defer and future are my best recommendation. But to post something here that came out in my comments:

Using defer or future pushes a task that would tie up an actor into another thread. If you use future, you can get the return value once the task is done; if you use defer, you can fire & forget.
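To make the distinction concrete, here is a rough analogy with plain Ruby threads. The helper names fire_and_forget and in_background are hypothetical and are not part of Celluloid; the sketch only illustrates "ignore the result" versus "collect the result later".

```ruby
# Hypothetical helper in the spirit of defer: start the work on
# another thread and return immediately without a handle.
def fire_and_forget(&work)
  Thread.new(&work)
  nil
end

# Hypothetical helper in the spirit of future: start the work on
# another thread and hand back something that can be asked for
# the result later.
def in_background(&work)
  Thread.new(&work)
end

slow_upcase = lambda do |text|
  sleep 0.05 # stands in for a long-running call
  text.upcase
end

fire_and_forget { slow_upcase.call('client-1') }

handle = in_background { slow_upcase.call('client-2') }
# ... the caller is free to do other work here ...
puts handle.value # Thread#value waits for the thread and returns its result
```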

But better yet, create another actor for tasks that tend to get tied up, and even pool that other actor... if defer or future don't work for you.
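The effect of a tied-up actor can be sketched without Celluloid at all. The following is an analogy built on the standard library only: one thread drains a Queue one message at a time, and run_mailbox and handle are hypothetical names. With offload: false the blocking sleep holds up the next message, as in server2.rb; with offload: true the slow work is pushed to another thread, which is roughly the idea behind defer.

```ruby
# Blocking work: sleeps only for 'client-1', like the test servers
# (0.2 seconds here instead of 60).
def handle(msg, log)
  sleep 0.2 if msg == 'client-1'
  log << "done #{msg}"
end

# One "actor" thread reads its mailbox serially. Returns the event log.
def run_mailbox(messages, offload:)
  log = Queue.new     # thread-safe record of what happened, in order
  mailbox = Queue.new
  messages.each { |m| mailbox << m }
  mailbox << :stop
  helpers = []

  actor = Thread.new do
    until (msg = mailbox.pop) == :stop
      log << "received #{msg}"
      if offload
        # Pass msg as an argument so each helper keeps its own copy.
        helpers << Thread.new(msg) { |m| handle(m, log) }
      else
        handle(msg, log) # the mailbox waits until this returns
      end
    end
  end

  actor.join
  helpers.each(&:join)
  Array.new(log.size) { log.pop }
end

p run_mailbox(%w[client-1 client-2], offload: false)
p run_mailbox(%w[client-1 client-2], offload: true)
```

In the first run, client-2 is received only after client-1 is done. In the second run, client-2 should be received while client-1 is still sleeping, which mirrors the server1.rb output in the question.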

I'd be more than happy to answer follow-up questions brought up by this question; we have a very active mailing list, and IRC channel. Your generous bounties are commendable, but plenty of us would help purely to help you.

digitalextremist
  • Sorry, but i noticed your answer after i had finished beautifying and posted mine. Anyways your answer has more approaches, so sounds better to me too. Here's my +1. – TheCodeArtist Jan 30 '16 at 18:36
  • you might want to correct the statement in the 2nd bullet-point of the "three ways to resolve this" list. I have verified that invoking `Celluloid.sleep()` within `DisplayMessage.message()` does in-fact trigger the [`actor sleeping` path](https://github.com/celluloid/celluloid/blob/master/lib/celluloid.rb#L401). It appears to be that **Celluloid can correctly determine the "actor" context even if methods of other objects are invoked by a Celluloid "actor"**. – TheCodeArtist Jan 30 '16 at 19:03
  • @TheCodeArtist you mean change the wording in the parenthesis? Because invoking `Celluloid.sleep` does work, like in my code sample. The `sleep` method body I linked to *does* detect if it's being run from inside an actor, but it does need to be explicitly called ( if the call is made outside `Celluloid` scope ) – digitalextremist Jan 30 '16 at 19:07
  • "which will end up calling Kernel.sleep" does not appear to be true. I added logs in my local copy of `celluloid.rb` and saw that `invoking Celluloid.sleep()` within `DisplayMessage.message()` does call `actor.sleep` (not `kernel.sleep()`) – TheCodeArtist Jan 30 '16 at 19:10
  • ... we're saying the same thing ... and if we're not, I'm very confused about what you mean, because your entire answer reiterates my bullet point several times. If explicitly called, `Celluloid.sleep` will invoke the appropriate method, whereas `sleep` by itself, if called without `DisplayMessage` scope, will call `Kernel.sleep` ... I will try to tweak that line to make it clearer? – digitalextremist Jan 30 '16 at 19:13
  • I edited it, maybe that's clearer but the message is the same as I first wrote. But for the record, I favor `defer { }` over explicitly calling `Celluloid.sleep` because sometimes we do not have control over external classes ... or we don't wish to invade them to make them compatible with `Celluloid` ... `defer {}` is the universal cure, no matter whether one owns the offending class or not. – digitalextremist Jan 30 '16 at 19:16
  • Yep. Crystal clear now. Next to wait for OP to wake-up and [+300] your answer. \(^.^)/ Cheers. – TheCodeArtist Jan 30 '16 at 19:19
  • Thanks/good. Nice running into you -- followed you on Twitter. I am a Celluloid maintainer ( hence knowledge of `defer` which is obscure ) and would gladly invite you to work on Celluloid, as you apparently have put a lot of work into understanding it. – digitalextremist Jan 30 '16 at 19:21
  • Oh! but i had no idea celluloid existed before today. I was scanning for any unanswered/neglected questions and came across this. Kudos to the self-explanatory code and the fact that its on github. – TheCodeArtist Jan 30 '16 at 19:32
  • @digitalextremist So what I understand any potential `IO` that is not a part of `Celluloid` core classes would cause Celluloid to work *serially* – Viren Feb 01 '16 at 11:31
  • The purpose of `Celluloid::IO` is to prevent that, but when an actor gets tied up, its mailbox will not be processing new requests. They will be received when the actor is no longer tied up. Reason being, the actor and its mailbox are operating in one thread, and each method call against the actor is a `task` which occurs in a fiber... by default. It is possible to use tasks which are each themselves a thread, but that's not the default behavior. In general, it's best to assume that a long-running process will tie up an actor. This is why I recommend `defer` or `future` where needed. – digitalextremist Feb 02 '16 at 22:17
  • Using `defer` or `future` pushes a task that would cause an actor to become tied up into another thread. If you use `future`, you can get the return value once the task is done, if you use `defer` you can fire & forget. But better yet, you can create another actor for tasks that tend to get tied up, and even pool that other actor... if `defer` or `future` don't work for you. I'd be more than happy to answer follow-up questions brought up by this question, we have a very active mailing list, and IRC channel. Your generous bounties are commendable, but plenty of us would help purely to help you. – digitalextremist Feb 04 '16 at 09:45

Managed to reproduce and fix the issue. Deleting my previous answer. Apparently, the problem lies in sleep. Confirmed by adding "actor/kernel sleeping" logs to sleep() in my local copy of celluloid.rb.


In server1.rb,

the call to sleep is within Server, a class that includes Celluloid.

Thus Celluloid's implementation of sleep overrides the native sleep.

class Server
  include Celluloid::ZMQ

  ...

  def run
    loop { async.handle_message @socket.read }
  end

  def handle_message(message)
    ...

    sleep 60
  end
end

Note the "actor sleeping" log from server1.rb (the log was added to sleep() in celluloid.rb).

This suspends only the current task inside the Celluloid actor, i.e. only the task handling client1 sleeps, and the actor stays free to process other messages.


In server2.rb,

the call to sleep is within a different class DisplayMessage that does NOT include Celluloid.

Thus it is the native sleep itself.

class DisplayMessage
  def self.message(message)
    ...

    sleep 60
  end
end

Note the ABSENCE of any "actor sleeping" log from server2.rb.

This blocks the thread that the Server actor runs on, i.e. the whole actor stops processing messages (not just the single task handling client1).


The Fix?

In server2.rb, the appropriate sleep must be explicitly specified.

class DisplayMessage
  def self.message(message)
    puts "Received at #{Time.now.strftime('%I:%M:%S %p')} and message is #{message}"
    ## Intentionally added sleep to test whether Celluloid blocks the main process for 60 seconds or not.
    if message == 'client-1'
      puts 'Going to sleep now'.red

      # "sleep 60" will invoke the native sleep.
      # Use Celluloid.sleep to support concurrent execution
      Celluloid.sleep 60
    end
  end
end
TheCodeArtist