
I have a Cloud Service deployment with 4 worker roles, one of which has auto-scaling enabled. As soon as auto-scaling occurs, all instances of all roles recycle.

Ideally, I'd like to stop the roles from recycling or at least terminate the work of all other roles in a controlled way.

I found out that you can handle the RoleEnvironment.Changing event and cancel it to request a graceful shutdown (i.e. cause OnStop to be called). However, after adding tracing output to the Changing event handler, I noticed that the Changing event apparently was not even fired, so the cancellation was not registered either.

private void RoleEnvironmentChanging(object sender, RoleEnvironmentChangingEventArgs e)
{
    // This tracing output does not show up in the logs table.
    Trace.TraceInformation("RoleEnvironmentChanging event fired.");
    if ((e.Changes.Any(change => change is RoleEnvironmentConfigurationSettingChange)))
    {
        // This one neither.
        Trace.TraceInformation("One of the changes is a RoleEnvironmentConfigurationSettingChange. Cancelling..");

        e.Cancel = true;
    }
    if ((e.Changes.Any(change => change is RoleEnvironmentTopologyChange)))
    {
        // This one neither.
        Trace.TraceInformation("One of the changes is a RoleEnvironmentTopologyChange. Cancelling.");

        e.Cancel = true;
    }
}

public override bool OnStart()
{
    // Hook up to the changing event to prevent roles from unnecessarily restarting.
    RoleEnvironment.Changing += RoleEnvironmentChanging;

    // Set the maximum number of concurrent connections
    ServicePointManager.DefaultConnectionLimit = 12;

    bool result = base.OnStart();

    return result;
}

Adding an internal endpoint to each role did not change this either. Here is the configuration from the .csdef:

<WorkerRole name="MyRole" vmsize="Medium">
[...ConfigurationSettings...]
<Endpoints>
  <InternalEndpoint name="Endpoint1" protocol="http" />
</Endpoints>
</WorkerRole>

Changing the protocol to "any" did not help either.

How can I stop my role instances from recycling after a scaling operation?

EDIT:
» Included code snippets
» Fixed typos

Ben Sch

3 Answers


Have you tried any of the following?

  • Check whether the event is fired in the instances of the role that is auto-scaling (to make sure it's not a problem with the internal endpoint).
  • Do a complete re-deployment (instead of an update).
  • Add a short Thread.Sleep() after the tracing output in the event handler (sometimes the role is shut down before the trace output is registered).
  • Change one of the configuration settings via the management portal (and check whether the event is triggered).
  • Check whether the other events (for instance RoleEnvironment.Changed) are fired.
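
The Thread.Sleep() suggestion could look roughly like the sketch below. It assumes a worker role class named WorkerRole and a trace listener already configured for the role; the explicit Trace.Flush() call and the five-second delay are illustrative additions, not values taken from the question.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using Microsoft.WindowsAzure.ServiceRuntime;

public class WorkerRole : RoleEntryPoint
{
    public override bool OnStart()
    {
        // Subscribe before calling base.OnStart() so no early event is missed.
        RoleEnvironment.Changing += RoleEnvironmentChanging;
        return base.OnStart();
    }

    private void RoleEnvironmentChanging(object sender, RoleEnvironmentChangingEventArgs e)
    {
        Trace.TraceInformation("RoleEnvironmentChanging event fired.");

        // Push buffered trace output to the listeners right away (illustrative addition).
        Trace.Flush();

        // Hypothetical delay: gives the diagnostics pipeline a chance to pick up
        // the message before the instance may be shut down.
        Thread.Sleep(TimeSpan.FromSeconds(5));
    }
}
```

If the message still does not appear with the delay in place, that points towards the handler not being invoked at all rather than towards lost trace output.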
Paul Facklam
    Thanks for these hints, Paul. Unfortunately none of these brought success for me. That's why I will not mark your answer as accepted, as it might mislead other users. However, since your answer showed the most effort in providing help, I will award the bounty to you. If you have further ideas, please let me know! – Ben Sch May 02 '15 at 11:26

Wow, over 2 years without a real answer here. Too bad. My experience with this topic: set e.Cancel to false if your instance is able to keep working during and after scaling without needing to be reconfigured.

if (e.Changes.Any(change => change is RoleEnvironmentConfigurationSettingChange))
{
    Trace.WriteLine("with recycle");
    e.Cancel = true;
}
else
{
    Trace.WriteLine("without recycle");
    e.Cancel = false;
}

Maybe you want to set Trace.AutoFlush = true at OnStart.
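
Put together, the handler and the AutoFlush setting might be wired up as in the following sketch. The class name WorkerRole and the handler name OnChanging are assumptions for illustration; the recycle decision mirrors the snippet above.

```csharp
using System.Diagnostics;
using System.Linq;
using Microsoft.WindowsAzure.ServiceRuntime;

public class WorkerRole : RoleEntryPoint
{
    public override bool OnStart()
    {
        // Flush every trace message immediately so it is not lost on shutdown.
        Trace.AutoFlush = true;

        RoleEnvironment.Changing += OnChanging;
        return base.OnStart();
    }

    // Hypothetical handler name; same decision as in the snippet above.
    private static void OnChanging(object sender, RoleEnvironmentChangingEventArgs e)
    {
        bool settingChanged = e.Changes.Any(
            change => change is RoleEnvironmentConfigurationSettingChange);

        // Cancel = true requests a recycle; Cancel = false lets the instance keep running.
        e.Cancel = settingChanged;
        Trace.WriteLine(settingChanged ? "with recycle" : "without recycle");
    }
}
```

With this handler, a pure topology change (such as a scaling operation) leaves e.Cancel at false, so the instance is not asked to recycle because of it.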

EvertonMc

Role Environment Methods and Events

There are five main places where you can write code to respond to environment changes. Two of these, OnStart and OnStop, are methods on the RoleEntryPoint class which you can override in your main role class (which is called WebRole or WorkerRole by default). The other three are events on the RoleEnvironment class which you can subscribe to: Changing, Changed and Stopping.

The purpose of these methods is pretty clear from their names:

OnStart gets called when the instance is first started.
Changing gets called when something about the role environment is about to change.
Changed gets called when something about the role environment has just been changed.
Stopping gets called when the instance is about to be stopped.
OnStop gets called when the instance is being stopped.

In all cases, there's nothing your code can do to prevent the corresponding action from occurring, but you can respond to it in any way you wish. In the case of the Changing event, you can also choose whether the instance should be recycled to deal with the configuration change by setting e.Cancel = true.
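
As a rough illustration of the five hooks described above, a worker role could log each of them as sketched here. The class name WorkerRole and the message texts are assumptions; only the method and event names come from the text.

```csharp
using System.Diagnostics;
using Microsoft.WindowsAzure.ServiceRuntime;

public class WorkerRole : RoleEntryPoint
{
    public override bool OnStart()
    {
        // The three RoleEnvironment events.
        RoleEnvironment.Changing += (sender, e) =>
            Trace.TraceInformation("Changing: {0} pending change(s).", e.Changes.Count);
        RoleEnvironment.Changed += (sender, e) =>
            Trace.TraceInformation("Changed: {0} change(s) applied.", e.Changes.Count);
        RoleEnvironment.Stopping += (sender, e) =>
            Trace.TraceInformation("Stopping: instance is about to be stopped.");

        Trace.TraceInformation("OnStart: instance is starting.");
        return base.OnStart();
    }

    public override void OnStop()
    {
        Trace.TraceInformation("OnStop: instance is being stopped.");
        base.OnStop();
    }
}
```

Logging all five hooks side by side makes it easier to see which of them actually run on an instance during a scaling operation.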

Why aren’t Changing and Changed firing in my application?

When I first started exploring this topic, I observed the following unusual behaviour in both the Windows Azure Compute Emulator (previously known as the Development Fabric) and in the cloud:

  • The Changing and Changed events did not fire on any instance when I made configuration changes.
  • RoleEnvironment.CurrentRoleInstance.Role.Instances.Count always returned 1, even when there were many instances in the role.

It turns out that this is the expected behaviour when a role has no internal endpoints defined, as documented in this MSDN article. So the solution is simply to define an internal endpoint in your ServiceDefinition.csdef file like this:

<Endpoints>
  <InternalEndpoint name="InternalEndpoint1" protocol="http" />
</Endpoints>
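
One way to check whether the endpoint definition has taken effect is to log what the instance can see of its own role. The helper below is a hypothetical sketch (TopologyDiagnostics and LogTopology are illustrative names) that could be called from OnStart or from a Changed handler.

```csharp
using System.Diagnostics;
using Microsoft.WindowsAzure.ServiceRuntime;

// Hypothetical diagnostic helper; not part of the Azure SDK.
public static class TopologyDiagnostics
{
    public static void LogTopology()
    {
        RoleInstance current = RoleEnvironment.CurrentRoleInstance;

        // Reports the endpoints on this instance and how many instances
        // of the role are visible from it.
        Trace.TraceInformation(
            "Instance {0}: {1} endpoint(s) on this instance, {2} instance(s) visible in role {3}.",
            current.Id,
            current.InstanceEndpoints.Count,
            current.Role.Instances.Count,
            current.Role.Name);
    }
}
```

If the visible instance count stays at 1 while several instances are running, the observation described above still applies to that role.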

Which Events Fire Where and When?

Even though the names of the events seem pretty self-explanatory, the exact behaviour when scaling deployments up and down is not necessarily what you might expect. The following diagram shows which events fire in an example scenario containing a single role. 2 instances are deployed initially, the deployment is then scaled to 4 instances, then back down to 3, and finally the deployment is stopped.

Taken from http://azure.microsoft.com/blog/2011/01/04/responding-to-role-topology-changes/

Barak Kedem
    Thanks for the answer. But merely copy-pasting a website which I even included in my question and stated that this doesn't work, is not very helpful to me, sorry! Any other ideas why this did not work for me? – Ben Sch May 01 '15 at 06:39