How can I gracefully migrate open Websocket connections after a deployment slot swap?

Question

When a user logs in to my web app, their browser opens a WebSocket to the server so that updates can be pushed down to the browser.

It's an ASP.NET Core Web App (self-hosted) running in Azure App Service. I'd like to use Azure's deployment slot swapping feature to push code updates to production with zero downtime.

In the limited testing I've done, it looks like after a slot swap the Websocket connection stays open to the original slot the browser was connected to. (So if the browser's Websocket was connected to slot A, and we swapped slot A and B so that new connections go to slot B, the Websocket would still be open to the app running on slot A.)

At some point the old slot will be taken offline, which will forcibly close any open WebSockets. I would prefer to re-open the WebSocket to the new slot gracefully and as soon as possible after the slot swap, so that if I update the WebSocket-related code, all clients move to the new code quickly.

A sketch of how this might work:

1. Slot swap takes place
2. A notification is sent to the code running on the old slot
3. Code running on the old slot pushes a WebSocket message to reconnect
4. On receiving the message, the browser opens a new WebSocket connection (which will go to the new slot)
5. When the connection succeeds, the browser closes the old WebSocket
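
The browser side of the last three steps could look roughly like the TypeScript sketch below. It is only an illustration under assumptions: the `"reconnect"` control message, the `SocketLike` interface, and the `MigratingSocket` class are hypothetical names, and the socket is created through an injected factory so the logic can also be exercised outside a browser.

```typescript
// Hypothetical control message that the old slot pushes after the swap.
const RECONNECT_MESSAGE = "reconnect";

// Minimal socket shape used by this sketch (an assumption, not a real API).
interface SocketLike {
  onopen: ((ev: any) => void) | null;
  onmessage: ((ev: any) => void) | null;
  onerror: ((ev: any) => void) | null;
  close(): void;
}

type Connect = (url: string) => SocketLike;

class MigratingSocket {
  private readonly url: string;
  private readonly connect: Connect;
  private readonly onData: (data: string) => void;
  private current: SocketLike;
  private pending: SocketLike | null = null;

  constructor(url: string, connect: Connect, onData: (data: string) => void) {
    this.url = url;
    this.connect = connect;
    this.onData = onData;
    this.current = connect(url);
    this.listen(this.current);
  }

  // Route control messages to migrate() and everything else to the app.
  private listen(socket: SocketLike): void {
    socket.onmessage = (ev) => {
      if (ev.data === RECONNECT_MESSAGE) this.migrate();
      else this.onData(ev.data);
    };
  }

  // Open the replacement first; close the old socket only once it is open.
  private migrate(): void {
    if (this.pending) return; // a migration is already in progress
    const next = this.connect(this.url);
    this.pending = next;
    this.listen(next);
    next.onopen = () => {
      const old = this.current;
      this.current = next;
      this.pending = null;
      old.onmessage = null;
      old.close();
    };
    next.onerror = () => {
      // The new connection failed before opening: keep using the old socket.
      if (this.pending === next) {
        this.pending = null;
        next.close();
      }
    };
  }
}
```

In a browser, the factory passed as `connect` would presumably be something like `(url) => new WebSocket(url)`. While both sockets are open, a message may arrive on either one, so the `onData` handler should tolerate that overlap.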

Is there a better way to do it?

How can the code running on the old slot know when it has been swapped?

Is handling this gracefully even possible? Or are there always going to be a bunch of race conditions?

score 6 · Accepted Answer · answered Mar 21 '18 at 19:25

The WebSockets do stay connected, because ARR (Application Request Routing) can only direct new requests to the "new" application. You'd see the same behavior if you were downloading a large file in the middle of a swap, for example.

The way I've handled this is to have my deployment system (Octopus) do the swap, and then make a request to the "old" application that notifies all of the connected WebSocket clients that they need to disconnect and reconnect. The clients will immediately disconnect, pick a random delay (to prevent thousands of reconnects at once) and then reconnect after that delay.
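
A minimal TypeScript sketch of the client side of that approach follows, assuming a uniform random delay. The function names and the choice of maximum delay are hypothetical and are not taken from the setup described above.

```typescript
// Uniform random delay in [0, maxDelayMs), so that many clients told to
// reconnect at the same moment spread their reconnects over a window.
// `random` is injectable only to make the function easy to test.
function reconnectDelayMs(
  maxDelayMs: number,
  random: () => number = Math.random,
): number {
  return Math.floor(random() * maxDelayMs);
}

// Disconnect immediately, then reconnect after the random delay.
// `closeOld` and `openNew` are hypothetical callbacks supplied by the app.
function scheduleReconnect(
  closeOld: () => void,
  openNew: () => void,
  maxDelayMs: number,
  random: () => number = Math.random,
): number {
  closeOld();
  const delay = reconnectDelayMs(maxDelayMs, random);
  setTimeout(openNew, delay);
  return delay;
}
```

A client might call `scheduleReconnect(() => socket.close(), connect, 10_000)` when the notification arrives, where the 10-second window is an arbitrary example. A wider window smooths the load on the new slot but leaves some clients disconnected for longer.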

Jeremy


That's a good point about the random delay. It hadn't even occurred to me. One tweak I'm considering making is to only disconnect when the reconnect succeeds, in order to minimize the time the client is disconnected. – Michael Kropat Mar 26 '18 at 13:59
That's actually a great idea. I'm using Redis for the backplane, so the old instance still receives any messages that are posted. In my application, the messages for individual users are infrequent enough that a few seconds of disconnection hasn't caused any issues, but I'm definitely going to implement your idea to eliminate that gap anyway. – Jeremy Mar 27 '18 at 15:03
