Redis: temporary dual masters during failure and recovery

Question

We have three Redis servers set up as follows:

Node1: Default Redis Master & Running Redis Sentinel Software
Node2: Redis Slave & Running Redis Sentinel Software
Node3: Redis Slave & Running Redis Sentinel Software

If I go to Node1 and stop the Redis server (service redis-server stop), the sentinels quickly detect that the master has gone down and promote Node2 or Node3 to master, which is exactly what we want and expect.

Now suppose I go back to Node1 and start it up again (service redis-server start). If I run "redis-cli info | grep ^role", I see that for a few seconds (10-15 seconds) Node1 still thinks it is a master.
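To time that window rather than re-running the command by hand, the role can be polled in a loop. The following is a minimal Python sketch, assuming `redis-cli` is on the PATH and the instance listens on 127.0.0.1:6379. The function names are made up for illustration.

```python
import subprocess
import time


def parse_role(info_text):
    """Return the value of the `role:` line from INFO output, or None."""
    for line in info_text.splitlines():
        if line.startswith("role:"):
            return line.split(":", 1)[1].strip()
    return None


def current_role(host="127.0.0.1", port=6379):
    """Ask one Redis instance for its own view of its role."""
    result = subprocess.run(
        ["redis-cli", "-h", host, "-p", str(port), "info", "replication"],
        capture_output=True, text=True, timeout=2,
    )
    # If the server is down, stdout is empty and parse_role returns None.
    return parse_role(result.stdout)


def watch_role(seconds=30, interval=0.5):
    """Print each role change with the time elapsed since polling began."""
    start = time.monotonic()
    last = object()  # placeholder so the first reading is always printed
    while time.monotonic() - start < seconds:
        role = current_role()
        if role != last:
            print(f"{time.monotonic() - start:5.1f}s role={role}")
            last = role
        time.sleep(interval)
```

Calling `watch_role()` right after restarting the service should show how long the node keeps reporting `master` before it switches to `slave`.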

Consequently we end up with two masters for a short period of time. After a few seconds the sentinels sort out the situation and Node1 is demoted to a slave. However, I assume that having two masters, even for a few seconds, could result in data integrity issues.

What happens to data that is sent to Node1 during recovery, while it still thinks it is a master, once it is demoted to a slave? Is that data lost?

Is there any way to avoid this? Is there a setting that makes redis-server refuse connections until a sentinel has told it whether it is a slave or a master? How else can we deal with this?

Thanks, Brad

score 0 · Accepted Answer · answered Jun 16 '14 at 13:42

For anyone curious about how we addressed this, please see the following discussion on GitHub: https://github.com/antirez/redis/issues/1813

Basically, we set up an xinetd HTTP service consisting of a bash shell script that queries the sentinel and reports the status back to the load balancer.
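A minimal sketch of that kind of health check follows. It is written in Python with a small standalone HTTP listener, not the bash-under-xinetd setup described above, so treat it as an illustration of the idea and not as the actual script. The master name `mymaster`, the addresses, and port 8080 are assumptions. The only Sentinel command used is `SENTINEL get-master-addr-by-name`.

```python
import socket
from http.server import BaseHTTPRequestHandler, HTTPServer

SENTINEL_ADDR = ("127.0.0.1", 26379)   # assumption: local Sentinel, default port
MASTER_NAME = "mymaster"               # assumption: name used in sentinel.conf
LOCAL_REDIS = ("192.0.2.10", "6379")   # hypothetical: this node as Sentinel reports it


def encode_command(*args):
    """Encode a command as a RESP array of bulk strings."""
    out = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = str(arg).encode()
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(out)


def parse_master_addr(reply):
    """Parse a get-master-addr-by-name reply into (ip, port) strings.

    Expected shape: *2 CRLF $n CRLF ip CRLF $m CRLF port CRLF.
    Returns None for a null or error reply.
    """
    parts = reply.split(b"\r\n")
    if parts[0] != b"*2" or len(parts) < 5:
        return None
    return parts[2].decode(), parts[4].decode()


def ask_sentinel(addr=SENTINEL_ADDR, name=MASTER_NAME, timeout=1.0):
    """Ask Sentinel which address it currently considers the master."""
    with socket.create_connection(addr, timeout=timeout) as sock:
        sock.sendall(encode_command("SENTINEL", "get-master-addr-by-name", name))
        reply = b""
        # A full answer has 5 CRLF-terminated lines; null and error replies have 1.
        while reply.count(b"\r\n") < 5 and not reply.startswith((b"*-1\r\n", b"-")):
            chunk = sock.recv(4096)
            if not chunk:
                break
            reply += chunk
    return parse_master_addr(reply)


def health_status(master_addr, local_addr=LOCAL_REDIS):
    """Return 200 only if Sentinel currently names this node as master."""
    return 200 if master_addr == local_addr else 503


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        try:
            status = health_status(ask_sentinel())
        except OSError:
            status = 503  # Sentinel unreachable: report unhealthy
        self.send_response(status)
        self.end_headers()


def serve(port=8080):
    """Run the health check endpoint for the load balancer to poll."""
    HTTPServer(("0.0.0.0", port), HealthHandler).serve_forever()
```

Calling `serve()` on each node gives the load balancer an endpoint that returns 200 only on the node Sentinel regards as master, so a freshly restarted node that still reports itself as master receives no traffic. `LOCAL_REDIS` has to match the address exactly as Sentinel reports it (IP versus hostname), otherwise the comparison never succeeds.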
