Questions tagged [high-availability]

High availability is a software design approach and implementation that ensures a prearranged level of operational performance will be met during a contractual measurement period.

Attributes of high availability (HA):

Maximum uptime
Online maintenance - with little or no service interruption.
Simplicity - Complexity is the enemy of reliability and encourages operator error, so it is best avoided (e.g., does a particular use case really require the burden of implementing HA?).

Approaches that increase availability:

Fault tolerance: Duplicate services wait to take over should the primary fail or become unreachable.
- Active/Active: enables load balancing (+), but is more complicated (-).
- Active/Passive: simpler (+), but does not increase load capacity (-).
Replication:
- Synchronous: safer (+), but slow over longer distances (-).
  - Favors consistency, the "C" in the CAP theorem.
- Asynchronous: faster (+), but data loss is possible (-).
  - Favors availability, the "A" in the CAP theorem.
Graceful degradation: Rate limiting and client throttling.
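
As a rough illustration of the Active/Passive approach listed above, the following Python sketch picks the first healthy node in priority order, so the standby takes over only when the primary's health check fails. The `Node` class, the `pick_active` function, and the lambda health probes are hypothetical names made up for this example; a real deployment would probe over the network and would also have to guard against split-brain.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Node:
    """A service instance with a health probe (hypothetical, for illustration)."""
    name: str
    is_healthy: Callable[[], bool]


def pick_active(nodes: List[Node]) -> Optional[Node]:
    """Return the first healthy node in priority order, or None if all are down."""
    for node in nodes:
        if node.is_healthy():
            return node
    return None


# Simulated health state; a real probe would be a network check.
primary_up = {"value": True}
primary = Node("primary", lambda: primary_up["value"])
standby = Node("standby", lambda: True)

before = pick_active([primary, standby]).name
primary_up["value"] = False  # simulate the primary failing
after = pick_active([primary, standby]).name

print(before, after)  # prints "primary standby"
```

Because the standby serves traffic only after a failover, this arrangement is simpler than Active/Active but adds no load capacity.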

1355 questions

votes

11 answers

ZooKeeper alternatives? (cluster coordination service)

ZooKeeper is a highly available coordination service for data centers. It originated in the Hadoop project. One can implement locking, fail over, leader election, group membership and other coordination issues on top of it. Are there any…

asked May 18 '11 at 16:26

Thomas Koch

2,723
2
26
36

votes

5 answers

Web App: High Availability / How to prevent a single point of failure?

Can someone explain to me how high-availability ("HA") works for a web application ... because I assume HA means that there exists no single point of failure. However, even if a load balancer is used, isn't that the single point of failure?

load-balancing high-availability cluster-computing uptime

asked Oct 30 '11 at 03:52

nickb

8,430
11
34
46

votes

4 answers

Redis master/slave replication - single point of failure?

How does one upgrade to a newer version of Redis with zero downtime? Redis slaves are read-only, so it seems like you'd have to take down the master and your site would be read-only for 45 seconds or more while you waited for it to reload the DB. Is…

redis high-availability

asked Jan 18 '11 at 00:20

nornagon

14,011
16
68
84

votes

5 answers

How to Guarantee Message delivery with Celery?

I have a Python application where I want to start doing more work in the background so that it will scale better as it gets busier. In the past I have used Celery for doing normal background tasks, and this has worked well. The only difference…

message-queue redis rabbitmq celery high-availability

asked Jul 05 '11 at 01:14

Ken Cochrane

68,551
9
45
57

votes

4 answers

Method to replicate sqlite database across multiple servers

I'm developing a distributed application, and I have a SQLite database that must be shared between distributed servers. If I'm on serverA and change a SQLite row, this change must appear on the other servers instantly, but if a server were…

sqlite replication distributed-computing high-availability rethinkdb

asked Apr 16 '13 at 09:02

ManuParra

1,221
6
14
33

votes

4 answers

Scala + Akka: How to develop a Multi-Machine Highly Available Cluster

We're developing a server system in Scala + Akka for a game that will serve clients in Android, iPhone, and Second Life. There are parts of this server that need to be highly available, running on multiple machines. If one of those servers dies…

scala high-availability fault-tolerance akka

asked Sep 11 '10 at 21:10

Unoti

1,255
10
12

votes

5 answers

Design Patterns (or techniques) for Scalability

What design patterns or techniques have you used that are specifically geared toward scalability? Patterns such as the Flyweight pattern seem to me to be a specialized version of the Factory Pattern, to promote high scalability or when working…

design-patterns scalability high-availability

asked Sep 17 '09 at 15:02

Chris Ballance

32,056
25
101
147

votes

7 answers

name node Vs secondary name node

Hadoop is consistent and partition tolerant, i.e., it falls under the CP category of the CAP theorem. Hadoop is not available because all the nodes are dependent on the name node. If the name node fails, the cluster goes down. But considering the fact…

hadoop hdfs hadoop2 high-availability

asked Nov 14 '13 at 05:47

Sam

2,207
7
31
53

votes

2 answers

How to setup Jenkins with HA?

Currently we are using Jenkins as our CI system, with one master server and slaves which are provisioned by SaltStack on OpenStack. If our Jenkins master server goes down, we need to create a new master and we need to pull the files from…

linux jenkins continuous-integration jenkins-plugins high-availability

asked Mar 23 '16 at 08:34

Vishnu Nair

1,301
1
12
16

votes

3 answers

Which part of the CAP theorem does Cassandra sacrifice and why?

There is a great talk here about simulating partition issues in Cassandra with Kingsbury's Jepsen library. My question is - with Cassandra are you mainly concerned with the Partitioning part of the CAP theorem, or is Consistency a factor you need to…

cassandra partitioning high-availability consistency cap-theorem

asked Nov 25 '13 at 23:42

hawkeye

31,052
27
133
271

votes

1 answer

Why should a production Kubernetes cluster have a minimum of three nodes?

The first section of the official Kubernetes tutorial states that "A Kubernetes cluster that handles production traffic should have a minimum of three nodes," but gives no rationale for why three is preferred. Is three desirable over two in order…

kubernetes high-availability

asked Feb 17 '18 at 21:07

rjs

votes

13 answers

How do you update a live, busy web site in the politest way possible?

When you roll out changes to a live web site, how do you go about checking that the live system is working correctly? Which tools do you use? Who does it? Do you block access to the site for the testing period? What amount of downtime is…

release-management high-availability

asked Sep 15 '08 at 21:51

Tim Booker

2,711
1
23
34

votes

3 answers

Why are RDBMS considered Available (CA) for CAP Theorem

If I understand the CAP Theorem correctly, availability means that the cluster continues to operate even if a node goes down. I've seen a lot of people (http://blog.nahurst.com/tag/guide) list RDBMS as CA, but I do not understand how RDBMS is…

rdbms high-availability cap-theorem nosql

asked Apr 16 '15 at 01:09

PiedPiper

votes

7 answers

How to design and verify distributed systems?

I've been working on a project, which is a combination of an application server and an object database, and is currently running on a single machine only. Some time ago I read a paper which describes a distributed relational database, and got some…

distributed protocols high-availability formal-verification

asked Feb 07 '09 at 17:48

Esko Luontola

71,072
15
108
126

votes

4 answers

Do load balancers flood?

I am reading about load balancing. I understand the idea that load balancers distribute the load among several slave servers of any given app. However, very little of the literature that I can find talks about what happens when the load balancers themselves…

amazon-web-services networking load-balancing high-availability

asked Apr 21 '16 at 13:58

PedroD

4,310
8
38
75
