16

I am reading about load balancing.

I understand the idea that load balancers distribute the load among several slave servers of a given app. However, very little of the literature I can find talks about what happens when the load balancers themselves start struggling with a huge number of requests, to the point that the "simple" task of load balancing (distributing requests among slaves) becomes an impossible undertaking.

Take for example this picture where you see 3 Load Balancers (LB) and some slave servers.

Figure 1: Clients know one IP to connect to. One load balancer sits behind that IP and has to handle all the requests, so that first load balancer (along with the internet connection) is the bottleneck.

What happens when the first load balancer starts struggling? If I add a new load balancer alongside the first one, I must add yet another one in front so that the clients still only need to know one IP. So the dilemma continues: I still have only one load balancer receiving all my requests...!

Figure 2: I added one load balancer, but to let clients keep knowing just one IP I had to add another one to centralize the incoming connections, ending up with the same bottleneck.

Moreover, my internet connection will also reach the limit of clients it can handle, so I will probably want to place my load balancers in remote locations to avoid flooding it. However, if I distribute my load balancers and still want my clients to know just one IP to connect to, I again need one central load balancer behind that IP carrying all the traffic...

How do real-world companies like Google and Facebook handle these issues? Can this be done without giving the clients multiple IPs and expecting them to choose one at random, so that not every client connects to the same load balancer and floods it?

PedroD
I answered this in your previous question: http://stackoverflow.com/questions/36755884/how-do-i-scale-a-java-app-with-a-rest-api-and-a-database Elastic Load Balancers are a service, not a single instance. You get at least one per availability zone, and they are automatically scaled up to handle your traffic. – Mark B Apr 21 '16 at 14:34

4 Answers

10

Your question doesn't sound AWS-specific, so here's a generic answer (Elastic Load Balancing in AWS auto-scales depending on traffic):

You're right, you can overwhelm a load balancer with the number of incoming requests. If you deploy an LB on a standard-build machine, you're likely to first exhaust or overload the network stack, including the maximum number of open connections and the rate at which incoming connections can be handled.

As a first step, you would fine-tune the network stack of your LB machine. If that still does not provide the required throughput, there are special-purpose load balancer appliances on the market that are built from the ground up and highly optimized to handle a large number of incoming connections and route them to several servers. Examples of these are F5 and NetScaler.

You can also design your application in ways that help you split traffic across different subdomains, thereby reducing the number of requests a single LB has to handle.

It is also possible to implement round-robin DNS, where a single DNS entry points to several client-facing LBs instead of just one as you've depicted.
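To make the round-robin DNS idea a bit more concrete, here is a toy simulation in Python. It is only a sketch: the `RoundRobinDNS` class, the hostname and the IPs (taken from a documentation address range) are made up for illustration, and a real DNS server does this rotation for you when a name has several A records.

```python
from collections import deque


class RoundRobinDNS:
    """Toy model of a DNS server that rotates the A records it returns."""

    def __init__(self, records):
        # records: hostname -> list of load balancer IPs (hypothetical values)
        self._records = {name: deque(ips) for name, ips in records.items()}

    def resolve(self, hostname):
        ips = self._records[hostname]
        answer = list(ips)  # the ordering this particular client sees
        ips.rotate(-1)      # the next query starts with the next IP
        return answer


dns = RoundRobinDNS(
    {"service.example.com": ["203.0.113.10", "203.0.113.11", "203.0.113.12"]}
)

# Most clients simply connect to the first address in the answer,
# so successive lookups end up on different load balancers.
first_choices = [dns.resolve("service.example.com")[0] for _ in range(6)]
print(first_choices)
```

Clients still only know one name, but no single load balancer sits in front of the others. In practice, caching in resolvers makes the spread less even than in this simulation.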

RaGe
6

Advanced load balancers like NetScaler and similar also do GSLB (global server load balancing) through DNS rather than simple DNS round-robin, which explains how you can scale further.

If clients are to connect to, e.g., service.domain.com, you let the load balancers become the authoritative DNS for the zone and you add all the load balancers as valid name servers.

When a client looks up "service.domain.com", any of your load balancers will answer the DNS request and reply with the IP of the correct data center for your client. You can further make the load balancer answer the DNS request based on the geolocation of your client or the latency between the client's DNS server and the NetScaler, or you can answer based on the load of the different data centers.
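As a rough illustration of that decision, here is a small Python sketch of how a DNS answer could be picked from region and load. Everything in it is an assumption for the example: the `DataCenter` fields, the `pick_data_center` function, the 0.8 load threshold and the IPs are hypothetical, and this is not how NetScaler or any other product actually implements GSLB.

```python
from dataclasses import dataclass


@dataclass
class DataCenter:
    name: str
    vip: str      # public IP of the load balancer cluster in this data center
    region: str
    load: float   # 0.0 (idle) to 1.0 (saturated)


def pick_data_center(client_region, data_centers, max_load=0.8):
    """Return the data center whose IP the DNS reply should contain."""
    # Skip overloaded sites, unless every site is overloaded.
    healthy = [dc for dc in data_centers if dc.load < max_load]
    candidates = healthy or data_centers
    # Prefer a site in the client's region, otherwise consider all candidates.
    local = [dc for dc in candidates if dc.region == client_region]
    pool = local or candidates
    # Among the remaining sites, answer with the least loaded one.
    return min(pool, key=lambda dc: dc.load)


sites = [
    DataCenter("eu-west", "198.51.100.10", "eu", 0.35),
    DataCenter("us-east", "198.51.100.20", "us", 0.90),
    DataCenter("ap-south", "198.51.100.30", "ap", 0.50),
]

for region in ("eu", "us", "ap"):
    chosen = pick_data_center(region, sites)
    print(region, "->", chosen.name, chosen.vip)
```

Here the "us" client is sent to another region because its local site is above the load threshold. The point is that the choice happens at DNS lookup time, so the request traffic itself never passes through one central box.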

In each data center you typically set up one node, or several nodes in a cluster. You can scale quite high using such a design.

KaiT
2

Since you tagged Amazon: they have load balancers built into their system, so you don't need to run your own. Just use ELB and Amazon will direct the traffic to the correct system.

If you are doing it yourself, load balancers typically have a very light processing load. They typically do little more than redirect a connection from one machine to another based on a shallow inspection (or no inspection) of the data. It is possible for them to be overwhelmed, but typically that requires a load that would saturate most connections.
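To show how little work "no inspection" can mean, here is a minimal Python sketch that picks a backend from the connection's source address alone, without looking at the request data. The `pick_backend` function and the addresses are made up for illustration; real load balancers use a variety of strategies (round-robin, least connections, hashing and so on).

```python
import hashlib


def pick_backend(client_ip, client_port, backends):
    """Choose a backend from the connection's source address only."""
    key = f"{client_ip}:{client_port}".encode()
    digest = hashlib.sha256(key).digest()
    # Map the hash onto the list of backends; the payload is never read.
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


backends = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]  # hypothetical servers

# Count how 1000 connections from one client IP spread over the backends.
counts = {b: 0 for b in backends}
for port in range(40000, 41000):
    counts[pick_backend("192.0.2.7", port, backends)] += 1
print(counts)
```

The per-connection cost is one hash and one list lookup, which is why the network stack and the link usually give out before the balancing logic does.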

If you are running it yourself, and if your load balancer is doing more work or your connection is getting saturated, the next step is to use Round-Robin DNS for looking up your load balancers, generally using a combination of NS and CNAME records so different name lookups give different IP addresses.

Bryan
1

If you plan to use Amazon Elastic Load Balancing, they claim that:

Elastic Load Balancing automatically scales its request handling capacity to meet the demands of application traffic. Additionally, Elastic Load Balancing offers integration with Auto Scaling to ensure that you have back-end capacity to meet varying levels of traffic levels without requiring manual intervention.

So you can go with them and do not need to run the load balancer on your own instance/product.

Frederic Henri