First, you need to find out how your application scales by running PAT-style tests.
Beyond that, there are three main strategies for handling the load:
- The site can invest in a single huge machine with lots of processing
power, memory, disk space and redundancy.
- The site can distribute the load across a number of machines.
- The site can use some combination of the first two options.
When a site shows a different hostname each time you visit (for example www1.xyz.com, www2.xyz.com, www3.xyz.com, etc.), you know that the site is using the second approach at the front end. Typically the site will have an array of stand-alone machines, each running Web server software and each with access to an identical copy of the site's pages. Incoming requests for pages are spread across the machines in one of two ways:
- The Domain Name System (DNS) for the site can distribute the load. DNS is an Internet service that translates domain names into IP addresses. Each time the site's name is looked up, DNS rotates through the available IP addresses in a circular (round-robin) way to share the load. The individual servers would have common access to the same set of Web pages for the site.
- Load balancing switches can distribute the load. All requests for the Web site arrive at a single machine that then passes each request to one of the available servers. The switch can find out from the servers which one is least loaded, so all of them do a roughly equal amount of work. This is the approach that HowStuffWorks uses with its servers: the load balancer spreads the load among three different Web servers, and any one of the three can fail with no effect on the site.
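
The two distribution schemes above can be sketched in a few lines of Python. This is only a toy model to show the selection logic, not how a real DNS server or load balancing switch is implemented. The class names, the IP addresses and the server names (web1, web2, web3) are made up for illustration, and "load" is assumed here to mean the number of requests a server is currently handling.

```python
from itertools import cycle


class RoundRobinDNS:
    """Toy model of round-robin DNS: each lookup returns the next address."""

    def __init__(self, addresses):
        # cycle() repeats the address list forever, in order.
        self._rotation = cycle(list(addresses))

    def resolve(self):
        return next(self._rotation)


class LeastLoadedBalancer:
    """Toy model of a load balancer that picks the least-loaded server."""

    def __init__(self, servers):
        # Assumption: load = number of requests currently in flight.
        self.active = {name: 0 for name in servers}

    def acquire(self):
        """Pick the server with the fewest in-flight requests."""
        if not self.active:
            raise RuntimeError("no servers available")
        name = min(self.active, key=self.active.get)
        self.active[name] += 1
        return name

    def release(self, name):
        """Record that a request on this server has finished."""
        if self.active.get(name, 0) > 0:
            self.active[name] -= 1

    def mark_failed(self, name):
        """Stop sending traffic to a server that has gone down."""
        self.active.pop(name, None)
```

With three servers behind the balancer, calling mark_failed("web1") simply removes that server from the pool and later requests go to the remaining two, which is why one machine can fail without taking the site down. Round-robin DNS, by contrast, knows nothing about load or failures: it just hands out the next address in the list.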