As software engineering students or IT enthusiasts, most of us have wondered how large applications handle millions of simultaneous requests on their servers. The work developers put into making a successful system scale is largely invisible to its users. As an application becomes popular and user traffic grows, the load on a single server grows with it, and the application slows down. The usual remedy is to scale the system by adding more machines to the network to share the traffic.

Once there are multiple servers on the network, something has to decide which server handles which request. That job belongs to the load balancer: it distributes the system's incoming requests across the servers, routing each one so that no machine carries a disproportionate share of the load.

Load balancing is an important and widely used topic in System Design, and it comes up frequently in technical interview rounds at IT companies. Let us now look at the basics, types and algorithms of load balancing in detail.

Load Balancer

You can think of a load balancer as a traffic police officer standing between the request traffic and the network of servers, deciding which request should be routed to which server. It divides incoming requests, such as read or write operations on a database or cache, among the servers so that no single server has to handle too many requests at once. A load balancer can be either a physical hardware device or a virtual instance running on a cloud-based service.

Why do we need a Load Balancer?

To answer this question, imagine what happens when a system with no load balancer receives a large number of requests at once. Without load balancing, every request goes directly to a single server. The primary problems that such a system would face are:

Higher Chances of Failure

Because a single server carries all the user traffic, it is a single point of failure. If it crashes or goes down, the entire application is interrupted and users cannot access the service. In simple words, the system is more prone to failure, and repeated unavailability drives users away, which is a loss for the business.

Overloaded Servers

Every server has a limit on the number of requests it can handle smoothly. As traffic keeps increasing beyond that limit, the server slows down, and it may even crash if the load gets high enough. This is the problem of an overloaded server. To solve it, we need to add more machines to the network and distribute the requests among them.

To distribute the requests as described above, we need a load balancer between the internet and our network of machines. With a load balancer, we can handle growing user traffic by adding web servers to the network and spreading the requests among them. Another benefit is that even if a server crashes or goes offline for some reason, the application continues to provide the service.

Also, with this technique, the time taken to execute each request is reduced, because each additional server adds RAM, disk storage and processing capacity to the system.

Load balancers help reduce the response time and increase the overall throughput of the servers in the network.

Load balancers keep the system available even if one of the servers goes offline, because they distribute requests only among the servers that are online.

Load balancers keep track of each server's ability to handle requests by performing regular health checks.

Depending on the user traffic, the load balancer can also vary the number of servers across which requests are distributed.

Another benefit is security: compared with the same configuration without load balancing, users only get to know the address of the load balancer and not the addresses of the actual servers.
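
The health-check behaviour described above can be sketched in a few lines of Python. This is only an illustration, not how any particular product works: the Backend and HealthTracker names, the server names and the max_failures threshold are all assumptions made for the example. A server is taken out of rotation after a number of failed checks in a row and comes back after one successful check.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    """A hypothetical backend server tracked by the load balancer."""
    name: str
    healthy: bool = True
    consecutive_failures: int = 0


class HealthTracker:
    """Marks a backend offline after `max_failures` failed checks in a row."""

    def __init__(self, backends, max_failures=3):
        self.backends = {b.name: b for b in backends}
        self.max_failures = max_failures

    def record_check(self, name, ok):
        """Store the result of one health check for the named backend."""
        backend = self.backends[name]
        if ok:
            # One successful check brings the backend back into rotation.
            backend.consecutive_failures = 0
            backend.healthy = True
        else:
            backend.consecutive_failures += 1
            if backend.consecutive_failures >= self.max_failures:
                backend.healthy = False

    def online(self):
        """Return the names of backends that may receive traffic."""
        return [b.name for b in self.backends.values() if b.healthy]


tracker = HealthTracker([Backend("app-1"), Backend("app-2")], max_failures=2)
tracker.record_check("app-2", ok=False)
tracker.record_check("app-2", ok=False)  # second failure in a row: taken offline
print(tracker.online())  # ['app-1']
```

A real load balancer would run such checks on a timer, for example by opening a TCP connection or requesting a status URL, and would choose servers only from the online list.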

Locating Load Balancer in your System

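A load balancer can be placed at several points in a system:

Between the users and the web servers.
Between the web servers and the application or job servers.
Between the application servers and the cache servers.
Between the cache servers and the database servers.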

Types of Load Balancers

Load balancing is a necessity in your systems to avoid most of these problems. It can be implemented in three ways:

Hardware Load Balancers
Software Load Balancers on the client side
Software Load Balancers on the server side

Although the names suggest how each type of load balancer is implemented, let us look at the basic idea behind each implementation in detail.

Hardware Load Balancers

Hardware load balancers are physical devices placed in the network to control and manage the flow of traffic across a cluster of machines. They are capable of handling all kinds of traffic, such as HTTPS, HTTP, TCP, and UDP. Hardware load balancers are commonly known as Layer 4-7 routers.

A hardware load balancer secures the network by exposing a virtual server address that points to the load balancer instead of the actual machines in your system. Hardware load balancers are deployed on the server side. When a request arrives, the load balancer forwards it to the most appropriate real server by performing bi-directional NAT (Network Address Translation). Hardware load balancers can handle a vast amount of traffic simultaneously, but they offer limited flexibility and are often expensive.

Hardware load balancers perform regular health checks to ensure that each server is in good condition and responding. If a server fails these checks, the load balancer stops sending traffic to it to maintain the system's availability.

Acquiring and configuring hardware load balancers is costly, which is the primary reason why most firms use them only at the entry point and rely on other types of load balancers to distribute requests within the network.

Software Load Balancer on the Client Side

Load balancers do not have to be implemented as hardware devices. Depending on your system's needs, you can implement the load balancer as software logic on the client side of your application. On every request, this logic gives the client application a list of all the available (online) servers in the network, and the application usually chooses the first one on the list to execute the request.

If the selected server keeps failing even after the configured number of retries, the logic marks it as unavailable and removes it from the list. The application then chooses the next available server on the list.

Software load balancing is one of the fastest and least expensive ways to add load balancing to your systems. Implementing a client-side software load balancer is also comparatively easier than deploying a physical one.
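
The retry-and-discard logic described above might look roughly like the following Python sketch. The ClientSideBalancer class, the server addresses and the fake_send function are hypothetical and exist only for illustration. The sketch assumes the caller supplies a send function that raises ConnectionError when a server cannot be reached.

```python
class ClientSideBalancer:
    """Tries servers in list order and drops one that keeps failing."""

    def __init__(self, servers, max_retries=2):
        self.available = list(servers)
        self.max_retries = max_retries

    def call(self, send):
        """Run `send(server)` against the first server that answers."""
        while self.available:
            server = self.available[0]
            # One initial attempt plus `max_retries` retries on the same server.
            for _ in range(self.max_retries + 1):
                try:
                    return send(server)
                except ConnectionError:
                    continue
            # Still failing after all retries: mark the server as unavailable.
            self.available.pop(0)
        raise RuntimeError("no servers available")


def fake_send(server):
    """Stand-in for a real network call; the first address always fails."""
    if server == "10.0.0.1":
        raise ConnectionError("unreachable")
    return f"handled by {server}"


lb = ClientSideBalancer(["10.0.0.1", "10.0.0.2"])
print(lb.call(fake_send))  # handled by 10.0.0.2
print(lb.available)        # ['10.0.0.2']
```

In practice the client would also refresh its server list from time to time so that a server that has recovered can be used again.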

Software Load Balancer on the Server Side

As we have seen, a software load balancer is essentially a piece of code that distributes user traffic across the network of servers. Software load balancers can be implemented on the server side, placed in front of the group of servers that handle the requests. Load balancers on the server side give you maximum flexibility in your design.

Because it can run on any standard machine, for example one running Windows or Linux, there is no need to install special hardware, which makes it more affordable and reduces maintenance. Most firms prefer an off-the-shelf load balancer, but you always have the option of implementing one yourself.

Many applications let you install a load balancer directly in your system. They are either installed on a server or consumed as a service (Load Balancer as a Service, LBaaS). With LBaaS, the cloud service provider takes responsibility for installing, upgrading and configuring the load balancer in your system.

Some of the top software products that you can use to implement load balancing in your system are as follows:

Nginx
Avi Vantage Software Load Balancer
HAProxy
Kemp LoadMaster
Loadbalancer.org
ManageEngine OpManager
Citrix ADC
Barracuda Load Balancer ADC
Incapsula
Total Uptime Cloud Load Balancer
jetNEXUS Load Balancers

Categories of Load Balancing

Load balancers are commonly divided into three categories:

1. Layer 4 (L4) Load Balancer

The L4 load balancers are also known as Network Load Balancers. They use information from the 4th (Transport) layer of the OSI model to decide which machine the user traffic should be routed to. Layer 4 load balancers help you utilize the servers efficiently and maximize your system's availability by distributing the user traffic across the available servers.

L4 load balancers can handle even millions of client requests per second of TCP or UDP traffic. The routing decision is based on the source and destination IP addresses and the TCP or UDP ports of the network packets received. Layer 4 load balancers also perform Network Address Translation (NAT) on the network packets, but they do not inspect or alter the content of any of these packets.

2. Layer 7 (L7) Load Balancer

Layer 7 load balancers are commonly known as Application Load Balancers or HTTP(S) Load Balancers. In the OSI model, Layer 7 is the application layer (HTTP / HTTPS), and these load balancers make their routing decisions based on application-level information.

Application Load Balancers examine the HTTP headers, cookies, URL, and SSL session details, along with HTML form data, to decide which server should handle which type of request.
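
To make content-based routing concrete, here is a small Python sketch that picks a server pool from the URL path, a header and a cookie. The choose_pool function, the pool names and the rules themselves are invented for this example. Real application load balancers express such rules in their own configuration formats.

```python
def choose_pool(path, headers, cookies):
    """Return the name of the (hypothetical) server pool for one HTTP request."""
    if path.startswith("/api/"):
        # Route by URL: API calls go to the API servers.
        return "api-pool"
    if headers.get("Accept", "").startswith("image/"):
        # Route by HTTP header: image requests go to the media servers.
        return "media-pool"
    if cookies.get("beta") == "1":
        # Route by cookie: opted-in users see the beta servers.
        return "beta-pool"
    return "web-pool"


print(choose_pool("/api/orders", {}, {}))                             # api-pool
print(choose_pool("/home", {"Accept": "text/html"}, {"beta": "1"}))   # beta-pool
print(choose_pool("/home", {"Accept": "text/html"}, {}))              # web-pool
```

Once a pool has been chosen, any of the load balancing algorithms can be used to pick a single server inside it.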

3. Global Server Load Balancing (GSLB)

Many applications are hosted in cloud-based data centers in multiple geographic locations. Organizations running such applications therefore look for load balancers that can deliver them with more reliability and less delay, even for requests coming from remote devices or locations. GSLB is the appropriate choice for companies with such requirements. Global Server Load Balancers extend the capabilities of L4 and L7 load balancers across geographic locations so that traffic can be distributed adequately across all the servers or machines.

The GSLB also ensures a consistent experience for end users when they need to navigate across multiple applications in a virtual workspace.

Load Balancing Algorithms

There are various load balancing algorithms for deciding how requests are routed in your applications, and different firms use different algorithms according to their requirements. Some of the most commonly used load balancing algorithms are as follows:

1. Round Robin

The requests are distributed among all the machines or servers in the network sequentially, in rotation. This approach works fine, but it does not check the existing load of a server before routing a request to it, so the risk of a server getting overloaded remains.
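
A minimal Python sketch of this rotation is shown below. The RoundRobinBalancer class and the server names are made up for the example. Note that the code never looks at how busy a server is, which is exactly the limitation described above.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out servers one after another, wrapping around at the end."""

    def __init__(self, servers):
        self._rotation = cycle(servers)

    def next_server(self):
        """Return the server that should receive the next request."""
        return next(self._rotation)


lb = RoundRobinBalancer(["s1", "s2", "s3"])
print([lb.next_server() for _ in range(5)])  # ['s1', 's2', 's3', 's1', 's2']
```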

2. Weighted Round Robin

This approach is similar to the Round Robin algorithm, except that every server is assigned a specific weight, and the requests are routed so that a server with a higher weight handles more requests than servers with lower weights.
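
One simple way to sketch this in Python is to repeat each server in the rotation as many times as its weight. The weighted_rotation function, the server names and the weights below are assumptions for the example. This naive version sends bursts of requests to the heavier server, and production load balancers typically interleave the picks more evenly.

```python
from collections import Counter
from itertools import cycle


def weighted_rotation(weights):
    """Build an endless rotation where each server appears `weight` times per round."""
    expanded = [server for server, weight in weights.items() for _ in range(weight)]
    return cycle(expanded)


# Hypothetical weights: "big" has three times the capacity of "small".
rotation = weighted_rotation({"big": 3, "small": 1})
picks = [next(rotation) for _ in range(8)]
print(picks)
print(Counter(picks))  # "big" is chosen 6 times and "small" 2 times
```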

3. Least Connection Method

This algorithm addresses the problem faced in the round-robin approach. We first check the number of active connections or requests associated with each server, and the new request is then routed to the machine with the fewest active connections. The Least Connection Method is a bit costlier than the Round Robin algorithm because this count must be tracked, but since the decision takes the existing load on each server into account, the chances of a server being overloaded are lower than with round robin.
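
The bookkeeping involved can be sketched in Python as follows. The LeastConnectionsBalancer class and its acquire and release methods are hypothetical names chosen for the example. The balancer keeps a counter per server, increases it when a request starts and decreases it when the request finishes.

```python
class LeastConnectionsBalancer:
    """Routes each new request to the server with the fewest active connections."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def acquire(self):
        """Pick the least-loaded server and count the new connection on it."""
        # On a tie, min() returns the server that was listed first.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        """Call when a request on `server` has finished."""
        self.active[server] -= 1


lb = LeastConnectionsBalancer(["s1", "s2"])
a = lb.acquire()  # s1: both servers are idle, so the first one is used
b = lb.acquire()  # s2: s1 now has one active connection
lb.release(a)     # the request on s1 finishes
c = lb.acquire()  # s1: it is idle again while s2 is still busy
print(a, b, c)    # s1 s2 s1
```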

4. Least Response Time Method

This algorithm is a bit more complicated than the Least Connection Method. The least response time method routes the request to the server with the fewest active connections and the lowest average response time. Selecting a server with the lowest response time reduces the latency of serving requests and hence improves the performance of your application.
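
There is more than one way to combine the two measurements. The Python sketch below assumes one possible rule: compare active connections first and use the average response time to break ties. The ServerStats class, the pick function and the timing values are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class ServerStats:
    """Per-server measurements kept by a hypothetical load balancer."""
    active: int = 0
    avg_ms: float = 0.0
    samples: int = 0

    def record(self, elapsed_ms):
        """Fold one observed response time into the running average."""
        self.samples += 1
        self.avg_ms += (elapsed_ms - self.avg_ms) / self.samples


def pick(stats):
    """Fewest active connections first, then lowest average response time."""
    return min(stats, key=lambda name: (stats[name].active, stats[name].avg_ms))


stats = {"s1": ServerStats(), "s2": ServerStats()}
stats["s1"].record(120.0)
stats["s1"].record(80.0)   # s1 now averages 100 ms
stats["s2"].record(40.0)   # s2 averages 40 ms
print(pick(stats))         # s2: same connection count, but it responds faster

stats["s2"].active = 3     # s2 becomes busy
print(pick(stats))         # s1: fewer active connections wins
```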

5. Source IP Hash

In the Source IP Hash method, the client's IP address determines the server to which the request is routed. The IP address of the client and that of the receiving compute instance are combined and passed through a hash function to produce a key, and the key is used to divide the network traffic among the servers. Because the same addresses always produce the same key, requests from a given client keep going to the same server.
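
As a rough Python illustration, the sketch below hashes only the client IP address, which is a simplification, and maps the result onto the list of servers. The pick_server function and the addresses are made up for the example. SHA-256 from the standard hashlib module is used because Python's built-in hash() gives different results for strings from one run to the next.

```python
import hashlib


def pick_server(client_ip, servers):
    """Map a client IP address to one server, always the same one for that IP."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    # Turn the first 8 bytes of the digest into a number, then into a list index.
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


servers = ["s1", "s2", "s3"]
first = pick_server("203.0.113.7", servers)
second = pick_server("203.0.113.7", servers)
print(first == second)  # True: the same client always lands on the same server
```

One drawback of this simple modulo mapping is that adding or removing a server changes the assignment of most clients.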
