Linux Load Balancing/High Availability Cluster

Building a load-balancing, highly available (HA) cluster with Linux means your company doesn't have to spend big bucks to scale your application with increasing demand or to keep critical services up. Most companies respond to stringent service-level requirements or poor application performance by buying larger, more expensive machines or, worse, turning to a proprietary load-balancing or HA appliance.

Using Linux Virtual Server (LVS), the Linux Director Daemon (ldirectord) and Linux-HA, one can quickly create a load-balanced, HA server farm ranging from a 2-node cluster to any number of nodes, limited only by your network bandwidth and configuration. In any LVS setup, three components are needed to create a highly available, load-balanced cluster, namely:

  • A load-balancer (LVS),
  • A cluster manager (ldirectord), and
  • A fail-over service (linux-ha)

In a load-balanced scenario, where a cluster of real servers provides services on behalf of a virtual server, the failure of one or more real servers still leaves the remaining operational servers able to provide the service via the load balancer. It is the job of the cluster manager to remove dead or unresponsive servers from the cluster configuration and to make servers available again once they come back online. The last component, the fail-over service, protects against the single point of failure should the load-balancer itself suffer an outage. Typically there are at least two load balancers in any setup, with the second in passive mode, ready to take over the virtual IP if the primary load-balancer fails.

Creating a 2-node load-balanced cluster 

In this post we will cover creating a simple two-node load-balancing solution. The typical scenario is any small or medium company that wants to provide some level of uptime and performance guarantee. In this example we will not implement a fail-over load-balancer, assuming that a single load-balancer represents an acceptable level of risk. We will use Red Hat for this tutorial, but the solution is not Red Hat specific: the packages are available on all major distributions. One of the reasons we like this solution is that it can be replicated on any distribution with ease.

The solution makes use of three physical servers: 1 load balancer and 2 web servers for the cluster. Using virtualisation you can get away with two physical machines by running the load balancer and 1 web server in a virtualised environment on the same piece of tin. (You could just use 1 machine with 3 virtualised servers, but then what would be the point?)

First you will need to install the required packages on the load-balancer with:

yum install ipvsadm heartbeat heartbeat-ldirectord

Although we install the heartbeat package, we will not be using it. On Red Hat it is a required dependency of the heartbeat-ldirectord package, but it is not declared as one, so yum does not pull it in automatically when installing heartbeat-ldirectord. Our setup is as follows:

  • load balancer - 192.168.1.20 (virtual IP, or VIP: the address external clients connect to) and 192.168.22.2 (internal IP),
  • real server 1 - 192.168.22.4,
  • real server 2 - 192.168.22.5

 IPVSADM - Load Balancer Setup 

To set up load balancing one uses the ipvsadm userland utility to configure the relevant kernel modules. We cover the command-line utility here for completeness, as the ldirectord service handles the ipvsadm setup for you once configured. So these commands may seem irrelevant, but it's always good to know what is going on. Besides, if you don't know what it is doing, how will you fix it when it's broken?

There are three approaches to load balancing:

  • Network Address Translation - requests arrive at the load-balancer and the destination IP address is changed from the public virtual IP to the IP of one of the available real servers. The real server's response is then sent back through the load-balancer, which changes its source address to the public virtual IP.
  • IP Tunneling - requests are sent through an IP tunnel from the load balancer to one of the real servers, with the real server's response going directly to the client, i.e. bypassing the load-balancer on the way out.
  • Direct Route - requests arrive at the load balancer and are sent to one of the real servers without modification of the IP packet (only the destination MAC address is rewritten). The real servers are all configured with the same public virtual IP. The real server responds directly to the client. This setup requires that the virtual IP device on the real servers be prevented from responding to ARP requests.

We will make use of LVS using NAT for this solution. LVS with NAT is good for small server farms of up to about 6 real servers, but performance begins to suffer as more servers are introduced, because the load-balancer becomes a bottleneck. This is not a restriction for the direct route or IP tunneling approaches. First we set up the virtual server with

"ipvsadm -A -t 192.168.1.20:80 -s rr"

This sets up the virtual HTTP service on the virtual IP (-A -t 192.168.1.20:80) and tells ipvsadm to use the round-robin scheduler (-s rr). For more on schedulers see the ipvsadm man pages. Next we add our two real servers.

  • "ipvsadm -a -t 192.168.1.20:80 -r 192.168.22.4:80 -m -w1" and 
  • "ipvsadm -a -t 192.168.1.20:80 -r 192.168.22.5:80 -m -w1"

Here we tell ipvsadm to add the real servers (-r 192.168.22.4 and -r 192.168.22.5) to the target virtual server (-t 192.168.1.20:80), using masquerading/NAT (-m) and with a weight of 1 (-w 1). The weights are used by the scheduling algorithms.
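For reference, each of the three approaches described earlier has its own forwarding flag when adding a real server: -m for masquerading (NAT), -i for IP tunneling and -g for gatewaying (direct routing, which is ipvsadm's default). Here is a minimal sketch of that mapping; the helper name lvs_forward_flag is our own invention, and the last line only prints a command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: map a forwarding approach to the ipvsadm flag used
# when adding a real server with "ipvsadm -a ... -r <real server>".
lvs_forward_flag() {
    case "$1" in
        nat)    echo "-m" ;;  # masquerading (NAT)
        tunnel) echo "-i" ;;  # IP-in-IP tunneling
        dr)     echo "-g" ;;  # gatewaying (direct routing), the default
        *)      echo "unknown method: $1" >&2; return 1 ;;
    esac
}

# Print (do not run) the command that adds real server 1 using NAT.
echo "ipvsadm -a -t 192.168.1.20:80 -r 192.168.22.4:80 $(lvs_forward_flag nat) -w 1"
```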

If you are not going to use ldirectord, you can save your rules with "ipvsadm-save >/etc/ipvsadm.rules" or, on Red Hat, "/etc/init.d/ipvsadm save". On Red Hat, "/etc/init.d/ipvsadm status" will always report the error "dead but subsys locked". Don't worry about this; it is a bug of no consequence. To check that ipvs is active, run "ipvsadm -l" to see your config. Once it is running you can also use "ipvsadm -l --stats" to see input/output stats. For more information on ipvsadm see the man pages. For the load-balancer to work, don't forget to enable IP forwarding, either by setting net.ipv4.ip_forward=1 in /etc/sysctl.conf (persistent across reboots, applied with "sysctl -p") or by running "echo 1 > /proc/sys/net/ipv4/ip_forward" (immediate, but lost on reboot).
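The manual steps so far can be collected into one small script. This is only a sketch of the NAT setup described above, not a replacement for ldirectord. The RUN variable is our own convention: it defaults to "echo" so the commands are merely printed, and running the script as root with RUN set to an empty string executes them for real.

```shell
#!/bin/sh
# Sketch of the NAT load-balancer setup from this post.
# RUN is a hypothetical switch: unset means "echo" (dry run, print only);
# run as root with RUN= (empty) to actually execute the commands.
RUN="${RUN-echo}"

VIP=192.168.1.20
REAL_SERVERS="192.168.22.4 192.168.22.5"

setup_lvs_nat() {
    # Let the kernel forward packets between the two networks.
    $RUN sysctl -w net.ipv4.ip_forward=1
    # Virtual HTTP service on the VIP with round-robin scheduling.
    $RUN ipvsadm -A -t "$VIP:80" -s rr
    # One masqueraded (NAT) real server entry each, with weight 1.
    for rip in $REAL_SERVERS; do
        $RUN ipvsadm -a -t "$VIP:80" -r "$rip:80" -m -w 1
    done
}

setup_lvs_nat
```

Printing the commands first makes it easy to eyeball the rules before touching the kernel's IPVS table.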

IPVSADM - Setup Real Servers 

There is not much to configure on the real servers, other than making sure the HTTP service is running and ensuring that replies to traffic coming from the load balancer are routed back through the load-balancer, as VS with NAT requires. The easiest way to do this is to set the default gateway on the real servers to the load-balancer's internal IP. Alternatively, you can use the iptables firewall mark action to mark the HTTP traffic and then route it via an alternative routing table.

  • iptables -t mangle -I OUTPUT -p tcp --sport 80 -j MARK --set-mark 80
  • ip route add default via 192.168.22.2 table 1
  • ip rule add  fwmark 80 table 1 

Because we are using masquerading we can use non-routable IPs for the VIP. If you are using direct route then the VIP must be routable, otherwise you will get "martian source" errors in /var/log/messages. Once this has been completed you can test the load balancing by opening a browser and going to 192.168.1.20. The default web pages from the different servers should show up on refreshes. (You should modify index.html on each real server so you can see which server has been selected by the load balancer.)
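If you prefer the command line to hitting refresh in a browser, the same test can be sketched with curl. The function below is our own illustration: it assumes curl is installed on the client and that each real server's index.html is a single distinct line, so the counts show how requests were spread.

```shell
#!/bin/sh
# Request a URL several times and count how often each distinct response
# line came back. Assumes every real server serves a different one-line page.
count_backends() {
    url=$1
    requests=${2:-10}
    i=0
    while [ "$i" -lt "$requests" ]; do
        curl -s "$url"
        echo    # make sure every response ends with a newline
        i=$((i + 1))
    done | grep -v '^$' | sort | uniq -c
}

# Example, run from a client machine that can reach the VIP:
# count_backends http://192.168.1.20/ 10
```

With the round-robin scheduler and two healthy real servers, the two counts should come out roughly equal.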

Cluster Manager (ldirectord) Setup

It may seem that all one needs for load balancing is to set up ipvsadm! But it falls short in one crucial area: it does not detect and remove failed real servers, or add them back in when they are brought online again. To do this one needs a cluster manager such as ldirectord. There are other cluster managers that can be used, such as keepalived, but for our solution we are going to use ldirectord.

Besides managing cluster membership, the manager also "automates" the ipvsadm setup, meaning you only have to configure one file to get it all to work. On Red Hat this file is /etc/ha.d/ldirectord.cf. Here is the file for the above scenario.

checktimeout=3
checkinterval=5
autoreload=yes
logfile="/var/log/ldirectord.log"
quiescent=yes
emailalert="admin@somedomain.co.za"
virtual=192.168.1.20:80
    fallback=127.0.0.1:80
    real=192.168.22.4:80 masq
    real=192.168.22.5:80 masq
    service=http
    request="ping.html"
    receive="Hello World"
    scheduler=wlc
    protocol=tcp
    checktype=negotiate

Please note the indentation above is required. A minimum of 4 spaces or a tab denotes each virtual server's child elements. The above configuration tells ldirectord to check each real server every 5 seconds (checkinterval) with a time-out of 3 seconds for the real server's response (checktimeout). Alerts are sent to the specified email address. The ipvsadm virtual and real servers are then set up, followed by the mechanism used to test whether a real server is alive or dead. ldirectord supports several services, such as smtp, ftp, imap and pop, and several ways to detect dead servers or services, including pinging, opening connections etc. In our case we have told ldirectord to fetch a web URL, ping.html, and then check that the response (checktype=negotiate) matches "Hello World" (receive). This means we check not only that the server is up but also that the HTTP service is running. You will need to add a ping.html page to each real server's web directory which returns the simple string "Hello World".
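Creating and checking the health page can be sketched as follows. The function names are our own, and the document root /var/www/html is an assumption, so pass whatever directory your web server actually serves. The 3-second limit on curl mirrors checktimeout above.

```shell
#!/bin/sh
# Create the page that ldirectord requests for its health check.
# The default document root is an assumption; pass your own if it differs.
make_ping_page() {
    docroot=${1:-/var/www/html}
    printf 'Hello World\n' > "$docroot/ping.html"
}

# Fetch ping.html from a real server and succeed only if the expected
# string comes back within 3 seconds, much like a negotiate check.
check_ping_page() {
    curl -s --max-time 3 "http://$1/ping.html" | grep -q 'Hello World'
}

# On each real server:        make_ping_page /var/www/html
# From the load balancer:     check_ping_page 192.168.22.4 && echo alive
```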

ldirectord will remove a server from the list of available servers if it is dead, and add the server back in when it comes online. The quiescent option tells ldirectord to allow existing connections to finish before removing the server. (In fact it's not removed; its weight is set to 0, so it no longer receives new requests but existing ones can complete their sessions. This is handy if you need to take a server out of the cluster for maintenance.) More information on the configuration file can be found in the ldirectord man pages.

You can now test bringing down one of the HTTP services and see how it is removed from the load-balancer's list of available servers by running "ipvsadm -l". The disabled server should have a weight of 0; bringing the HTTP service back up will result in its weight being changed to 1 again once it is detected.
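To pick out the quiesced servers without reading the whole table, the listing can be filtered with awk. This is a small sketch with a function name of our own choosing. It relies on the usual layout of "ipvsadm -L -n" output, where real-server rows start with "->" followed by address:port, forwarding method and weight.

```shell
#!/bin/sh
# Read "ipvsadm -L -n" output on stdin and print the real servers whose
# weight is 0. Real-server rows start with "->"; the weight is column 4.
quiesced_servers() {
    awk '$1 == "->" && $4 == "0" { print $2 }'
}

# Usage on the load balancer (as root):
# ipvsadm -L -n | quiesced_servers
```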

Well, that's it folks. No need for expensive solutions when a robust, affordable solution will do!

by Dr. Radut.