Wednesday, August 28, 2013

Keepalived, iproute2 and HAProxy (part 2)

In part 1 of this 2-part series, I explained how we initially set up keepalived and iproute2 on 2 HAProxy load balancers with the goal of achieving high availability at the load balancer layer. Each of the load balancers had 3 interfaces, and we wanted to be able to ssh into any IP address on those interfaces -- hence the need to iproute2 rules. However, adding keepalived into the mix complicated things.

To test failover at the HAProxy layer, we simulated a system failure by rebooting the primary load balancer. As expected, keepalived transferred the floating IP address to the secondary load balancer, and everything worked as expected. However, things started going south when the primary load balancer came back online. We had a chicken and egg problem: the iproute2 rules related to the floating IP address didn't kick in when rc.local was run, because the floating IP wasn't there yet. Then keepalived correctly identified the primary system as being up and transferred the floating IP there, but there was no route to it via iproute2. We decided that the iproute2 rules/policies unnecessarily complicated things, so we got rid of them. This meant we were back to one default gateway, on the same subnet as our front-end interface. The downside was that we were only able to ssh into one of the 3 IPs associated with the 3 interfaces on each load balancer, but the upside was that things were a lot simpler.

However, our failover tests with keepalived were still not working as expected. We mainly had issues when the primary load balancer came back after a reboot. Although keepalived correctly reassigned the floating IP to the primary LB, we weren't able to actually hit that IP over ssh or HTTP. It turned out that it was an ARP cache issue on the switch stack where the load balancers were connected. We had to clear the ARP cache in order for the floating IP to be associated again with the correct MAC. On further investigation, it turned out that the switches weren't accepting gratuitous ARP requests, so we enabled them by running this command on our switches:

ip arp gratuitous local

With this setup in place, we were able to fail over back and forth from the primary to the secondary load balancer. Note that whenever there are modifications to be made to the keepalived configuration, there is no good way we found to apply them (via chef) to the load balancers unless we take a very short outage while restarting keepalived on both load balancers.