Let's assume you have a cluster of Apache servers behind an HAProxy and you want to sustain 500 requests/second with low latency per request. First of all, you need to bump up MaxClients and ServerLimit in your Apache configuration, as I explained in another post. In this case you would set both variables to 500. Note that you actually need to stop and start the httpd service, because simply restarting it won't change the built-in limit (which is 256). Also ignore the warning that Apache gives you on startup:
WARNING: MaxClients of 500 exceeds ServerLimit value of 256 servers,
lowering MaxClients to 256. To increase, please see the ServerLimit directive.
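For reference, here's a sketch of what the relevant section of httpd.conf might look like with the prefork MPM (the `<IfModule>` wrapper is an assumption based on a stock Apache 2.2-style prefork setup, not taken from an actual config):

```apache
# Prefork MPM settings -- ServerLimit must be at least as large as MaxClients
<IfModule prefork.c>
    ServerLimit   500
    MaxClients    500
</IfModule>
```

After editing, remember the full stop/start rather than a restart — on a Red Hat-style system that would be 'service httpd stop; service httpd start'.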
Note that the more httpd processes you have, the more CPU and RAM will be consumed on the server. You need to decide how much to push the envelope in terms of concurrent httpd processes you can sustain on a given server. A good measure is the latency / responsiveness you expect from your Web application. At some point, it will start to suffer, and that will be a sign that you need to add a new Web server to your server farm (of course, this over-simplifies things a bit, since there's always the question of the database layer; I'm assuming you can use memcache to minimize database access.) Here's a good overview of the trade-offs related to MaxClients.
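One quick way to gauge the memory cost of each additional httpd process is to average the resident set size of the running processes. This is a rough sketch, assuming Linux and the prefork MPM (it double-counts shared pages, so treat the result as an upper bound):

```shell
# Average resident memory per httpd process, in KB (rough estimate)
ps -o rss= -C httpd | awk '{sum += $1; n++} END {if (n) printf "%.0f KB avg over %d processes\n", sum/n, n}'
```

Multiplying that average by your target MaxClients gives a sanity check on whether the box has enough RAM for the number of concurrent processes you're aiming for.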
Other Apache configuration variables I've tweaked are StartServers, MinSpareServers and MaxSpareServers. It sometimes pays to bump up the values for these variables, so you can have spare httpd processes waiting around for those peak times when the requests hitting your server suddenly increase. Again, there's a trade-off here between server resources and number of spare httpd processes you want to maintain.
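As an illustration, the relevant httpd.conf directives might look like this (the numbers here are made up for the sake of the example, not tuned values):

```apache
StartServers       20
MinSpareServers    20
MaxSpareServers    50
```

StartServers controls how many processes are forked at startup, while MinSpareServers/MaxSpareServers bound how many idle processes Apache keeps around between bursts.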
Assuming you fine-tuned your Apache servers, it's time to tweak some variables in the HAProxy configuration. Perhaps the most important ones for our discussion are the number of maximum connections per server (maxconn), httpclose and abortonclose.
It's a good idea to throttle the maximum number of connections per server, setting it to a number related to the requests/second rate you're shooting for. In our case, that number is 500. Since HAProxy itself needs some connections for health checking and other internal bookkeeping, you should set the per-server maxconn to something slightly lower than 500. In terms of syntax, I have something similar to this in the backend section of haproxy.cfg:
server server1 10.1.1.1:80 check maxconn 450
I also have the following 2 lines in the backend section:
option httpclose
option abortonclose
According to the official HAProxy documentation, here's what these options do:
In presence of very high loads, the servers will take some time to respond.
The per-instance connection queue will inflate, and the response time will
increase respective to the size of the queue times the average per-session
response time. When clients will wait for more than a few seconds, they will
often hit the "STOP" button on their browser, leaving a useless request in
the queue, and slowing down other users, and the servers as well, because the
request will eventually be served, then aborted at the first error
encountered while delivering the response.
As there is no way to distinguish between a full STOP and a simple output
close on the client side, HTTP agents should be conservative and consider
that the client might only have closed its output channel while waiting for
the response. However, this introduces risks of congestion when lots of users
do the same, and is completely useless nowadays because probably no client at
all will close the session while waiting for the response. Some HTTP agents
support this behaviour (Squid, Apache, HAProxy), and others do not (TUX, most
hardware-based load balancers). So the probability for a closed input channel
to represent a user hitting the "STOP" button is close to 100%, and the risk
of being the single component to break rare but valid traffic is extremely
low, which adds to the temptation to be able to abort a session early while
still not served and not pollute the servers.
In HAProxy, the user can choose the desired behaviour using the option
"abortonclose". By default (without the option) the behaviour is HTTP
compliant and aborted requests will be served. But when the option is
specified, a session with an incoming channel closed will be aborted while
it is still possible, either pending in the queue for a connection slot, or
during the connection establishment if the server has not yet acknowledged
the connection request. This considerably reduces the queue size and the load
on saturated servers when users are tempted to click on STOP, which in turn
reduces the response time for other users.
As stated in section 2.1, HAProxy does not yet support the HTTP keep-alive
mode. So by default, if a client communicates with a server in this mode, it
will only analyze, log, and process the first request of each connection. To
work around this limitation, it is possible to specify "option httpclose". It
will check if a "Connection: close" header is already set in each direction,
and will add one if missing. Each end should react to this by actively
closing the TCP connection after each transfer, thus resulting in a switch to
the HTTP close mode. Any "Connection" header different from "close" will also
be removed.
It seldom happens that some servers incorrectly ignore this header and do not
close the connection even though they reply "Connection: close". For this
reason, they are not compatible with older HTTP 1.0 browsers. If this
happens it is possible to use the "option forceclose" which actively closes
the request connection once the server responds.
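Putting the HAProxy pieces together, a backend section combining maxconn, httpclose and abortonclose might look like this sketch (the backend name, server names and IP addresses are illustrative, not from an actual config):

```
backend apache_farm
    option httpclose
    option abortonclose
    server server1 10.1.1.1:80 check maxconn 450
    server server2 10.1.1.2:80 check maxconn 450
```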
And now for something completely different... TCP stack tuning! Even with all the tuning above, we were still seeing occasional high latency numbers. Willy Tarreau to the rescue again... he was kind enough to troubleshoot things by means of the haproxy log and a tcpdump. It turned out that some of the TCP/IP-related OS variables were set too low. You can find out what those values are by running:
sysctl -a | grep ^net
In our case, the main one that was out of tune was:
net.ipv4.tcp_max_syn_backlog = 1024
Because of this, when there were more than 1,024 concurrent sessions on the machine running HAProxy, the OS had to recycle through the SYN backlog, causing the latency issues. Here are all the variables we set in /etc/sysctl.conf at the advice of Willy:
net.ipv4.tcp_tw_reuse = 1
net.ipv4.ip_local_port_range = 1024 65023
net.ipv4.tcp_max_syn_backlog = 10240
net.ipv4.tcp_max_tw_buckets = 400000
net.ipv4.tcp_max_orphans = 60000
net.ipv4.tcp_synack_retries = 3
net.core.somaxconn = 10000
(To have these values take effect, you need to run 'sysctl -p'.)
That's it for now. As I continue to use HAProxy in production, I'll report back with other tips/tricks/suggestions.