Wednesday, March 04, 2009

HAProxy, X-Forwarded-For, GeoIP, KeepAlive

I know the title of this post doesn't make much sense. I wrote it that way so that people who run into issues similar to mine will have an easier time finding it.

Here's a mysterious issue that I recently solved with the help of my colleague Chris Nutting:

1) Apache/PHP server sitting behind an HAProxy instance
2) MaxMind's GeoIP module installed in Apache
3) Application making use of the geotargeting features offered by the GeoIP module was sometimes displaying those features in a drop-down, and sometimes not

It turns out that the application was relying on the X-Forwarded-For header in each HTTP request to pass the real source IP of the request to the mod_geoip module and thus obtain geotargeting information about that IP. Mysteriously, however, HAProxy was sometimes (once out of every N requests) not sending the X-Forwarded-For header at all. Why? Because KeepAlive was enabled in Apache, and HAProxy was adding the header only to the first request of each HTTP connection that was being "kept alive". Subsequent requests on that connection didn't have the header set, so mod_geoip had no client IP to look up for them and they weren't geotargeted properly.

The solution in this case was to disable KeepAlive in Apache. Willy Tarreau, the author of HAProxy, also recommends setting 'option httpclose' in the HAProxy configuration file. Here's an excerpt from the official HAProxy documentation:

option forwardfor [ except <network> ] [ header <name> ]
....
It is important to note that as long as HAProxy does not support keep-alive
connections, only the first request of a connection will receive the header.
For this reason, it is important to ensure that "option httpclose" is set
when using this option.
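
As a rough sketch of what that advice looks like in a configuration file, here is a minimal fragment that combines the two options. It is not taken from the actual setup described above: the section name 'web', the server name 'apache1' and the addresses are placeholders made up for illustration, and a real configuration would also need timeouts and other settings.

```haproxy
defaults
    mode http
    # Add an X-Forwarded-For header carrying the client IP to requests
    # sent to the backend server
    option forwardfor
    # Close the connection after each request/response, so that every
    # request is the first one on its connection and receives the header
    option httpclose

# Hypothetical proxy section; the name and addresses are placeholders
listen web 0.0.0.0:80
    server apache1 192.0.2.10:80
```

With 'option httpclose' set, each request travels on its own connection, which has the same effect on the header as disabling KeepAlive in Apache had in our case.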

I hope this post will be of some use to people who might run into this issue.

3 comments:

Anonymous said...

Thank you, it's useful.

Sidharth K said...

Very useful article. Thanks.

I'm not prepared to use option httpclose which will disable keep-alive connections -- thus making my site less responsive... as each http connection will have to be explicitly reestablished by client browsers. Ugh!!

So I'm not using forwardfor because that requires option httpclose (see haproxy documentation).

Is there a way I can do anything? Like do statistics from the haproxy logs?

Any help? I'm using google analytics for my site so I'm not suffering that much.

BTW awstats is still useful without the forwardfor working but it would have been nice to get it fully working.

Any idea how to do that?

Thanks,

Sidharth

Anonymous said...

Go for haproxy 1.4 or later. Then use:

option forwardfor
option http-server-close

instead of

option forwardfor
option httpclose

to support Keep-Alive on the client-side. This will give you the low-latency advantage on slower networks (your clients...), while each request yields one connection on the server (backend) side including the X-Forwarded-For header for every req! :-)
