Tuesday, December 10, 2013

Creating Sensu alerts based on Graphite data

We needed to create Sensu alerts based on some of the metrics we send to Graphite. Googling around, I found this nice post by @ulfmansson describing how they did it at Recorded Future. Ulf recommends using Sean Porter's check-data.rb Sensu plugin for alerting based on Graphite data. It wasn't clear how to call the plugin, so we experimented a bit and came up with something along these lines (note that check-data.rb requires the sensu-plugin gem):

$ ruby check-data.rb -s graphite.example.com -t "movingAverage(stats.lb1.prod.assets-backend.session_current,10)" -w 100 -c 200

This runs the check-data.rb script against the server graphite.example.com (-s option), requesting the value of the target metric movingAverage(stats.lb1.prod.assets-backend.session_current,10) (-t option), with a warning threshold of 100 (-w option) and a critical threshold of 200 (-c option). The target can be any function supported by Graphite. In this example, it is a moving average of the number of current sessions for the "assets" haproxy backend, taken over the last 10 datapoints (which is 10 minutes if the metric is stored at one-minute resolution). By default check-data.rb looks at the last 10 minutes of Graphite data (this can be changed by specifying something like -f "-5mins").
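
To make the moving parts more concrete, here is a rough Python sketch of what a check like this does: ask Graphite's render API for the target as JSON, take the most recent non-null datapoint, and compare it to the two thresholds. This is not the code of check-data.rb. The function names are made up for illustration, the strict "greater than" comparison is an assumption, and the plain-http URL assumes Graphite is reachable without authentication.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_render_url(server, target, from_="-10mins"):
    """Build a Graphite render API URL that returns datapoints as JSON."""
    query = urlencode({"target": target, "from": from_, "format": "json"})
    return "http://%s/render?%s" % (server, query)


def latest_value(series):
    """Return the most recent non-null value of one Graphite series.

    Graphite returns datapoints as [value, timestamp] pairs, oldest first,
    and uses null for intervals that have no data yet.
    """
    for value, _timestamp in reversed(series["datapoints"]):
        if value is not None:
            return value
    return None


def classify(value, warning, critical):
    """Map a value to the exit status Sensu expects from a check.

    0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN (no data).
    Assumption: thresholds are exceeded only when strictly greater.
    """
    if value is None:
        return 3
    if value > critical:
        return 2
    if value > warning:
        return 1
    return 0


def check(server, target, warning, critical, from_="-10mins"):
    """Fetch the target from Graphite and return an exit status."""
    with urlopen(build_render_url(server, target, from_)) as response:
        data = json.load(response)
    value = latest_value(data[0]) if data else None
    return classify(value, warning, critical)
```

With the thresholds from the command above, a latest value of 150 would give a WARNING status (1) and a value of 250 a CRITICAL status (2).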

To call the check in the context of Sensu, you need to deploy it to the client that will run it, and configure the check on the Sensu server in a JSON file in /etc/sensu/conf.d/checks:

"command": "/etc/sensu/plugins/check-data.rb -s graphite.example.com -t \"movingAverage(stats.lb1.prod.assets-backend.session_current,10)\" -w 100 -c 200"

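The "command" line above is only one attribute of a check definition, and the nested quotes around the Graphite target are easy to get wrong when editing JSON by hand. One way around that is to generate the file, as in this small Python sketch. Only the command string comes from this post; the check name, the subscription, the interval and the handler are hypothetical placeholders to replace with your own values.

```python
import json

# The command from the post, written with normal Python quoting.
command = (
    "/etc/sensu/plugins/check-data.rb -s graphite.example.com "
    '-t "movingAverage(stats.lb1.prod.assets-backend.session_current,10)" '
    "-w 100 -c 200"
)


def build_check(name, command, subscribers, interval=60, handlers=("default",)):
    """Build a Sensu check definition as a plain dict.

    The name, subscribers, interval and handlers used below are
    hypothetical; adjust them to match your Sensu setup.
    """
    return {
        "checks": {
            name: {
                "command": command,
                "subscribers": list(subscribers),
                "interval": interval,
                "handlers": list(handlers),
            }
        }
    }


definition = build_check("haproxy_assets_sessions", command, ["graphite-checks"])

# json.dumps escapes the inner double quotes as \" for us.
print(json.dumps(definition, indent=2))
```

The printed output can be saved as a .json file under /etc/sensu/conf.d/checks.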
