Showing posts from November, 2012

Code performance vs system performance

Just a quick thought: as non-volatile storage becomes faster and more affordable, I/O will cease to be the bottleneck it currently is, especially for database servers. Granted, some applications and web sites will always have to shard their database layer because they deal with a volume of writes well above what a single DB server can handle (I'm talking about mammoth social media sites such as Facebook, Twitter, and Tumblr). By database in this context I mean relational databases. NoSQL-like databases worth their salt are distributed from the get-go, so I am not referring to them in this discussion.

For people who are hoping not to have to shard their RDBMS, things like memcached for reads and super fast storage such as FusionIO for writes give them a chance to scale their single database server up for a much longer period of time (and by a single database server I mostly mean the server where the writes go, since reads can be scaled more easily by sending t…

Quick troubleshooting of Sensu 'no keepalive from client' issue

As I mentioned in a previous post, we started using Sensu as our internal monitoring tool. We also integrated it with PagerDuty. Today we terminated an EC2 instance that had been registered as a client with Sensu. Soon after, I started to get paged with messages of the type:

 keepalive : No keep-alive sent from client in over 180 seconds

Even after removing the client from the Sensu dashboard, the messages kept coming. My next step was of course to get on the #sensu IRC channel. I immediately got help from robotwitharose and portertech. They had me try the following:

1) Try to remove the client via the Sensu API.

I used curl and ran:

curl -X DELETE http://sensu.server.ip.address:4567/client/myclient

2) Try to retrieve the client via the Sensu API and make sure I get a 404

curl -v http://sensu.server.ip.address:4567/client/myclient

This indeed returned a 404.
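
If you find yourself doing this often, the two curl calls above can be wrapped in a small script. What follows is only a rough sketch in Python, assuming the same `/client/<name>` endpoint and port 4567 used in the curl commands. The function names `http_status` and `remove_client` are made up for illustration and are not part of Sensu. The status code returned by the DELETE call is passed back as-is, since I am not making any claim here about what Sensu returns for it.

```python
import urllib.error
import urllib.request


def http_status(method, url, timeout=10):
    """Send a request and return its HTTP status code, including 4xx/5xx."""
    request = urllib.request.Request(url, method=method)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the code is what we are after here.
        return err.code


def remove_client(api_base, client_name):
    """Delete a client via the API, then confirm that a GET returns 404.

    api_base is the API root, e.g. "http://sensu.server.ip.address:4567"
    (the placeholder host from the curl commands above).
    Returns (status of the DELETE call, True if the follow-up GET was a 404).
    """
    client_url = f"{api_base}/client/{client_name}"
    delete_status = http_status("DELETE", client_url)  # step 1
    get_status = http_status("GET", client_url)        # step 2
    return delete_status, get_status == 404
```

Calling `remove_client("http://sensu.server.ip.address:4567", "myclient")` returns the DELETE status plus `True` once the client can no longer be retrieved. As the rest of this post shows, a 404 from the API does not by itself guarantee that the keepalive alerts will stop.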

3) Check that there is a single redis process running

BINGO -- when I ran 'ps -def | grep redis', the command returned T…