I finally figured out how to achieve this using the diskstat_ Munin plugin, which is installed by default with munin-node.
If you run
/usr/share/munin/plugins/diskstat_ suggest
you will see the various symlinks you can create for the devices available on your server.
In my case, I have two EBS volumes on each of my database servers, attached as /dev/sdm and /dev/sdn. I created the following symlinks for /dev/sdm (and similar ones for /dev/sdn):
ln -snf /usr/share/munin/plugins/diskstat_ /etc/munin/plugins/diskstat_latency_sdm
ln -snf /usr/share/munin/plugins/diskstat_ /etc/munin/plugins/diskstat_throughput_sdm
ln -snf /usr/share/munin/plugins/diskstat_ /etc/munin/plugins/diskstat_iops_sdm
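With more than a couple of devices, typing each ln by hand gets tedious, so here is a rough sketch of a helper that creates the same three links for every device you pass in. The function name link_diskstat_plugins and its argument order are my own invention, not anything Munin provides; the paths in the usage comment are the ones used above and may differ on your distribution.

```shell
# Sketch only: link_diskstat_plugins is a hypothetical helper, not part of Munin.
# Usage: link_diskstat_plugins PLUGIN_SOURCE PLUGIN_DIR DEVICE [DEVICE...]
# Example (paths as used in this post):
#   link_diskstat_plugins /usr/share/munin/plugins/diskstat_ /etc/munin/plugins sdm sdn
link_diskstat_plugins() {
    plugin_src="$1"
    plugin_dir="$2"
    shift 2
    # One symlink per (metric, device) pair, named diskstat_<metric>_<device>
    for dev in "$@"; do
        for metric in latency throughput iops; do
            ln -snf "$plugin_src" "$plugin_dir/diskstat_${metric}_${dev}"
        done
    done
}
```

munin-node generally has to be restarted before it picks up newly linked plugins.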
Here's what metrics you get from these plugins:
- from diskstat_iops: Read I/O Ops/sec, Write I/O Ops/sec, Avg. Request Size, Avg. Read Request Size, Avg. Write Request Size
- from diskstat_latency: Device Utilization, Avg. Device I/O Time, Avg. I/O Wait Time, Avg. Read I/O Wait Time, Avg. Write I/O Wait Time
- from diskstat_throughput: Read Bytes, Write Bytes
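To get a feel for where numbers like these come from, here is a minimal sketch that computes read/write IOPS and throughput from two snapshots of /proc/diskstats. I am assuming the plugin's figures are derived from the same kernel counters; I have not checked that its arithmetic matches this. The function name diskstat_rates and the snapshot file names are hypothetical. The field positions (reads completed in column 4, sectors read in column 6, writes completed in column 8, sectors written in column 10) and the 512-byte sector unit follow the kernel's documented /proc/diskstats layout.

```shell
# Sketch only: diskstat_rates is a hypothetical helper, not part of Munin.
# Usage: diskstat_rates DEVICE BEFORE_FILE AFTER_FILE SECONDS
# Prints: read_iops write_iops read_bytes_per_sec write_bytes_per_sec
#
# Example:
#   cat /proc/diskstats > /tmp/before; sleep 5; cat /proc/diskstats > /tmp/after
#   diskstat_rates sdm /tmp/before /tmp/after 5
diskstat_rates() {
    awk -v dev="$1" -v secs="$4" '
        $3 != dev { next }                 # skip every other device
        !seen {                            # first match: the "before" counters
            r = $4; rs = $6; w = $8; ws = $10
            seen = 1
            next
        }
        {                                  # second match: the "after" counters
            printf "%.1f %.1f %.0f %.0f\n",
                ($4 - r) / secs, ($8 - w) / secs,
                ($6 - rs) * 512 / secs, ($10 - ws) * 512 / secs
        }
    ' "$2" "$3"
}
```

The shorter the gap between the two snapshots, the less a brief burst of I/O gets averaged away.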
My next step is to follow the advice of Mark Seger (the author of collectl) and graph the output of collectl in real time, so that the stats are displayed in fine-grained intervals of 5-10 seconds instead of the 5-minute averages that RRD-based tools offer.
5 comments:
With stats like those, "fine-grained" should be 1 second or less. You can see a lot of interesting information about how operations collate.
I'd suggest looking at that stuff at 250ms intervals - neat stuff can be witnessed on a busy server, often resulting in obvious tuning to be had.
You can do that with Reconnoiter.
postwait -- thanks! I gave reconnoiter a cursory try a couple of weeks ago but I was discouraged at the lack of tutorial-like documentation. I'll try again at some point though ;-)
Not a problem getting sub-second stats with collectl:
collectl -i.250 will give you 1/4-second stats. I've played with intervals in the 0.01 range. It just depends on which stats you're collecting and how heavily you want to beat up your system. Just doing disk stats shouldn't be a problem. And being able to graph them with colplot is an added bonus.
Mark -- speaking of rendering collectl plot files with colplot, I haven't had a lot of success doing that from the cmdline :-( What's a good email to send you an example?
Sorry Greg, I just saw your comment about getting colplot to work, as I wasn't notified there was a reply to this thread. You can always email me at mark.seger@hp.com or, better yet, send something to the collectl-utils mailing list on SourceForge or post something in the forum there.
-mark