A few articles back, we discussed server disk performance and how to quickly troubleshoot its issues.
Let's go over some of the basics, just to refresh your memory. Chances are, you were more concerned about space requirements than performance. If your server application works, then it's all good, right? Well, not so fast.
Depending on what you're using, you'll probably need as much as 30-50% more storage than the raw amount your data will EVER use, just to maintain optimum disk performance. This is basically how ZFS, and most copy-on-write file systems, behave. With more traditional storage solutions you have a few more options, but you still need to understand exactly how much room you have to work with.
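To make that sizing rule concrete, here is a minimal Python sketch of the arithmetic. It assumes "overhead" means extra capacity added on top of the data itself, which is only one way to read the rule. The function name and the 10 TB figure are made up for illustration.

```python
def provisioned_capacity_tb(data_tb, overhead_fraction):
    """Capacity to provision, assuming overhead is extra space
    added on top of the data itself (an assumption, not a standard)."""
    if data_tb < 0 or overhead_fraction < 0:
        raise ValueError("inputs must be non-negative")
    return data_tb * (1 + overhead_fraction)


if __name__ == "__main__":
    data_tb = 10  # hypothetical amount of data you expect to store
    for overhead in (0.30, 0.50):
        needed = provisioned_capacity_tb(data_tb, overhead)
        print(f"{overhead:.0%} overhead: provision about {needed:.1f} TB")
```

Under this reading, 10 TB of data calls for somewhere between 13 and 15 TB of capacity.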
If you have no idea how to go about figuring out how much overhead you need, well - that's why we wrote this article! Read on.
In that other article, we used the ‘iostat’ command to see what kind of stress your drives are under. After all, you gotta know what your current situation is before you make any upgrades.
So, let me show you how to test your server hardware using the ‘fio’ utility. What you get out of it is the total IOPS and average latency numbers. After that, you can come back to it anytime to check whether your I/O statistics are still within normal operating range.
The ‘fio’ utility can be installed from the repository of most Linux distros.
This config file will run simultaneous random-read and random-write tests against the /dev/vda device, using a queue depth of 16 and a 4-kilobyte block size.
Warning: only use this test with an empty, or a test, drive!
All your data on /dev/vda will be lost!
Don't run this test on an Amazon EC2 instance: you'll be charged for I/O!
Save the following as test.cfg:

[storage-read-test]
blocksize=4k
filename=/dev/vda
rw=randread
direct=1
buffered=0
ioengine=libaio
iodepth=16

[storage-write-test]
blocksize=4k
filename=/dev/vda
rw=randwrite
direct=1
buffered=0
ioengine=libaio
iodepth=16

Then run it as root, passing the job file as the argument:

# fio test.cfg
After the test completes, or after you interrupt it with Ctrl+C, you'll see output looking something like this:

read : io=2131.2MB, bw=73341KB/s, iops=18221 , runt=22351msec
  slat (usec): min=2 , max=14641 , avg=24.51, stdev=112.54
  clat (usec): min=31 , max=201456 , avg=834.13, stdev=3413.17
But you should really be paying attention to:
read: iops=18221 clat=834.13 (usec)
write: iops=11123 clat=1345.12 (usec)

The storage device, in this case, can handle about 18,200 read IOPS with less than 1 ms average latency, and about 11,100 write IOPS with roughly 1.35 ms average latency.
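If you'd rather not pick those two numbers out by eye, a small script can do it. The sketch below is a rough Python example that only understands the older text layout shown in this article. Newer fio releases print these lines differently, and fio's --output-format=json option is likely a sturdier thing to parse. The function name and the embedded sample are purely illustrative.

```python
import re

# Sample in the same layout as the output quoted in this article.
SAMPLE = """\
read : io=2131.2MB, bw=73341KB/s, iops=18221 , runt=22351msec
  slat (usec): min=2 , max=14641 , avg=24.51, stdev=112.54
  clat (usec): min=31 , max=201456 , avg=834.13, stdev=3413.17
"""


def summarize(fio_text):
    """Return {"read"/"write": {"iops": int, "clat_ms": float}}."""
    results = {}
    current = None
    for line in fio_text.splitlines():
        header = re.match(r"\s*(read|write)\s*:.*?iops=(\d+)", line)
        if header:
            current = header.group(1)
            results[current] = {"iops": int(header.group(2))}
            continue
        clat = re.match(r"\s*clat \((usec|msec)\):.*?avg=\s*([\d.]+)", line)
        if clat and current:
            value = float(clat.group(2))
            # Normalize to milliseconds so results are comparable.
            if clat.group(1) == "usec":
                value /= 1000
            results[current]["clat_ms"] = value
    return results


if __name__ == "__main__":
    for direction, stats in summarize(SAMPLE).items():
        print(direction, stats["iops"], "IOPS,", round(stats["clat_ms"], 2), "ms")
```

Converting everything to milliseconds up front makes the numbers easier to compare against latency targets later on.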
Uh... so is that good or..?
I ran this on an SSD RAID array, and this is actually great performance for that kind of setup. As always, YMMV.
What is the optimal latency?
1-2 ms average latency is what you'll want to see in your real server benchmarks. For most applications, it shouldn't exceed 10 ms, and if it's more than 20 ms you will see lag spikes whenever the application uses storage. Keep these thresholds in mind when you evaluate enterprise storage devices or RAID arrays.
Run this test for at least a couple of hours to see if you experience any sudden performance drop. If your application is write-heavy, this is even more important to catch early.
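Those thresholds are easy to turn into a quick check. Here is a minimal Python sketch. The cut-offs come straight from the paragraph above, while the label wording and the function name are my own invention.

```python
def rate_latency(avg_ms):
    """Map an average latency in milliseconds to a rough verdict."""
    if avg_ms < 0:
        raise ValueError("latency cannot be negative")
    if avg_ms <= 2:
        return "ideal"
    if avg_ms <= 10:
        return "acceptable"
    if avg_ms <= 20:
        return "above target"
    return "lag spikes likely"


if __name__ == "__main__":
    # The read and write averages from the sample run, in ms.
    for value in (0.83, 1.35):
        print(value, "ms:", rate_latency(value))
```

Both sample averages land in the first bucket, which matches the verdict given earlier for the SSD RAID array.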
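One way to get a run of a fixed length is to give each job a time limit. The Python sketch below only renders job-file text; it does not touch any device. It assumes fio's runtime (in seconds) and time_based options are the right way to set that limit, and the function name is hypothetical. The same warning applies to the result: anything on the target device will be lost.

```python
def render_job(name, device, rw, runtime_s):
    """Render one fio job section that runs for a fixed wall-clock time."""
    if rw not in ("randread", "randwrite"):
        raise ValueError("rw must be 'randread' or 'randwrite'")
    lines = [
        f"[{name}]",
        "blocksize=4k",
        f"filename={device}",
        f"rw={rw}",
        "direct=1",
        "ioengine=libaio",
        "iodepth=16",
        f"runtime={runtime_s}",  # time limit in seconds
        "time_based",            # keep going until the time limit is reached
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    two_hours = 2 * 60 * 60
    print(render_job("storage-read-test", "/dev/vda", "randread", two_hours))
    print(render_job("storage-write-test", "/dev/vda", "randwrite", two_hours))
```

Redirect the printed text into a file and pass that file to fio, just like test.cfg.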
Let the test move around 2-3 times your total storage space in data, so that the results aren't affected by tweaks or caching.
Remember: disk performance is not about linear write or read speed, and not even about raw IOPS, but rather about the lowest latency possible.
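To get a feel for how long that takes, here is a back-of-the-envelope Python sketch. The 500 GB capacity is a made-up example, the bandwidth is the 73341 KB/s figure from the sample output, and the code assumes 1 GB = 1024 * 1024 KB. Real throughput will vary over the run, so treat the result as a rough estimate.

```python
def soak_plan(capacity_gb, passes, bandwidth_kb_s):
    """Return (data_to_move_gb, hours) for moving passes x capacity."""
    if bandwidth_kb_s <= 0:
        raise ValueError("bandwidth must be positive")
    total_gb = capacity_gb * passes
    seconds = total_gb * 1024 * 1024 / bandwidth_kb_s
    return total_gb, seconds / 3600


if __name__ == "__main__":
    capacity_gb = 500  # hypothetical drive size
    for passes in (2, 3):
        total_gb, hours = soak_plan(capacity_gb, passes, 73341)
        print(f"{passes}x capacity: {total_gb} GB, about {hours:.1f} hours")
```

At small-block random I/O rates, even a modest drive needs several hours, which fits with running the test for a long stretch rather than a few seconds.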
That's about it for this article. Keep up with us on twitter @serversuit for more articles and product updates!