Load Testing Your Storage Subsystem with Diskspd – Part II


In this post we’re going to discuss how to implement load testing of your storage subsystem with DiskSpd. We’ll craft tests to measure bandwidth and latency for specific access patterns and I/O sizes. In the last post, “Load Testing Your Storage Subsystem with Diskspd”, we looked closely at access patterns and I/O size and discussed the impact each has on key performance attributes.

Diskspd command options

Let’s start with some common command options. Don’t get caught up on the syntax; Diskspd’s documentation is fantastic, and it’s included with the program download here. Instead, I’m going to tell you why I set these settings the way I do, so you can adjust them as needed for your environment.

  • Duration ( -d ) – this is the runtime of the test; the longer the better. The longer your test, the more likely you’ll smooth out any performance anomalies, such as competition for shared resources. A longer test is also more likely to blow through any caches along the I/O path that could absorb I/Os and skew your results. We’re trying to measure the I/O capacity of the whole pipeline…not any caches.
     

  • Threads ( -t or -F ) – with -t this is the number of threads per file; with -F it’s the total number of threads for the whole test. For smaller systems I usually set the number of threads to the number of cores. On larger systems I start at 16 and adjust up or down based on results. For systems with very fast I/O paths you may need to add more threads to max out the throughput.
     

  • Outstanding I/Os ( -o ) – the number of I/Os ready to be dispatched per thread. Your storage subsystem may be fantastic, so when testing for throughput you may need to stack on more I/Os to increase the pressure. I usually start with this equal to the number of spindles in my LUN, then increase outstanding I/Os until I start to see latency increase. Once you see that, congrats: you just saturated your I/O subsystem! Use this in conjunction with threads when trying to saturate an I/O path. If latency is already at unacceptable levels, reduce outstanding I/Os…but you’ll likely see a reduction in throughput too. Try to find the sweet spot between minimum latency and maximum bandwidth (there’s a sketch of a scripted sweep after this list). If each matches the physical attributes of your disk subsystem, you’re heading in the right direction. If outstanding I/Os is set to 1, the I/O is synchronous; all other values are asynchronous…more on this later.
     

  • File Size ( -c ) – the size of the test file. I like this to be larger than the largest cache in the I/O pipeline. That includes your HBA, SAN controller…anything along the way between the running process and the disk.
     

  • Block Size ( -b[K|M|G|b] ) – the size of each I/O. This is what we’ll change to match varying I/O patterns in SQL Server.
     

  • Disable hardware write and software caching ( -h ) – we want to disable software (file system) caching and request that hardware caching be disabled. Disabling hardware caching is only a request of the storage hardware, and that’s one of the reasons we want to ensure the file size we use is larger than the largest cache in our I/O path. This option is enabled on all of the tests in this post. Further, for durability reasons most major relational database systems, SQL Server included, do not use the file system cache; they rely on their own caching mechanisms.
     

  • I/O Pattern – discussed in detail in our previous post here

  • Random I/O ( -r )

  • Sequential I/O ( -s ) – if using multiple threads, use -si; this coordinates the threads’ access into the file, ensuring a sequential access pattern.
     

  • Write Percentage ( -w ) – 0 is all reads, 100 is all writes. You can choose any value in between, but I like to isolate read and write tests for analysis.
     

  • Measure latency statistics ( -L ) – the whole reason we’re doing this is to understand our performance, so go ahead and turn this on.
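
Putting a few of these together: to find that sweet spot between minimum latency and maximum bandwidth, it helps to script the outstanding I/O sweep rather than run each test by hand. Below is a minimal PowerShell sketch, assuming diskspd.exe is on your PATH and a test file at D:\TEST\iotest.dat has already been created large enough to exceed your caches (for example with -c50G); adjust the parameters for your environment.

    # Sweep queue depths, doubling each step, and save each run's output.
    # Watch for the point where latency climbs but throughput stops improving.
    $target = "D:\TEST\iotest.dat"
    foreach ($o in 1, 2, 4, 8, 16, 32, 64, 128) {
        diskspd.exe -d60 -o$o -t4 -b64K -h -r -L -w0 $target |
            Out-File "diskspd_random_o$o.txt"
    }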

Impact of I/O Access Patterns

Here are some example Diskspd tests that implement sequential and random access patterns. These tests simulate index seeks/point queries (random) and index scans/range queries (sequential).

  • Random

    diskspd.exe -d15 -o32 -t4 -b64K -h -r -L -w0 D:\TEST\iotest.dat

    This test will run for 15 seconds, with 32 outstanding I/Os per thread, using 4 threads and 64KB I/Os. The hardware and software caches are disabled, and the access pattern is random, read-only. In our previous post we defined the characteristics of this access pattern: we should expect lower bandwidth and higher latencies in this test, because the drives have to physically move to service the random I/O requests. This test is similar to an index seek/point query in SQL Server (SSDs will still exhibit slightly higher latencies on random access, as discussed in the last post here).

  • Sequential

    diskspd.exe -d15 -o32 -t4 -b64K -h -si -L -w0 D:\TEST\iotest.dat

    This test is the same as above, but uses a sequential access pattern. With sequential I/O we should see higher bandwidth with lower latencies, since the data is physically contiguous on the drive. This test is similar to an index scan/range query in SQL Server.
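
To compare the two patterns side by side, a small wrapper like the following captures each run’s statistics to its own file. This is just a sketch; the target path and output file names are placeholders.

    # Run the random and sequential read tests back to back and save the
    # output so the bandwidth and latency numbers can be compared directly.
    $target = "D:\TEST\iotest.dat"
    diskspd.exe -d15 -o32 -t4 -b64K -h -r -L -w0 $target | Out-File "random_read.txt"
    diskspd.exe -d15 -o32 -t4 -b64K -h -si -L -w0 $target | Out-File "sequential_read.txt"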

Impact of I/O sizes

For these tests we’ll explore two I/O sizes. We’ll simulate a log buffer flush with a small, 60KB, synchronous, single-threaded, sequential write. Then we’ll simulate a backup operation with much larger, multithreaded, sequential reads.

  • Transaction log simulation

    diskspd.exe -d15 -o1 -t1 -b60K -h -s -L -w100 D:\TEST\iotest.dat

    In this test we simulate the writing of full transaction log records. The test is configured for synchronous I/O by setting both outstanding I/Os and threads to 1. Each I/O is 60KB, written sequentially to the file. Here we’re really trying to measure latency in the I/O subsystem and determine if there are any potential bottlenecks.

  • Backup operation simulation

    diskspd.exe -d15 -o32 -t4 -b1M -h -si -L -w0 D:\TEST\iotest.dat

    In this test we simulate the read portion of a backup operation. The test is configured for asynchronous, parallel I/O by setting the outstanding I/O parameter to 32 and threads to 4. Each I/O is 1MB, read sequentially from the file. Here we’re really trying to tax the I/O subsystem and reach a saturation point, so we can determine how much data our disk subsystem can move for reads. To see how the I/O size alone shifts the results, try the sweep sketched below.
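
To isolate the effect of I/O size, hold threads, outstanding I/Os, and access pattern constant and vary only -b. Here’s a minimal PowerShell sketch along the lines of the tests above; the path and output file names are again placeholders.

    # Vary only the I/O size; expect bandwidth to rise with larger I/Os
    # while per-I/O latency grows alongside it.
    $target = "D:\TEST\iotest.dat"
    foreach ($b in "8K", "64K", "512K", "1M") {
        diskspd.exe -d15 -o32 -t4 -b$b -h -si -L -w0 $target |
            Out-File "seq_read_b$b.txt"
    }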

In this post we showed you how to use Diskspd to craft tests measuring bandwidth and latency, two key attributes of your disk subsystem. In the next post in this series we’ll run some tests that simulate SQL Server I/O access patterns and review the output.