When it comes to adjusting (either increasing or decreasing) storage performance, two approaches are common: caching and tiering. Caching refers to a process whereby commonly accessed data gets copied into the storage controller’s high-speed, solid-state cache. A client request for cached data therefore never needs to hit the storage’s disk arrays at all; it is simply served right out of the cache. As you can imagine, this is very, very fast.
Tiering, in contrast, refers to the movement of a data set from one set of disk media to another, be it from slower to faster disks (for high-performance, high-importance data) or from faster to slower disks (for low-performance, low-importance data). For example, you may have your high-performance data living on an SSD, FC or SAS disk array, while your low-performance data may only require the IOPS that relatively low-performance SATA disks can provide.
Both solutions have pros and cons. Cache is typically less configurable by the user, as the cache’s operation will be managed by the storage controller. Cache is also considerably faster, as it lives on the bus: a request served from cache doesn’t need to traverse the disk subsystem (via SAS, FCAL etc.), nor does it have to compete with other I/O along the disk subsystem(s). But it’s also more expensive: high-grade, high-performance solid-state cache memory is more costly than SSDs. Last but not least, the cache needs to “warm up” in order to be effective, though in the real world this does not take long at all!
Tiering’s main advantage is that it is more easily tunable by the customer. However, all is not simple: in complex environments, tuning the tiering may literally be too complex to bother with. Also, manual tiering relies on you being able to predict the needs of your storage and adjust tiering accordingly: how do you know which business application will take the hardest hit tomorrow? Again, in complex environments, this relatively simple question may be decidedly difficult to answer. On the positive side, tiering offers more flexibility in terms of where you put your data. Cache is cache, regardless of environment; data is either in the cache or it’s not. Tiering, on the other hand, lets you take advantage of more types of storage: SSD or FC or SAS or SATA, depending on your business needs.
But if you’re tiering for performance (which is the focus of this blog post), then you have to deal with one big issue: the very act of tiering increases the load on your storage system! Tiering actually creates latency as it occurs: in order to move the data from one storage tier to another, we are literally creating IOPS on the storage back-end in order to accomplish the performance increase! That is, in order to get higher performance, we’re actually hurting performance in the meantime (i.e., while the data is moving around).
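To put a rough number on that migration load, here is a back-of-the-envelope sketch in Python. It is a deliberately simple model, not a description of how any particular array moves data: it assumes every byte is read once from the source tier and written once to the destination tier in fixed-size I/Os. The function names, the 500 GiB data set, the 64 KiB I/O size and the 2,000 spare IOPS are all made-up figures for illustration.

```python
def migration_io_count(dataset_bytes, io_size_bytes):
    """Return (reads, writes) needed to move a data set between tiers once.

    Assumption: each byte is read once from the source tier and written
    once to the destination tier, in fixed-size I/Os.
    """
    ios = -(-dataset_bytes // io_size_bytes)  # ceiling division
    return ios, ios


def migration_seconds(dataset_bytes, io_size_bytes, spare_iops):
    """Estimate how long the move takes if the source tier can only spare
    `spare_iops` reads per second on top of its normal workload."""
    reads, _ = migration_io_count(dataset_bytes, io_size_bytes)
    return reads / spare_iops


# Hypothetical example: a 500 GiB data set moved in 64 KiB I/Os.
reads, writes = migration_io_count(500 * 1024**3, 64 * 1024)
seconds = migration_seconds(500 * 1024**3, 64 * 1024, spare_iops=2000)
print(reads, writes, seconds)  # 8192000 8192000 4096.0
```

Under these assumptions the move costs a little over eight million extra reads and as many extra writes, and it competes with production I/O for more than an hour before any benefit shows up.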
In stark contrast, caching reduces latency and increases throughput as it happens. This is because the data doesn’t really move: the first time data is requested, a cache request is made (and misses, since it’s not in the cache yet) and the data is served from disk. On its way to the customer, though, the data is also copied into the cache, where it will stay for a while. If it’s requested again, another cache request is made (and hits, since the data is already in the cache) and the data is served from cache. And it’s served fast!
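The miss-then-hit behavior described above can be sketched as a tiny read-through cache in Python. This is only a toy illustration, not how any real storage controller is implemented: the class name, the dictionary standing in for the disk arrays, and the least-recently-used eviction policy are all assumptions made for the example.

```python
from collections import OrderedDict


class ReadThroughCache:
    """Toy read-through cache. LRU eviction is an assumption for this sketch."""

    def __init__(self, backing_store, capacity):
        self.backing_store = backing_store  # stands in for the disk arrays
        self.capacity = capacity            # maximum number of cached blocks
        self.entries = OrderedDict()        # block_id -> data, oldest first
        self.hits = 0
        self.misses = 0

    def read(self, block_id):
        if block_id in self.entries:
            # Hit: serve from cache and mark the block as recently used.
            self.hits += 1
            self.entries.move_to_end(block_id)
            return self.entries[block_id]
        # Miss: serve from "disk" and keep a copy on the way out.
        self.misses += 1
        data = self.backing_store[block_id]
        self.entries[block_id] = data
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used
        return data


disk = {n: f"block-{n}" for n in range(100)}
cache = ReadThroughCache(disk, capacity=8)
for block_id in [1, 2, 1, 1, 3, 2]:
    cache.read(block_id)
print(cache.hits, cache.misses)  # 3 3
```

Each block misses once and hits on every repeat request. Nothing is migrated ahead of time; the only extra work is keeping a copy of data that was being read anyway.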
(It’s worthwhile to note that NetApp’s cache solutions actually offer more than “simple” caching: we can even cache things like file/block metadata. And customers can tune their cache to behave how they want it.)
Below is a graph from a customer’s benchmark. It was a large SQL Server read, but what is particularly interesting is the behavior of the graph: throughput (in red) goes up while latency (in blue) actually drops!
If you were seeking a performance increase via tiering, there would be two possibilities. If your data were already tiered, throughput would go up while latency would remain the same. If your data weren’t already tiered, throughput would decrease and latency would increase while the data was being tiered; only after the tiering completed would you actually see an increase in throughput.
For gaining performance in your storage system, caching is simply better than tiering.