Why cache is better than tiering for performance

When it comes to adjusting (either increasing or decreasing) storage performance, two approaches are common: caching and tiering. Caching refers to a process whereby commonly-accessed data gets copied into the storage controller’s high-speed, solid-state cache. Therefore, a client request for cached data never needs to hit the storage’s disk arrays at all; it is simply served right out of the cache. As you can imagine this is very, very fast.

Tiering, in contrast, refers to the movement of a data set from one set of disk media to another; be it from slower to faster disks (for high-performance, high-importance data) or from faster to slower disks (for low-performance, low-importance data). For example, you may have your high-performance data living on a SSD, FC or SAS disk array, and your low-performance data may only require the IOps that can be provided by relatively low-performance SATA disks.

Both solutions have pros and cons. Cache is typically less configurable by the user, as the cache’s operation will be managed by the storage controller. It is considerably faster, as the cache will live on the bus — it won’t need to traverse the disk subsystem (via SAS, FCAL etc.) to get there, nor will it have to compete with other I/O along the disk subsystem(s). But, it’s also more expensive: high-grade, high-performance solid-state cache memory is more costly than SSD disks. Last but not least, the cache needs to “warm up” in order to be effective — though in the real world this does not take long at all!

Tiering’s main advantages are that it is more easily tunable by the customer. However, all is not simple: in complex environments, tuning the tiering may literally be too complex to bother with. Also, manual tiering relies on you being able to predict the needs of your storage, and adjust tiering automatically: how do you know tomorrow which business application will require the hardest hit? Again, in complex environments, this relatively simply question may be decidedly difficult to answer. On the positive side, tiering offers more flexibility in terms of where you put your data. Cache is cache, regardless of environment; data is either on the cache or it’s not. On the other hand, tiering lets you take advantage of more types of storage: SSD or FC or SAS or SATA, depending on your business needs.

But if you’re tiering for performance (which is the focus of this blog post), then you have to deal with one big issue: the very act of tiering increases the load on your storage system! Tiering actually creates latency as it occurs: in order to move the data from one storage tier to another, we are literally creating IOps on the storage back-end in order to accomplish the performance increase! That is, in order to get higher performance, we’re actually hurting performance in the meantime (i.e., while the data is moving around.)

In stark contrast, caching reduces latency and increases throughput as it happens. This is because the data doesn’t really move: the first time data is requested, a cache request is made (and misses — it’s not in the cache yet) and the data is served from disk. On it’s way to the customer, though, the data will stay in the cache for a while. If it’s requested again, another cache request is made (and hits — the data is already in the cache) and the data is served from cache. And it’s served fast!

(It’s worthwhile to note that NetApp’s cache solutions actually offer more than “simple” caching: we can even cache things like file/block metadata. And customers can tune their cache to behave how they want it.)

Below is a graph from a customer’s benchmark. It was a large SQL Server read, but what is particularly interesting is the behavior of the of the graph: throughput (in red) goes up while latency (in blue) actually drops!

If you were seeking a performance augmentation via tiering, there would have been two different possibilities. If your data was already tiered, throughput will go up while latency will remain the same. If your data wasn’t already tiered, throughput will decrease as latencies will increase as the data is tiered; only after the tiering is completed will you actually see an increase in throughput.

For gaining performance in your storage system, caching is simply better than tiering.


2 thoughts on “Why cache is better than tiering for performance

  1. Exceptional points until backup comes into question.
    if you look at Databases then your 100% on the money, but if you review the vast buckets of unstructured data then your statement can be a underlying solution in smaller unified pools with policy …
    By saying Caching is better than it is best to limit the scope of the topic to either block/file structured/unstructured.
    since my 10 year old compliant data will not likely become hot and does not need to be in a 10TB backup and can comfortably live on a pile of SSD I backup once a year when it does change vs every week in a differential and a full.

    • Hi Stephan,

      Dimitris from NetApp here.

      Maybe you are confusing tiering with archiving. Or maybe I didn’t get your comment.

      With archiving you’d send unchanged files to, say, 3TB SATA drives, and not back them up all the time (by using possibly some kind of archiving software or appliance that creates “stub” files pointing to the real files).

      Performance tiering and caching are altogether different animals than archiving.

      With good caching, even files on the 3TB drives would enjoy decent performance if they did have to be accessed.

      I’d say both are needed in a sufficiently large and complex environment (good caching and archiving).



Leave a Reply to Stephan Cancel reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s