Data Management Glossary
Cloud Tiering
What is Cloud Tiering?
Cloud tiering definition: Cloud tiering (also referred to as cloud archiving or archive to the cloud) is a set of techniques that offload less frequently used data, known as cold data, from expensive on-premises file storage or Network Attached Storage (NAS) to cheaper storage in the cloud, typically object storage such as Amazon S3. It is becoming a critical capability for managing enterprise file workloads across the hybrid cloud. Cloud tiering is a variant of data tiering. The term “data tiering” originally described moving data between tiers or classes of storage within a single storage system, but it has since expanded to include tiering or archiving data from a storage system to clouds and other storage systems.
Cloud Tiering Transparently Extends Enterprise File Storage to the Cloud
Enterprises today are increasingly trying to move core file workloads to the cloud. Since file data can be voluminous, involving billions of files, migrating file data to the cloud can take months and create disruption.
A simpler approach is to offload files to the cloud gradually, without changing the end-user experience. Cloud tiering (or archiving to specific cloud tiers) does this by moving infrequently used cold data to a cheaper cloud storage tier while keeping the data accessible from its original location. Users can thereby transparently extend on-premises capacity with the cloud.
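As a rough illustration of how cold data might be identified, the sketch below walks a directory tree and lists files whose last access time is older than a chosen threshold. The function name `find_cold_files` and the one-year threshold are assumptions made for this example, not part of any product. It also assumes the file system records access times reliably, which is not the case on volumes mounted with access-time updates disabled.

```python
import os
import time
from pathlib import Path
from typing import Iterator, Optional, Tuple

SECONDS_PER_DAY = 86_400


def find_cold_files(
    root: str, min_age_days: float, now: Optional[float] = None
) -> Iterator[Tuple[Path, int]]:
    """Yield (path, size_in_bytes) for files not accessed in min_age_days.

    Hypothetical helper: "cold" here simply means an old last-access time.
    """
    now = time.time() if now is None else now
    cutoff = now - min_age_days * SECONDS_PER_DAY
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                stats = path.stat()
            except OSError:
                # File vanished or is unreadable; skip it rather than abort.
                continue
            if stats.st_atime < cutoff:
                yield path, stats.st_size
```

Calling `sum(size for _, size in find_cold_files("/some/share", 365))` would estimate how many bytes are candidates for tiering; the path shown is a placeholder.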
Cloud Tiering Can Yield Significant Savings If Done Correctly
Cloud object storage is cost-efficient if used correctly. Most cloud providers charge not only for storage but also for retrieving data, and they charge egress fees when data leaves the cloud. Retrieval fees usually take the form of charges for “get” and “put” API calls, and egress fees are charged by the amount of data read from outside the cloud. Because these fees grow with how often data is accessed, the data best suited for cloud tiering is data that is rarely read: snapshots, logs, backups and other cold data.
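The three charges described above can be combined into a simple monthly cost estimate. The sketch below is illustrative only: the `CloudPricing` fields and the prices in the example are made-up placeholders, not any provider's actual rates, and real price lists typically have more dimensions (storage class, region, request type).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CloudPricing:
    """Hypothetical price list; every value is a placeholder."""
    storage_per_gb_month: float  # charge for keeping 1 GB for a month
    per_1000_requests: float     # charge per 1,000 get/put API calls
    egress_per_gb: float         # charge per GB read from outside the cloud


def monthly_cost(
    pricing: CloudPricing, stored_gb: float, requests: int, egress_gb: float
) -> float:
    """Estimate one month's bill as storage + request fees + egress fees."""
    storage = stored_gb * pricing.storage_per_gb_month
    request_fees = (requests / 1000) * pricing.per_1000_requests
    egress = egress_gb * pricing.egress_per_gb
    return storage + request_fees + egress


# Same 10 TB stored, but very different access patterns (made-up numbers).
prices = CloudPricing(storage_per_gb_month=0.01, per_1000_requests=0.005,
                      egress_per_gb=0.05)
cold = monthly_cost(prices, stored_gb=10_000, requests=1_000, egress_gb=5)
hot = monthly_cost(prices, stored_gb=10_000, requests=5_000_000, egress_gb=8_000)
print(f"cold: {cold:.2f}  hot: {hot:.2f}")
```

With any positive request and egress prices, the frequently read data set costs more than the rarely read one for the same stored capacity, which is the reason cold data is the natural candidate for tiering.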
By tiering cold data to the cloud, the on-premises storage array needs to keep only hot data and the most recent logs and snapshots. Across Komprise customers, we have found that typically 60% to 80% of their data has not been accessed in over a year. Tiering this cold data, along with older log files and snapshots, dramatically reduces the capacity needed on the storage array, on the mirrored storage array (if mirroring or replication is in use) and on backup storage. This is why tiering cold data can reduce overall storage costs by as much as 70% to 80%.
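The arithmetic behind this kind of estimate can be sketched as follows. The model is deliberately simple and rests on several assumptions: every on-premises copy (primary, mirror, backup) shrinks in proportion to the cold data removed, the cloud copy needs no separate on-premises backup, and retrieval and egress fees are ignored. The function name `tiering_savings` and all cost figures are hypothetical.

```python
from typing import Mapping


def tiering_savings(
    total_tb: float,
    cold_fraction: float,
    on_prem_cost_per_tb: Mapping[str, float],
    cloud_cost_per_tb: float,
) -> float:
    """Return the fractional cost reduction from tiering cold data.

    on_prem_cost_per_tb maps each on-premises copy (for example primary,
    mirror, backup) to its cost per TB; all values are placeholders.
    """
    if not 0.0 <= cold_fraction <= 1.0:
        raise ValueError("cold_fraction must be between 0 and 1")
    per_tb = sum(on_prem_cost_per_tb.values())
    if total_tb <= 0 or per_tb <= 0:
        raise ValueError("total_tb and on-premises costs must be positive")

    cold_tb = total_tb * cold_fraction
    hot_tb = total_tb - cold_tb
    before = total_tb * per_tb                          # everything on premises
    after = hot_tb * per_tb + cold_tb * cloud_cost_per_tb
    return 1.0 - after / before


# Made-up inputs: 100 TB, 75% cold, arbitrary cost units per TB.
example = tiering_savings(
    total_tb=100,
    cold_fraction=0.75,
    on_prem_cost_per_tb={"primary": 10, "mirror": 10, "backup": 5},
    cloud_cost_per_tb=2,
)
print(f"{example:.0%}")  # prints 69% with these made-up inputs
```

The result is sensitive to the cold fraction and to the ratio of cloud cost to combined on-premises cost, so real savings depend on measured values for both.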
The advantages of tiering cold data to the cloud include:
- Reduced storage acquisition costs. Flash storage, used for fast access to hot data, is expensive. By tiering off infrequently used data you can purchase much less flash storage, reducing acquisition costs.
- Smaller backup footprint and lower backup costs. By continuously tiering off cold data that is not being accessed, you can reduce your backup footprint, backup license costs and backup storage costs, provided the cold data is placed in robust storage (such as that offered by the major cloud service providers).
- Faster, cheaper disaster recovery (DR). As with backup, tiering off cold data dramatically reduces the amount of data that has to be mirrored or replicated.
- Improved storage performance. Running the storage array at lower capacity, and moving access to cold data to another storage device or service, can increase the array's performance.
- Cloud-based AI, ML, compliance checks and other applications on cold data. Tiered cold data still has value, and feeding it into your AI/ML/BI engines is critical to staying competitive. With cold data in the cloud, you can access, search and process it without putting any load on your storage array, which also helps extend the array's life.
Not all cloud tiering choices are the same, however. Cloud tiering implemented correctly at the file level provides all of the above benefits, whereas block tiering to the cloud does not.
To learn more about the differences between cloud tiering at the file level vs the block level, and why so-called cloud pools such as NetApp FabricPool or Dell EMC Isilon CloudPools are not the right approach for cloud tiering, read “What you need to know before jumping into the cloud tiering pool”.