Data Management Glossary
Hierarchical Storage Management (HSM) software, also known as tiered storage, was designed for distributed server environments to identify cold data sets and automatically migrate them from primary disk to less expensive optical and tape storage devices. Dating back to the era of the mainframe, HSM was also supposed to handle file recall requests automatically whenever a user clicked on a stub file.
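To make the idea of "identifying cold data" concrete, here is a minimal Python sketch. It assumes that last-access time is the only signal of coldness and that a file untouched for 365 days counts as cold. The function names (`is_cold`, `find_cold_files`) and the threshold are hypothetical and do not come from any particular HSM product. Real products typically weigh more signals, and access times can be unreliable on file systems mounted with options such as `noatime`.

```python
import os
import time
from typing import Iterator, Optional, Tuple

SECONDS_PER_DAY = 86_400


def is_cold(last_access: float, now: float, cold_after_days: int = 365) -> bool:
    """Return True when the last access is older than the threshold.

    The 365-day default is an illustrative assumption, not an industry rule.
    """
    return (now - last_access) > cold_after_days * SECONDS_PER_DAY


def find_cold_files(
    root: str,
    cold_after_days: int = 365,
    now: Optional[float] = None,
) -> Iterator[Tuple[str, float]]:
    """Walk a directory tree and yield (path, last_access_time) for cold files."""
    now = time.time() if now is None else now
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                atime = os.stat(path).st_atime
            except OSError:
                # The file vanished or is unreadable; skip it rather than abort the scan.
                continue
            if is_cold(atime, now, cold_after_days):
                yield path, atime
```

A scan such as `list(find_cold_files("/data", cold_after_days=180))` would return candidate files for migration to a cheaper tier. The migration itself, and whatever is left behind in place of the file, is deliberately out of scope for this sketch.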
Unfortunately, these early HSM products (see Wikipedia for a history) suffered from a number of deficiencies such as:
- They were custom designed for specific proprietary storage systems, which limited hardware choices and resulted in vendor lock-in.
- Many relied on file server agents that consumed substantial memory and compute resources and operated in the direct data path, degrading performance.
- They used static stub files left in place of the moved data. These static stub files could be corrupted, deleted, or orphaned, making it difficult if not impossible to locate the original source file.
- The early HSM solutions did not scale well. Because they were built on traditional database-driven architectures, HSM performance deteriorated significantly as file counts increased.
- The solutions would disrupt storage system performance, interrupting active usage.
- File recalls could take a long time, especially if the requested file was stored on tape.
These deficiencies were so severe that HSM became a “bad word” among IT professionals. Many of those IT pros believed that the only viable way to manage storage was to just keep adding more capacity to the primary tier.
The data center landscape has since changed, and organizations now have a wide range of data storage options available. Flash memory devices have replaced high-performance physical disk drives as Tier-1 storage. High-performance and commodity hard disks now function as secondary and tertiary storage tiers. Cloud file storage and object storage options are available to handle large, long-term bulk storage requirements. All of these options are needed to combat the unstructured data onslaught (along with data sprawl and high data storage costs) that most organizations are facing. However, the main problem remains: how to automatically detect “warm” and “cold” data sets and then continuously migrate them to the most cost-effective storage tier while also managing the entire file life cycle. As outlined in this early review of Komprise:
In short, we have more storage options than ever but less intelligence about how and when to move our increasing data to which storage platform.
In a 2022 Blocks and Files review, Komprise Intelligent Data Management is referred to as an HSM or Information Lifecycle Management solution. This new category of software is now known as unstructured data management, or by the broader term data services.