File-Based Cloud Tiering with Komprise Intelligent Data Management

At Komprise, we believe that data management is a layer independent of storage. By treating data as separate from the storage in which it resides, it becomes possible to manage data holistically across vendors, whether on-premises storage arrays or cloud providers, and across technologies, whether files or objects. This approach has allowed Komprise to create a vendor-agnostic data management solution that integrates tightly with on-premises and cloud storage to form a hybrid data management platform that works across both.

This open approach to data management is how Komprise offers cloud tiering with 75% lower cloud egress costs and 300% lower TCO, while giving you full access to your data in the cloud without lock-in.

Komprise uses open standards to read and write data, so data is always stored in a format native to the storage service and there is no data lock-in. This allows customers to manage their data independently of the storage devices or data management services they use. It also future-proofs customers, allowing them to select and later change or update their storage devices. As an example, one of our customers tiered cold data to tape. Later, they shifted to an on-premises object store and recently moved to the cloud. Through it all, the one constant was Komprise.

In my last post, I reviewed what you need to know before jumping into the cloud tiering pool – and why the approach you use can result in 75% higher cloud retrieval costs. In this post, I’ll dive deeper into file-based vs. block-level tiering to the cloud and the benefits of file-based cloud tiering from Komprise.

Unlike traditional cloud tiering solutions that are storage-centric and use block tiering, Komprise tiers an entire file. Komprise replaces the file on the source storage array with a symbolic link. A symbolic link is a standard file system construct supported by both the NFS and SMB protocols. When a file is tiered to the cloud, it is written in the format recognized by the cloud. For example, if the file is tiered to Amazon S3, it is stored as an object and can be read by a standard S3 browser without any third-party software, Komprise included. If the file is tiered to Amazon EFS, it is stored as an NFS file. If files are tiered to SMB cloud storage such as Amazon FSx, they are stored as SMB files.
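The idea of swapping a whole file for a symbolic link can be illustrated with a minimal Python sketch. This is not Komprise's implementation: the `tier_file` helper and the directory names are hypothetical, a local directory stands in for the cloud target, and the link here points straight at the moved file, whereas a Komprise link directs the request to Komprise. It also assumes a POSIX system where symlinks can be created without special privileges.

```python
import os
import shutil
import tempfile
from pathlib import Path


def tier_file(source: Path, target_dir: Path) -> Path:
    """Move `source` into `target_dir` and leave a symbolic link behind.

    Hypothetical helper for illustration only.
    """
    target_dir.mkdir(parents=True, exist_ok=True)
    tiered = target_dir / source.name
    shutil.move(str(source), str(tiered))  # the whole file moves, unchanged
    os.symlink(tiered, source)             # the original path becomes a link
    return tiered


def demo_tiering() -> dict:
    """Tier one file and report what both paths look like afterwards."""
    with tempfile.TemporaryDirectory() as root_name:
        root = Path(root_name)
        share = root / "share"            # stands in for the source array
        cold_tier = root / "cold_tier"    # stands in for the cloud target
        share.mkdir()
        report = share / "report.txt"
        report.write_text("quarterly numbers")

        tiered = tier_file(report, cold_tier)
        return {
            "is_symlink": report.is_symlink(),
            "content_via_link": report.read_text(),
            "content_on_target": tiered.read_text(),
        }


if __name__ == "__main__":
    print(demo_tiering())
```

The point of the sketch is that the file on the target is a complete, ordinary file: it can be read through the link at its original path or directly on the target, with no proprietary format in between.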

Komprise uses a patented Transparent Move Technology™ (TMT) to transparently tier cold data. Komprise is the only vendor that provides:

  • transparent tiering from the source storage array,
  • with native access to the cold data on the target,
  • without getting in front of hot data on the source.

When a user tries to access a tiered file, the symbolic link directs the file system request to Komprise. Komprise fetches the file from the cloud and responds to the file system request. The tiered file is streamed back and cached by Komprise to minimize latency and eliminate further egress and API costs if the file is accessed again. Komprise provides a custom rehydration policy that users can configure to meet their needs, so data need not be rehydrated on the first access. Komprise also provides a bulk recall feature if needed.
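The recall path described above (fetch once, cache, rehydrate only when policy says so) can be sketched as a small read-through cache. Everything in this sketch is an assumption made for illustration: the `TieredFileGateway` class, the `rehydrate_after` setting, and the in-memory dictionaries standing in for the cloud target and the source array are hypothetical and say nothing about how Komprise implements or configures this.

```python
from dataclasses import dataclass, field


@dataclass
class TieredFileGateway:
    """Toy model of a recall path with a cache and a rehydration policy."""

    cloud: dict                    # path -> bytes, stands in for the cloud target
    rehydrate_after: int = 3       # hypothetical policy: rehydrate on the Nth access
    cache: dict = field(default_factory=dict)
    source: dict = field(default_factory=dict)   # files rehydrated to the source
    access_counts: dict = field(default_factory=dict)
    cloud_reads: int = 0           # each cloud read would incur egress and API cost

    def read(self, path: str) -> bytes:
        # Rehydrated files are served by the source with no gateway work.
        if path in self.source:
            return self.source[path]

        count = self.access_counts.get(path, 0) + 1
        self.access_counts[path] = count

        # Only the first access goes to the cloud; later ones hit the cache.
        if path not in self.cache:
            self.cloud_reads += 1
            self.cache[path] = self.cloud[path]
        data = self.cache[path]

        # Rehydrate only once the policy threshold is reached.
        if count >= self.rehydrate_after:
            self.source[path] = data
        return data


if __name__ == "__main__":
    gateway = TieredFileGateway(cloud={"a.txt": b"cold"}, rehydrate_after=3)
    for _ in range(4):
        gateway.read("a.txt")
    print(gateway.cloud_reads, "a.txt" in gateway.source)
```

In this toy model, a file read once or twice never lands back on the source, and repeated reads cost a single cloud fetch.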

Komprise works seamlessly with storage arrays’ block tiering solutions, virus scanners, and backup software. Typically, a storage array’s block tiering is used to tier snapshots and certain log files that are almost never accessed, which improves storage efficiency. Most backup software has a configuration setting to prevent following symbolic links. When the backup software sees a symbolic link, it simply backs up the link, and the tiered file remains in the cloud. To restore a tiered file, restore its link from the backup and follow the link to access the file.
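To show what "backing up the link, not the file" looks like in practice, here is a small sketch that uses Python's standard `tarfile` module as a stand-in for backup software. The file names and the `backup_share` helper are hypothetical. The `dereference=False` argument is `tarfile`'s default and is spelled out only to make the behavior visible. The sketch assumes a POSIX system.

```python
import os
import tarfile
import tempfile
from pathlib import Path


def backup_share(share: Path, archive: Path) -> list:
    """Archive `share` without following symbolic links.

    Returns (member name, is_symlink) pairs for everything archived.
    """
    # dereference=False stores a symlink as a link entry, not as file data.
    with tarfile.open(archive, "w", dereference=False) as tar:
        tar.add(share, arcname="share")
    with tarfile.open(archive) as tar:
        return [(member.name, member.issym()) for member in tar.getmembers()]


def demo_backup() -> list:
    with tempfile.TemporaryDirectory() as root_name:
        root = Path(root_name)
        share = root / "share"
        share.mkdir()
        tiered = root / "cold_tier.bin"          # stands in for the tiered file
        tiered.write_bytes(b"x" * 1024)
        os.symlink(tiered, share / "big.bin")    # link left on the share
        return backup_share(share, root / "backup.tar")


if __name__ == "__main__":
    print(demo_backup())
```

The archive records `share/big.bin` as a link entry, so the 1 KB of tiered data is never copied into the backup.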

[Diagram: Komprise file tiering for cloud tiering vs. using storage tiering for cloud tiering]

Komprise is designed to transparently tier data to the cloud while providing the most cost savings and enabling maximum flexibility without any vendor or data lock-in.

Block versus File Tiering: Why Komprise Transparent File Tiering

It is instructive to compare the two approaches side by side to understand the implications and advantages of each. The table also highlights the questions you should ask when evaluating a cloud tiering solution. Here is a summary:

| Core Features of Tiering | Komprise TMT™ File Tiering | Storage Array’s Block Tiering |
|---|---|---|
| Transparent, continuous tiering | YES | YES |
| Flexible tiering policies | YES. Komprise provides a range of ages as well as exclusions based on size, file type, and directory. In the next major release, Komprise will allow users to granularly specify what to tier based on custom queries. | NO. Generally, you can only specify an age, and even that can be limited, for example where cold data is defined as anything 6 months or older. |
| Tiering across multi-vendor storage arrays | YES. Komprise is vendor agnostic. It can be deployed across most common storage arrays, allowing one consistent, global way to tier data, and it provides a single pane of glass across multi-vendor storage systems. | NO. Each storage array only supports tiering from its own storage devices, and you must manage each cluster independently. There is no single pane of glass to manage tiering across the clusters. |
| Prevent rehydrating tiered data on first access | YES. Komprise allows you to configure exactly when accessed data is rehydrated. | NO. Tiered data that is accessed is immediately rehydrated. This requires that extra storage be kept in reserve for rehydration, thereby reducing the cost advantage. |
| Avoid performance impact on hot data | YES. Komprise is involved only when cold data is accessed. | NO. Since the core tiering engine is used to tier data, high latency to the target can impact performance. Furthermore, the block tiering approach requires constant traffic to the cloud to defragment blocks stored in objects. This constant traffic drives up cloud costs and impacts performance. For this reason, storage array vendors clearly indicate that if more than 300TB is to be tiered, local object storage should be used. |
| Fast access to tiered data | YES. Komprise streams the data and does not wait for the entire file to be read. Komprise caches the data locally so that future requests are fast. | YES. Blocks are read back instead of the entire file. The blocks are stored back on the array so that future requests are fast. However, this approach requires that some storage be left unused to house accessed data, which reduces cost savings. |
| Bulk recall of tiered data | YES. Komprise provides a bulk recall feature for cases where a large set of data must be brought back. | NO. Data is brought back as it is accessed. |
| Native cloud access to tiered data (NO DATA LOCK-IN) | YES. Komprise writes tiered data in the format used by the target. For instance, data tiered to S3 is written in S3 format. The data can be accessed from the source or, in this case, from the cloud using a standard S3 browser. This enables processing of tiered data without burdening the source storage array. | NO. The data is in a proprietary format and the entire file may not be on the target. The tiered data can only be read from the source. |
| Showback or chargeback | YES. Today, the Komprise UI shows the percentage of a share’s total storage that is local and the percentage that has been tiered. In an upcoming release, a full report will be provided for easier chargeback. | NO |
| Decommissioning without rehydration (NO VENDOR LOCK-IN) | YES. With Komprise you can migrate from one vendor’s storage array to another without rehydrating the huge quantity of tiered data. The tiered data will be transparently accessible from the new storage array. | NO. It is very difficult to switch vendors once you start tiering with a storage array’s tiering solution. |
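The "flexible tiering policies" row in the comparison is easier to picture with a concrete rule. The sketch below selects files by age and applies exclusions by size, file type, and directory. It is a hypothetical illustration only: the `TieringPolicy` and `FileInfo` classes, their field names, and the sample paths are assumptions and do not reflect Komprise's configuration schema or user interface.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class FileInfo:
    path: str
    size_bytes: int
    days_since_access: int


@dataclass(frozen=True)
class TieringPolicy:
    """Hypothetical policy shape: an age threshold plus exclusions."""

    min_age_days: int
    min_size_bytes: int = 0
    excluded_suffixes: frozenset = frozenset()
    excluded_dirs: tuple = ()

    def selects(self, info: FileInfo) -> bool:
        path = PurePosixPath(info.path)
        if info.days_since_access < self.min_age_days:
            return False                      # still hot
        if info.size_bytes < self.min_size_bytes:
            return False                      # too small to be worth tiering
        if path.suffix.lower() in self.excluded_suffixes:
            return False                      # excluded file type
        if any(PurePosixPath(d) in path.parents for d in self.excluded_dirs):
            return False                      # inside an excluded directory
        return True


def select_cold_files(files, policy):
    """Return the paths the policy would tier, in input order."""
    return [info.path for info in files if policy.selects(info)]


EXAMPLE_FILES = [
    FileInfo("/share/projects/old.mov", 5_000_000, 400),
    FileInfo("/share/projects/new.mov", 5_000_000, 10),
    FileInfo("/share/projects/tiny.txt", 100, 400),
    FileInfo("/share/db/data.mdf", 5_000_000, 400),
    FileInfo("/share/scratch/tmp/x.dat", 5_000_000, 400),
]

EXAMPLE_POLICY = TieringPolicy(
    min_age_days=365,
    min_size_bytes=4096,
    excluded_suffixes=frozenset({".mdf"}),
    excluded_dirs=("/share/scratch",),
)

if __name__ == "__main__":
    print(select_cold_files(EXAMPLE_FILES, EXAMPLE_POLICY))
```

With these sample values only `/share/projects/old.mov` qualifies. A policy that can only express an age threshold would also have tiered the small file, the database file, and the scratch data.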

At Komprise, we believe it is critical to know first before you make important decisions or investments, especially with today’s large data sets. An incorrect decision can cost millions of dollars and lost time.

In my next post I’ll review why it’s important to tier cold data to the cloud and then I’ll review some of the important considerations for effective cold data tiering to the cloud. Later, I’ll review a cost model comparison of tiering 1PB of data to cloud storage with Komprise vs. alternative options. I want to make sure it’s clear why when it comes to cloud tiering and cloud data management you don’t compromise – you Komprise!
