Data Management Glossary
Cloud Data Migration
What is Cloud Data Migration?
Cloud data migration is the process of relocating either all or a part of an enterprise’s data to a cloud infrastructure. Cloud data migration is often the most difficult and time-consuming part of an overall cloud migration project. Other elements of cloud migration involve application migration and workflow migration.
Cost, Complexity and Time: Why Cloud Data Migrations are Difficult
Cloud data migrations are usually the most laborious and time-consuming part of a cloud migration initiative. Why? Data is heavy: data footprints often range from hundreds of terabytes to petabytes and can involve billions of files and objects. Some key reasons why cloud data migrations fail include:
- Lack of Proper Planning: Cloud data migrations are often done in an ad hoc fashion, without proper planning or analytics on the data set.
- Improper Choice of Cloud Storage Destination: Most public clouds offer many different classes and tiers of storage, each with its own costs and performance characteristics. Many cloud storage classes also carry retrieval and egress costs, so picking the right class for a data migration means weighing not just the price and performance of storing the data but also the cost of accessing it. Intelligent tiering and intelligent archiving techniques that span both cloud file and object storage classes are important to ensure the right data is in the right place at the right time.
- Ensuring Data Integrity: Data migrations involve migrating the data along with its metadata. For a cloud data migration to succeed, not only should all the data be moved over with full fidelity, but all the access controls, permissions, and metadata should also move over. Often, this means not just moving data but also mapping access controls, permissions, and metadata from one storage environment to another.
- Downtime Impact: Cloud data migrations can often take weeks to months to complete. Clearly, users should not lose access to the data they need for this entire time. Minimizing downtime, even during a cutover, is very important to reduce productivity impact.
- Slow Networks, Failures: Often cloud data migrations are done over a Wide Area Network (WAN), which can have other data moving on it and hence deliver intermittent performance. Plus, there may be times when the network is down or the storage at either end is unavailable. Handling all these edge conditions is extremely important – you don’t want to be halfway through a month-long cloud data migration only to encounter a network failure and have to start all over again.
- Time Consuming: Because cloud data migrations involve moving large amounts of data, managing them can require a lot of manual effort, which is laborious and tedious.
- Sunk Costs: Cloud data migrations are often time-bound projects – once the data is migrated, the project is complete. So, if you invest in tools to address cloud data migrations, you may have sunk costs once the cloud data migration is complete.
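The network-failure risk described above can be reduced by recording progress as the migration runs, so that an interrupted job resumes where it stopped instead of starting over. The following is a minimal sketch of that idea for a local directory copy, not a description of how any particular migration product works. The function name `migrate_with_checkpoint` and the one-path-per-line journal format are assumptions made for illustration, and the sketch assumes file names do not contain newline characters.

```python
import shutil
from pathlib import Path


def migrate_with_checkpoint(source: Path, dest: Path, checkpoint: Path) -> list[str]:
    """Copy every file under `source` into `dest`, skipping files that an
    earlier run already recorded in the `checkpoint` journal.

    Returns the relative paths copied during this run.
    """
    # Load the set of relative paths that earlier runs finished copying.
    done = set(checkpoint.read_text().splitlines()) if checkpoint.exists() else set()
    copied = []

    # Append-only journal: one completed relative path per line.
    with checkpoint.open("a") as journal:
        for path in sorted(p for p in source.rglob("*") if p.is_file()):
            rel = path.relative_to(source).as_posix()
            if rel in done:
                continue  # already migrated in an earlier run

            target = dest / rel
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, target)  # copy2 also tries to preserve timestamps and mode bits

            # Record progress only after the copy succeeded, and flush so an
            # interruption loses at most the file that was in flight.
            journal.write(rel + "\n")
            journal.flush()
            copied.append(rel)

    return copied
```

Calling the function a second time with the same journal copies only files that were not recorded earlier. A production tool would also need to detect files that changed after they were copied, which this sketch does not attempt.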
Cloud data migrations can involve Network Attached Storage (NAS) or file data, object data, or block data. Of these, migrations of file data and object data are particularly difficult and time-consuming because file and object data are much larger in volume.
To learn more about the seven reasons why cloud data migrations are dreaded, watch the webinar.
Cloud Data Migration Strategies
Different cloud data migration strategies are used depending on whether file data or object data need to be migrated. Common methods for moving these two types of data through cloud migration solutions are described in further detail below.
Cloud Data Migration for File Data aka NAS Cloud Data Migrations
File data is often stored on Network Attached Storage and is typically accessed over the NFS and SMB protocols. File data can be particularly difficult to migrate because of its size, volume, and richness. It often involves a mix of large and small files, and data migration techniques that do well with large files often perform poorly with small ones, so data migration solutions need to handle both efficiently. File data is also voluminous, often involving billions of files, and reliable cloud data migration solutions need to handle such volumes efficiently. Finally, file data is very rich: it has metadata, access control permissions, and hierarchies. A good file data migration solution should preserve all the metadata, access controls, and directory structures. Often, migrating file data involves mapping this information from one file storage format to another. Sometimes, file data may need to be migrated to an object store. In these situations, the file metadata needs to be preserved in the object store so the data can be restored as files at a later date. Techniques such as MD5 checksums are important to ensure the data integrity of file data migrations to the cloud.
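As an illustration of the checksum technique mentioned above, the following minimal sketch computes an MD5 digest of a file before and after it is copied and compares the two. It uses Python's standard `hashlib` module. The function names `md5_of_file` and `verify_copy` are assumptions chosen for this example, and the sketch assumes both copies are readable as local files.

```python
import hashlib
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the hexadecimal MD5 digest of a file, reading it in chunks
    so that large files do not have to fit in memory."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_copy(source: Path, dest: Path) -> bool:
    """Return True when the source and destination files have the same MD5 digest."""
    return md5_of_file(source) == md5_of_file(dest)
```

A matching digest indicates the file contents arrived intact. It says nothing about permissions or other metadata, which have to be checked separately.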
Cloud Data Migration for Object Data (S3 Data Migrations or Object-to-Cloud Data Migrations or Cloud-to-Cloud Data Migrations)
Cloud data migrations of object data are relatively new but quickly gaining momentum as the majority of enterprises move to a multi-cloud architecture. The Amazon Simple Storage Service (S3) protocol has become a de facto standard for object stores and public cloud providers, so most cloud data migrations of object data are S3-based data migrations.
There are 3 common use cases for cloud object data migrations:
- Data migrations from an on-premises object store to the public cloud: Many enterprises have adopted an on-premises object storage solution. Most of these object storage solutions follow the S3 protocol. Customers are now looking to analyze data on their on-premises object storage and migrate some or all of that data to a public cloud storage option such as Amazon S3 or Microsoft Azure Blob.
- Cloud-to-cloud data migrations and cloud-to-cloud data replications: Enterprises looking to switch public cloud providers need to migrate data from one cloud to another. Sometimes, it may also be cost-effective to replicate across clouds as opposed to replicating within a cloud. This also improves data resiliency and provides enterprises with a multi-cloud strategy. Cloud-to-cloud data replication differs from cloud data migration because it is ongoing – as data changes on one cloud, it is copied or replicated to the second cloud.
- S3 data migrations: This is a generic term that refers to any object or cloud data migration done using the S3 protocol. The Amazon Simple Storage Service (S3) protocol has become a de facto standard. Any Object-to-Cloud, Cloud-to-Cloud or Cloud-to-Object migration can typically be classified as an S3 data migration.
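The difference between a one-time migration and ongoing replication, described in the use cases above, can be sketched as a planning step that compares two object listings and is run repeatedly. The example below is a simplified illustration, not any vendor's replication algorithm. It assumes each store can be listed as a mapping from object key to a content fingerprint (for example, a checksum), and the function name `plan_replication` is hypothetical.

```python
def plan_replication(
    source: dict[str, str], dest: dict[str, str]
) -> tuple[list[str], list[str]]:
    """Compare two object listings (key -> content fingerprint).

    Returns (keys_to_copy, keys_to_delete):
    - keys_to_copy: objects that are missing from dest or whose fingerprint differs
    - keys_to_delete: objects in dest that no longer exist in source
    """
    keys_to_copy = sorted(
        key for key, fingerprint in source.items() if dest.get(key) != fingerprint
    )
    keys_to_delete = sorted(key for key in dest if key not in source)
    return keys_to_copy, keys_to_delete
```

A one-time migration would apply the copy list once and finish. A replication job would rerun the comparison on a schedule, or react to change notifications, so that the second cloud keeps following the first. Whether to apply the delete list is a policy decision that depends on the use case.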
Secure Cloud Data Migration Tools
Cloud data migrations can be performed using either free tools, which require extensive manual involvement, or commercial data migration solutions. Sometimes Cloud Storage Gateways are used to move data to the cloud, but these require heavy hardware and infrastructure setup. Cloud data management solutions offer a streamlined, cost-effective, software-based approach to managing cloud data migrations without requiring expensive hardware infrastructure and without creating data lock-in. Look for elastic data migration solutions that can dynamically scale to handle data migration workloads and adjust to your demands.