This article was adapted from its original version on The New Stack.
In a hybrid cloud environment, you often wind up with tool sprawl, making it hard to manage disparate environments. Simplify this by getting data visibility so you can apply rules and policies to keep IT and data assets under control.
Data management is hard enough when your data lives in a single data center or cloud. But when you opt for a hybrid cloud strategy, you face a whole new level of complexity in tracking, securing and governing your data. The main reason is that a hybrid cloud model gives you many more data vendors, tools and protocols to contend with than you would have if all of your data lived in a single environment.
You might, for example, have some data that lives in local file systems on on-prem Windows and Linux servers. Meanwhile, you also host some data in an NFS or SMB file share running on your corporate network. At the same time, you use a cloud-based object storage service like AWS S3 or Azure Blob Storage. You might have other storage solutions, such as NetApp, in the mix to boot.
Not only does each storage vendor or protocol in a scenario like this involve a different storage location, but it also entails an entirely independent set of tools for identifying, managing, backing up and protecting data. Securing data on a Linux file system requires you to use Unix tooling to set file permissions, for example, whereas with cloud-based data, you’d use your cloud vendor’s access management framework, like AWS IAM.
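To make that contrast concrete, here is a minimal sketch in Python. The first function restricts a local file using standard Unix permission bits; the second only builds the JSON text of an IAM-style policy that allows read access to one S3 bucket. The function names and the bucket name `example-bucket` are hypothetical, the permission example assumes a POSIX system, and nothing here is attached to a real AWS account.

```python
import json
import os
import stat
import tempfile


def restrict_local_file(path: str) -> int:
    """Set mode 640: owner read/write, group read, no access for others."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR | stat.S_IRGRP)
    # Return only the permission bits so the caller can verify them.
    return stat.S_IMODE(os.stat(path).st_mode)


def read_only_bucket_policy(bucket: str) -> str:
    """Build (but do not apply) a read-only policy document for one bucket."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": [
                    f"arn:aws:s3:::{bucket}",      # the bucket itself (for listing)
                    f"arn:aws:s3:::{bucket}/*",    # the objects inside it
                ],
            }
        ],
    }
    return json.dumps(policy, indent=2)


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        local_path = tmp.name
    print(oct(restrict_local_file(local_path)))
    os.remove(local_path)
    print(read_only_bucket_policy("example-bucket"))  # hypothetical bucket name
```

The two halves express the same intent, limiting who can read a piece of data, yet they share no vocabulary, which is exactly the kind of duplication a hybrid environment multiplies.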
The bottom line is that determining where your data lives — let alone managing it effectively — requires you to juggle a disparate set of tools when you have a hybrid cloud strategy. You have to navigate a variety of data silos and master numerous protocols and platforms to keep your data secure and enforce governance policies.
A Better Approach to Hybrid Cloud Data Management
You can’t erase the siloed nature of data in a hybrid cloud. It just comes with the territory. What you can do, however, is simplify and streamline the way you work with data across the various silos that exist within a hybrid cloud.
There are four key practices to follow in this regard:
- Achieve full data visibility: The first is simply to know which data you have by creating a global file index. After all, you can’t govern data very effectively if you don’t know where it exists or which protocols or platforms it depends on. Building a data index that identifies all your data across the various assets in your hybrid environment can ensure you know where your data resides at all times. Some storage vendors can index only their own storage platform. Such an index is proprietary and limited to that silo, so IT would need to integrate the per-vendor indexes manually, along with an index of any data stored in the cloud.
- Build for accuracy: The second step toward better hybrid cloud data management is ensuring that your data index is continuously updated. It’s very likely that your data architecture changes constantly. You may move data from one location to another within your hybrid environment, for example, or introduce new types of data services. It’s critical that your data index remain flexible and scalable so that it can reflect these changes as they occur. Your index needs to support new data formats, storage locations, protocols and so on, so that it can adapt along with your business.
- Operate by rules and policy: Third, strive to deploy an actionable data management strategy. You should be able to write policies that define how data should be managed based on attributes you define and then enforce those policies automatically across your hybrid environment. Consider an organization that needs to delete data of a certain type (such as ex-employee or ex-customer data) after a set period of time to meet compliance requirements. Instead of attempting to meet that rule imperatively, which would mean finding the data and then deleting it manually, the organization can adopt a declarative approach: it writes a policy that says “when data is tagged with [insert attribute here], delete it after one year.” The rule is then enforced continuously across the environment, regardless of where exactly the data is stored.
- Maintain excellent user experience: Finally, the best hybrid cloud data management practices should enforce data governance rules without disrupting user access or the way that your workloads operate. They shouldn’t slow down performance or cause application errors even as they move data around, modify access controls and so on.
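The first three practices can be sketched together in a few lines of Python. This is an illustration under stated assumptions, not any vendor’s implementation: `IndexEntry`, `GlobalIndex`, `RetentionPolicy` and `enforce` are hypothetical names, the `ex-employee` tag and the URIs are made-up sample data, and the per-backend delete functions are stand-ins for whatever each silo’s real tooling would be.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Dict, List


@dataclass(frozen=True)
class IndexEntry:
    uri: str                      # e.g. "s3://bucket/key" or "nfs://host/path"
    backend: str                  # which silo holds the data
    modified: datetime
    tags: frozenset = frozenset()


@dataclass(frozen=True)
class RetentionPolicy:
    tag: str                      # attribute the policy applies to
    max_age: timedelta            # delete once data is older than this


class GlobalIndex:
    """One catalog covering every silo, keyed by URI."""

    def __init__(self) -> None:
        self._entries: Dict[str, IndexEntry] = {}

    def upsert(self, entry: IndexEntry) -> None:
        # Re-indexing a moved or changed file overwrites the stale record.
        self._entries[entry.uri] = entry

    def remove(self, uri: str) -> None:
        self._entries.pop(uri, None)

    def entries(self) -> List[IndexEntry]:
        return list(self._entries.values())


def enforce(index: GlobalIndex, policy: RetentionPolicy,
            deleters: Dict[str, Callable[[str], None]],
            now: datetime) -> List[str]:
    """Delete every tagged entry older than the policy allows, in any silo."""
    deleted = []
    for entry in index.entries():
        if policy.tag in entry.tags and now - entry.modified > policy.max_age:
            deleters[entry.backend](entry.uri)   # silo-specific delete call
            index.remove(entry.uri)              # keep the index accurate
            deleted.append(entry.uri)
    return deleted


if __name__ == "__main__":
    index = GlobalIndex()
    index.upsert(IndexEntry("s3://hr-archive/alice.csv", "s3",
                            datetime(2023, 1, 1), frozenset({"ex-employee"})))
    index.upsert(IndexEntry("nfs://filer01/hr/bob.csv", "nfs",
                            datetime(2024, 3, 1), frozenset({"ex-employee"})))
    # Stand-in deleters that only log; real ones would call each silo's API.
    deleters = {name: (lambda uri, n=name: print(f"[{n}] delete {uri}"))
                for name in ("s3", "nfs")}
    policy = RetentionPolicy(tag="ex-employee", max_age=timedelta(days=365))
    print(enforce(index, policy, deleters, now=datetime(2024, 6, 1)))
```

The policy itself says nothing about S3 or NFS. It names only a tag and an age, and the index supplies the location, which is what lets the same rule apply wherever the data happens to live.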
There’s no denying that hybrid cloud architectures make data management inherently more complex. With the right approach, however, it’s possible to manage that complexity in a way that ensures both efficiency and consistency, no matter how many data silos, tools or protocols exist within your cloud environment.