This blog was adapted from the original article on Beta News.

Data observability is one of those hot and trendy terms which means different things to different people. While there are many definitions, this one from IBM is easy to understand: “Data observability refers to the practice of monitoring, managing and maintaining data in a way that ensures its quality, availability and reliability across various processes, systems and pipelines within an organization.”
Unstructured data observability is a worthy goal for IT: to be proactive and automatically fix things that aren’t working, look anomalous or suspicious, or could cause a disastrous outcome.
Such outcomes could include a network failure, a security breach, a server reaching capacity, or in the unstructured data management world — something else entirely.
Komprise COO Krishna Subramanian talks about what data observability means in unstructured data management and how you can start to generate this intelligence.
What do we mean by unstructured data observability?
KS: People managing unstructured data don’t often think about observability. They’re simply trying to maintain high performance file data storage systems and control costs. They prefer not to hear from users (much less executives and department heads) that access time is sluggish or their data seems to be somehow ‘missing.’
A data observability practice can improve performance and protect data for long-term needs:
- Data observability is more than monitoring and alerts when it comes to unstructured data.
- It can provide a complete view of the files in an organization across storage.
- Metrics can show IT how fast data is growing, and alert them to any out-of-ordinary data storage and access patterns.
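The growth-rate metric above can be sketched in a few lines of Python. This is an illustrative sketch, not a vendor implementation: the snapshot values and the five-times-median threshold are hypothetical assumptions chosen to make sudden jumps stand out against a robust baseline.

```python
from statistics import median

def flag_growth_spikes(daily_gb, factor=5.0):
    """Return snapshot indices whose day-over-day growth exceeds
    `factor` times the median daily growth (a robust baseline)."""
    deltas = [b - a for a, b in zip(daily_gb, daily_gb[1:])]
    baseline = median(deltas)
    return [i + 1 for i, d in enumerate(deltas) if d > factor * baseline]

# Hypothetical week of capacity snapshots (GB): steady ~10 GB/day growth,
# then a sudden 259 GB jump on day 5.
snapshots = [1000, 1010, 1021, 1030, 1041, 1300, 1310]
print(flag_growth_spikes(snapshots))  # prints [5]
```

The median is used rather than the mean so that one large spike does not inflate the baseline it is being compared against.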
How does unstructured data observability benefit IT teams?
KS: This visibility gives organizations the means to reactively solve problems and, hopefully, proactively prevent future problems from occurring. Unstructured data observability with analytics and reporting can also increase collaboration across teams, improve planning, and help troubleshoot issues faster and more efficiently.
Unstructured data observability gives critical insights such as sensitive data being stored out of compliance. It may be useful to integrate data observability tools with IT service management software such as ServiceNow and Splunk.
Which data observability metrics and findings should IT be tracking?
KS: This list is bound to grow with AI requirements, but here are a few points you can track on your unstructured data:
- Data growth rates
- Top file types and whether they change or grow suddenly
- Top data owners and whether they change or grow suddenly
- Zombie data amounts and changes
- Orphaned data amounts and changes
- Storage capacity metrics
- Data access speeds
- Percent hot data
- Percent cold data
- Percent data on source
- Percent of data modified, moved or new, and free versus full capacity, per storage system
- Sensitive data
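To make the hot and cold percentages above concrete, here is a minimal Python sketch. The 90-day "hot" and one-year "cold" cutoffs are illustrative assumptions, and in practice the (size, age) pairs would come from a metadata scan of your file shares rather than hard-coded values.

```python
def hot_cold_breakdown(files, hot_days=90, cold_days=365):
    """Given (size_gb, days_since_last_access) pairs, return the percent
    of capacity that is hot (accessed within `hot_days`) and cold
    (untouched for more than `cold_days`)."""
    total = sum(size for size, _ in files) or 1
    hot = sum(size for size, age in files if age <= hot_days)
    cold = sum(size for size, age in files if age > cold_days)
    return round(100 * hot / total, 1), round(100 * cold / total, 1)

# Hypothetical scan results: 40 GB recently used, 10 GB warm, 50 GB stale.
print(hot_cold_breakdown([(40, 10), (10, 200), (50, 800)]))  # (40.0, 50.0)
```

Measuring by capacity rather than file count matters here: a few large cold files often dominate storage cost even when most files by count are small and active.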
What emerging technology trends are shaping the future of unstructured data observability?
KS: AI and data governance are two major areas that can aid unstructured data observability.
- AI can help by spotting anomalies and trends faster.
- Scanning and filtering tools can enrich the contextual information about your file and object data across hybrid cloud storage.
- Richer metadata can improve unstructured data observability by providing more dimensions for analytics, making it easier to spot trends, anomalies and issues.
- For instance, metadata tags for PII or IP can show if regulated data is being stored according to compliance rules.
- Unusual activity such as a large amount of deletes or a user’s personal folders spiking quickly in size can indicate a security or compliance incident.
- Unstructured data management solutions allow IT users to drill down into directories and file shares to investigate alerts coming from IT monitoring and cybersecurity systems.
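The unusual-activity signals described above can be sketched as a simple comparison of two per-folder capacity snapshots. This is a hedged illustration, not any vendor's detection algorithm: the doubling and halving thresholds and the folder names are hypothetical.

```python
def flag_folder_anomalies(before_gb, after_gb, grow=2.0, shrink=0.5):
    """Compare two per-folder capacity snapshots (folder -> GB) and flag
    folders that at least doubled (possible rogue copy or staging for
    exfiltration) or lost over half their data (possible mass delete or
    ransomware encryption). Thresholds are illustrative assumptions."""
    alerts = {}
    for folder, old in before_gb.items():
        new = after_gb.get(folder, 0.0)
        if old and new / old >= grow:
            alerts[folder] = "sudden growth"
        elif old and new / old <= shrink:
            alerts[folder] = "mass delete"
    return alerts

before = {"/home/ana": 10, "/home/bob": 10, "/home/cho": 10}
after = {"/home/ana": 25, "/home/bob": 4, "/home/cho": 11}
print(flag_folder_anomalies(before, after))
# {'/home/ana': 'sudden growth', '/home/bob': 'mass delete'}
```

An alert like this would be the trigger for the drill-down investigation described above, or could be forwarded to an ITSM or SIEM tool.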
Does data observability help with ransomware defense?
KS: Absolutely. With more insights on data in storage, IT teams can make better decisions for its management, such as reducing the on-premises attack surface. For instance, if you can see that 65 percent of the organization’s data is ‘cold’ and hasn’t been touched in more than two years, you can move that data to object storage in the cloud.
Now it’s out of the data center attack surface. Further, if stored in immutable object storage, this cold data is protected as ransomware actors cannot modify or delete it.
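The cold-data calculation behind this decision can be sketched in Python. The two-year cutoff matches the example above; the file inventory and the helper name are hypothetical, and a real inventory would come from scanning file share metadata.

```python
def tiering_candidates(files, cold_days=730):
    """Given (path, size_gb, days_since_last_access) tuples, return the
    percent of capacity that is cold (untouched for over `cold_days`,
    i.e. two years) plus the paths eligible to move to immutable cloud
    object storage."""
    total = sum(size for _, size, _ in files) or 1
    cold = [(path, size) for path, size, age in files if age > cold_days]
    pct = round(100 * sum(s for _, s in cold) / total, 1)
    return pct, [p for p, _ in cold]

# Hypothetical inventory sized so 65 percent of capacity is cold,
# matching the example in the answer above.
inventory = [("/share/a.mp4", 40, 900),
             ("/share/b.doc", 35, 30),
             ("/share/c.bak", 25, 1200)]
print(tiering_candidates(inventory))  # (65.0, ['/share/a.mp4', '/share/c.bak'])
```
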
What are best practices for implementing a data observability strategy within complex, hybrid environments?
KS: Many organizations treat data observability as a modern moniker for data monitoring. If they instead viewed unstructured data observability as a broader practice encompassing data analytics and reporting that feed into actionable unstructured data management, observability becomes a much richer function. It can help resolve issues faster and proactively enrich the value of the data that is growing the fastest and costing you the most.
