Data Management Glossary
Unstructured Data Management
What is Unstructured Data Management?
Unstructured Data Management is a category of software that has emerged to address the explosive growth of unstructured data in the enterprise and the modern reality of hybrid cloud storage. Data storage and data backup vendors increasingly recognize its importance: data outlives the infrastructure it is stored on, and data mobility matters more as more data migrates to cloud storage.
- Goes Beyond Storage Efficiency
- Must be Multi-Directional
- Doesn’t Disrupt Users and Workflows
- Should Create New Uses for Your Data
- Puts Your Data First
In August 2021, Komprise published the State of Unstructured Data Management Report:
Highlights of the Unstructured Data Management Report
Unstructured Data is Growing, as are its Costs
- 65.5% of organizations spend more than 30% of their IT budgets on data storage and data management.
- Most (62.5%) will spend more on storage in 2021 versus 2020.
Getting More Unstructured Data to the Cloud is a Key Priority
- 50% of enterprises have data stored in a mix of on-premises and cloud-based storage.
- Top priorities for cloud data management include migrating data to the cloud (56%), cutting storage and data costs (46%), and governance and security of data in the cloud (41%).
IT Leaders Want Visibility First Before Investing in More Data Storage
- Investing in analytics tools was the highest priority (45%), ahead of buying more cloud or on-premises storage or modernizing backups.
- One-third of enterprises acknowledge that over 50% of their data is cold, while 20% don’t know how much is cold, suggesting a need to right-place data through its lifecycle.
Unstructured Data Management Goals & Challenges: Visibility, Cost Management and Data Lakes
- 44.9% wish to avoid rising costs.
- 44.5% want better visibility for planning.
- 42% are interested in tagging data for future use and enabling data lakes.
2022 State of Unstructured Data Management Report
In August 2022, Komprise published the second annual State of Unstructured Data Management Report: Komprise Survey Finds 65% of Enterprise IT Leaders are Investing in Unstructured Data Analytics. The top five trends from the report are:
- User Self-Service: In data management, self-service typically refers to the ability for authorized users outside of storage disciplines to search, tag, enrich and act on data through automation, such as a research scientist wanting to continuously export project files to a cloud analytics service.
- Moving Data to Analytics Platforms: A majority (65%) of organizations plan to or are already delivering unstructured data to their big data analytics platforms.
- Cloud File Storage Gains Favor: Cloud NAS topped the list for storage investments in the next year (47%).
- User Expectations Beg Attention: Organizations want to move data without disrupting users and applications (42%).
- IT and Storage Directors want Flexibility: A top goal for unstructured data management (42%) is to adopt new storage and cloud technologies without incurring extra licensing penalties and costs, such as cloud egress fees.
In a 2022 interview, Komprise co-founder and COO Krishna Subramanian defined unstructured data this way:
Unstructured data is any data that doesn’t fit neatly into a database, and isn’t really structured in rows and columns. So every photo on your phone, every X-ray, every MRI scan, every genome sequence, all the data generated by self-driving cars – all of that is unstructured data. And perhaps more relevant to more businesses, artificial intelligence (AI) and machine learning (ML) – they depend on, and usually output, unstructured data too.
Unstructured data is growing every day at a truly astonishing rate. Today, 85% of the world’s data is unstructured data.
And it’s more than doubling, every two years.
In part two of the interview, she noted:
Unstructured data doesn’t have a common structure. But it does have something called metadata. So every time you take a picture on your phone, there’s certain information that the phone captures, like the time of day, the location where the picture was taken, and if you tag it as a favorite, it’ll have that metadata tag on it too. It might know who’s in the photo, there are certain metadata that are kept.
All filing systems store some metadata about the data. A product like Komprise Intelligent Data Management has a distributed way to search across all the different environments where you’ve stored data, and create a global index of all that metadata around the data. And that in itself is a difficult problem, because again, unstructured data is so huge. A petabyte of data might be a few billion files, and a lot of these customers are dealing with tens to hundreds of petabytes.
So you need a system that can create an efficient index of hundreds of billions of files that could be distributed in different places. You can’t use a database, you have to have a distributed index, and that’s the technology we use under the hood, but we optimize it for this use case. So you create a global index.
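To make the idea of a metadata index concrete, here is a minimal Python sketch. It is a single-machine, in-memory toy and not a description of how Komprise or any other product works: a real system at the scale described above would need a distributed index. The names `FileRecord`, `build_index` and `cold_files` are hypothetical, and using the last-modified time as a proxy for "cold" data is an assumption made only for illustration.

```python
import os
import time
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class FileRecord:
    """Metadata for one file. The file's contents are never read."""
    path: str
    size_bytes: int
    modified: float  # seconds since the epoch, taken from st_mtime
    extension: str   # lower-cased suffix such as ".txt", or "" if none


def build_index(roots):
    """Walk each root directory and collect one FileRecord per file."""
    records = []
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                full_path = os.path.join(dirpath, name)
                try:
                    info = os.stat(full_path)
                except OSError:
                    # The file vanished or cannot be read; skip it.
                    continue
                records.append(
                    FileRecord(
                        path=full_path,
                        size_bytes=info.st_size,
                        modified=info.st_mtime,
                        extension=Path(name).suffix.lower(),
                    )
                )
    return records


def cold_files(records, older_than_days, now=None):
    """Return the records not modified within the last `older_than_days` days."""
    now = time.time() if now is None else now
    cutoff = now - older_than_days * 86400  # 86400 seconds per day
    return [r for r in records if r.modified < cutoff]
```

A call such as `cold_files(build_index(["/mnt/share_a", "/mnt/share_b"]), 365)` (both paths are placeholders) would list files across two storage locations that have not been modified in a year. Because only metadata is collected, the same records could also answer questions about file types or total size per location without opening any file.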
The Future of Unstructured Data Management
In an end-of-year blog post, Komprise executives review unstructured data management and data storage predictions for 2023, including the implications of adopting data services, processing data at the edge, multi-cloud challenges, the importance of smart data migration strategies, and more.