The State of Unstructured Data Management 2024
Enterprise IT Builds AI Infrastructure on a Budget
The fourth annual survey finds that most enterprises (70%) are still experimenting with AI, and “preparing for AI” remains a top data storage and data management priority for IT leaders. Yet leaders said that cost optimization is an even higher priority this year, and they are trying to fit AI into existing IT budgets. Only 30% say they will increase their IT budgets to support AI projects.
Unstructured data management highlights:
- Top data storage priority: Cost Optimization
- 57% say preparing for AI is the top business challenge for unstructured data management
- 44% are creating AI-ready infrastructure and 32% are building their own learning models
- 47% say AI data governance/security is the top future capability, up from 28% in 2023
Download the report today to understand the primary unstructured data management challenges and opportunities to deliver greater cost savings and data value.
This report summarizes the responses of 300 global enterprise storage IT directors, VPs and C-level executives at companies with more than 1,000 employees in the United States.
What did the Komprise 2024 State of Unstructured Data Management report find about enterprise AI readiness, and how does the picture look two years later?
The 2024 report — the fourth annual Komprise survey of 300 enterprise storage IT directors, VPs, and C-level executives — captured a pivotal moment: most organizations (70%) were still experimenting with AI and preparing for AI remained a top priority, yet only 30% said they would increase their IT budget to support AI projects. Two years on, that cautious posture has been overtaken by urgency. The contrast between 2024 and 2026 tells the story of how fast this market has moved:
- Budget commitment grew by a third: 40% of IT leaders will increase their IT budget to pay for AI in 2026, compared to 30% in 2024, a rise of 10 percentage points; the experiment phase is over and production deployment has begun
- Data volumes surged: nearly 50% of enterprises were storing more than 5PB of unstructured data in 2024; by 2026 that figure had risen to 74%, a relative increase of roughly 57%
- The top challenge shifted from preparation to execution — in 2024, preparing for AI was the top business challenge; by 2026, the top challenge was reducing data risk from AI — the conversation moved from “how do we get ready” to “how do we manage the consequences”
- Classification urgency rose sharply: data classification and tagging was the second-leading challenge in prepping data for AI in 2024 at 41%; by 2026 it had become the top challenge at 56%, a rise of 15 percentage points; the data preparation problem did not get solved, it got bigger
- The 2026 report is the essential update — for organizations using the 2024 findings to inform their strategy, the 2026 State of Unstructured Data Management report reflects how dramatically the landscape has shifted and should be the current planning reference; download it at komprise.com
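The year-over-year comparisons above mix two kinds of change: percentage-point differences and relative increases. The short Python sketch below recomputes both from the survey figures quoted in this section. The function names are illustrative only, and the baseline it derives for the 5PB figure is an inference from the quoted "57% increase", not a number stated in the report.

```python
def compare(before_pct: float, after_pct: float) -> tuple[float, float]:
    """Return (percentage-point change, relative change in percent)."""
    point_change = after_pct - before_pct
    relative_change = point_change / before_pct * 100
    return point_change, relative_change


def implied_baseline(after_pct: float, relative_increase_pct: float) -> float:
    """Work backwards from a final share and a relative increase to the starting share."""
    return after_pct / (1 + relative_increase_pct / 100)


# Survey figures quoted in the list above: (earlier share, 2026 share).
figures = {
    "will increase IT budget for AI": (30, 40),
    "classification and tagging challenge": (41, 56),
}

for label, (before, after) in figures.items():
    points, relative = compare(before, after)
    print(f"{label}: +{points:.0f} points, {relative:.1f}% relative increase")

# A 57% relative increase ending at 74% implies a starting share of about 47%,
# which is consistent with the "nearly 50%" wording.
print(f"implied baseline for the 5PB figure: {implied_baseline(74, 57):.1f}%")
```

Read this way, a move from 30% to 40% is a 10-point rise and a relative increase of about one third.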
What did the 2024 report reveal about storage costs, and why is the cost pressure even more acute in 2026?
Most organizations spend at least 30% of their IT budget on data storage, according to the 2024 report, with these costs growing and becoming unsustainable. The 2024 survey documented a cost trajectory that has since been compounded by forces nobody fully anticipated at the time:
- The flash price shock: IDC characterizes today’s memory constraints as more than a temporary shortage, pointing instead to a long-term shift in global silicon wafer allocation. As a result, DRAM and NAND supply growth in 2026 is projected to stay well below historical levels. Enterprises that already dedicate 30% or more of their IT budgets to storage therefore face rising hardware costs alongside relentless data growth.
- Storage spend is still rising — 85% of IT and data storage leaders are projecting an increase in data storage spend in 2026; the cost problem the 2024 report documented has not been solved — it has accelerated
- The multiplier effect — the 2024 report highlighted that storage costs are compounded by backup and DR replication; every petabyte of unstructured data on primary NAS is backed up identically, multiplying the true cost by 3x to 4x; this multiplier has grown more expensive as flash prices have risen
- Intelligent tiering is the fastest path to relief — for the fourth year in a row, IT directors said they would spend more on storage than the previous year; the only way to break this cycle is to actively move cold data off expensive primary storage through intelligent tiering rather than buying more capacity; the Komprise Flash Stretch Assessment helps qualified enterprises managing 500TB or more model exactly how much they could save before making any commitment
- Cost optimization remains the top storage priority — the top data storage priorities for the next year are cost optimization (64%), data preparation and classification for AI (61%), and cloud migration (54%); what changed from 2024 is that AI data preparation now sits directly alongside cost optimization, because the same ungoverned data estate that drives cost bloat is also the obstacle to AI readiness
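The multiplier and tiering arguments above can be made concrete with a small cost model. The following Python sketch is a simplified illustration, not a Komprise tool or the Flash Stretch Assessment itself: the per-terabyte prices, the cold-data fraction, and the assumption that tiered data no longer incurs backup and DR copies on primary storage are all hypothetical inputs. Only the 3x to 4x multiplier range comes from the text.

```python
def annual_cost(capacity_tb: float, cost_per_tb: float, copy_multiplier: float) -> float:
    """Cost of data on primary storage, including backup and DR copies."""
    return capacity_tb * cost_per_tb * copy_multiplier


def tiering_savings(
    capacity_tb: float,
    cold_fraction: float,         # hypothetical share of data that is rarely accessed
    primary_cost_per_tb: float,   # hypothetical annual cost per TB on primary NAS
    archive_cost_per_tb: float,   # hypothetical annual cost per TB on a cheaper tier
    copy_multiplier: float = 3.0,  # the text cites a 3x to 4x multiplier
) -> float:
    """Estimate annual savings from moving cold data off primary storage.

    Assumes tiered data is stored once on the cheaper tier and is no longer
    multiplied by backup and DR copies.
    """
    before = annual_cost(capacity_tb, primary_cost_per_tb, copy_multiplier)
    hot_tb = capacity_tb * (1 - cold_fraction)
    cold_tb = capacity_tb * cold_fraction
    after = annual_cost(hot_tb, primary_cost_per_tb, copy_multiplier) + cold_tb * archive_cost_per_tb
    return before - after


# Example with made-up inputs: 1,000 TB, half of it cold.
savings = tiering_savings(
    capacity_tb=1000,
    cold_fraction=0.5,
    primary_cost_per_tb=100,
    archive_cost_per_tb=20,
    copy_multiplier=3.0,
)
print(f"estimated annual savings: {savings:,.0f}")
```

Because the copy multiplier applies to every terabyte left on primary storage, the modeled savings grow with both the cold fraction and the multiplier, which is the point the list above makes about backup and DR replication.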
What did the 2024 report reveal about data governance and security gaps, and how has the risk landscape changed?
AI data governance and security was the top future capability cited in the 2024 report at 47%, up from just 28% in 2023 — the sharpest year-over-year jump in the survey. What the 2024 report identified as a future concern has become a present-tense crisis. The trajectory from 2024 to 2026:
- Only 13% restricted corporate data in AI in 2024 — only 13% restricted what corporate data could be used in AI, while 31% had no restrictions for users, apps, or data in AI; this is the shadow AI problem in raw numbers — most enterprises in 2024 had essentially open access policies for AI tools
- The consequences arrived in 2025 — nearly 80% of IT leaders now say their organization has experienced negative outcomes from employee use of generative AI, including leaking of sensitive data into AI tools at 44%; 13% report that these incidents resulted in financial, customer, or reputational damage
- The top business challenge flipped — in 2024 the top business challenge was preparing for AI; by 2026 it had shifted to reducing data risk from AI; the risk is no longer theoretical
- Security concern is near-universal — 90% of IT leaders are now concerned about shadow AI from a privacy and security standpoint; the 13% who restricted AI access in 2024 were ahead of their peers; the question now is not whether to govern AI data access but how to do it at petabyte scale across distributed storage environments
- Komprise addresses this directly as the metadata and orchestration layer for enterprise unstructured AI data: Sensitive Data Management, Deep Analytics, the Global Metadatabase, and Smart Data Workflows provide the classification, detection, remediation, and audit trail capabilities that the 2024 report identified as urgent future needs and that are now requirements
What were the top technical unstructured data management challenges in 2024 and have they been solved?
Moving data without disruption to users and applications was the top technical challenge in 2024 at 54%, a position it had held for two consecutive years; using AI to classify and segment data was the second-leading challenge at 48%. Neither challenge has been resolved. Both have intensified:
- Data movement disruption remains unsolved at scale — the challenge of moving petabytes of unstructured data without breaking applications, losing metadata, or forcing users to change workflows is fundamentally harder at 2026 data volumes than it was in 2024; Komprise Transparent Move Technology and Elastic Data Migration — which delivers 27x faster NFS and 25x faster SMB WAN performance via Hypertransfer — are the production-proven answer to both the reliability and the performance dimensions of this challenge
- AI classification went from aspiration to requirement — in 2024, using AI to classify and segment data was an emerging tactic; by 2026, classifying data for AI is the top technical challenge at 58%, cited more frequently than data movement; the challenge evolved from “how do we use AI to classify data” to “how do we classify data fast enough to feed AI”
- The Global Metadatabase closes the visibility gap — the 2024 report identified a persistent inability to see data across silos as a root cause of both movement and classification challenges; Komprise is the metadata and orchestration layer for enterprise unstructured AI data precisely because the Global Metadatabase provides the unified, continuously updated, cross-silo metadata index that makes both challenges tractable at petabyte scale
- KAPPA addresses the classification gap directly — KAPPA Data Services, launched in 2026 and included in Komprise Intelligent Data Management, allows IT teams to build custom metadata extraction functions in a few lines of Python that run across petabytes using serverless processing; this is the production answer to the “using AI to classify and segment data” challenge the 2024 report documented as the second-hardest problem in unstructured data management
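To give a sense of what a custom metadata extraction function "in a few lines of Python" might look like, here is a generic sketch in plain Python. It does not use the KAPPA API, whose actual interface is not described on this page; the function name `extract_tags`, the tag names, and the two regular expressions are hypothetical and deliberately simplistic.

```python
import re
from pathlib import PurePath

# Simplistic illustrative patterns; real sensitive-data detection needs far more care.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SSN_LIKE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def extract_tags(path: str, text: str) -> dict[str, object]:
    """Derive a few classification tags from a file's path and text content."""
    emails = EMAIL.findall(text)
    has_ssn_like = bool(SSN_LIKE.search(text))
    return {
        "extension": PurePath(path).suffix.lower(),
        "email_count": len(emails),
        "has_ssn_like": has_ssn_like,
        # Flag the file for review if either pattern appears.
        "sensitive": bool(emails) or has_ssn_like,
    }


tags = extract_tags("/share/hr/Report.TXT", "Contact jane@example.com, ID 123-45-6789")
print(tags)
```

In a real deployment, a function of this shape would be applied per file by whatever processing framework is in use, and the resulting tags stored in a metadata index so that governance and AI curation policies can query them.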
What did the 2024 report reveal about AI infrastructure investment, and what should IT leaders do differently in 2026?
The leading tactic to address AI data concerns in 2024 was to upgrade data storage and data management technologies, cited by 53%; nearly half (44%) were creating AI-ready infrastructure. In 2026, that investment has accelerated but the results have been uneven. The gap between organizations that invested strategically and those that invested reactively is now visible in outcomes:
- Infrastructure investment alone is not sufficient: buying better storage and more compute without addressing data governance, classification, and curation upstream produces AI pipelines fed with ungoverned, noisy, sensitive data; to meet security and AI requirements, 64% of IT leaders will invest in upgrading data storage and data management platforms, versus 53% in 2024, but the question is whether those investments include the metadata and orchestration layer, not just the storage hardware
- The 2024 skills gap has widened: AI data management was a top skills gap in 2024 at 43% of respondents; by 2026 it had grown to 62%, the fastest-widening skills gap in the survey; organizations that delayed building AI data management expertise in 2024 are now significantly behind
- The right sequence matters — the most effective AI infrastructure investments follow a clear order: gain visibility across all unstructured data silos first through the Global Metadatabase, then classify and govern the data estate, then tier cold data to free capacity and budget via a Flash Stretch Assessment, then automate AI data curation through Smart Data Workflows; organizations that reversed this sequence — buying AI tools before governing their data — are the ones experiencing the negative outcomes the 2026 survey documents
- The 2026 report is the current planning reference — the 2024 report set the context; the 2026 State of Unstructured Data Management captures where 300 enterprise IT leaders are now, what is working, what is not, and where investment is heading; for any organization using the 2024 findings to plan 2026 and 2027 strategy, the updated report is available at komprise.com and should replace the 2024 edition as the primary planning reference