Top 5 Priorities for Unstructured Data Management in 2025

As a CEO and cofounder, I really enjoy this time of year to reflect upon the past 12 months and do some thinking about the year ahead. Malcolm Gladwell’s famous tome, The Tipping Point, comes to mind. Gladwell defines a tipping point as “the moment of critical mass, the threshold, the boiling point.”

Today I am seeing three tipping points that are irrevocably changing enterprise IT and our world as we know it:

  • The mass availability of AI. The tipping point for that was, of course, ChatGPT’s launch in the fall of 2022. As AI expands in maturity and applications, many organizations are finding that they are unprepared to safely and ethically launch these tools to the workforce.
  • Unstructured data growth, which is a blessing and a curse. The blessing is that we have the volume and diversity of data to generate new intelligence like we’ve never seen before. The curse is that this data is stretching the limits of IT budgets to store, back up, corral for re-use and protect from cyber-attacks.
  • The rising importance of the right infrastructure at the right time. AI needs specialized GPU compute and storage which are scarce. Datacenters are running out of power and infrastructure costs are rising, so cost optimization remains a top CIO priority. Climate disasters and cyberattacks are becoming more deadly – calling for more data protection. Analyzing and right-placing data across silos of infrastructure will be a proactive strategy for the foreseeable future.

These three tipping points are also linked. AI depends upon easy access to unstructured data – large quantities of it. Unstructured data growth can be monetized to help offset the costs of storing and protecting it, if we determine how to use it safely and efficiently in AI. A flexible, hybrid cloud infrastructure with systems to intelligently move, manage and monitor data as needs change is the foundation for both unstructured data and AI.

Komprise was founded in 2014 to address the challenges of uncontrolled unstructured data with a storage-agnostic, unstructured data management SaaS. As our company has evolved and customer needs have evolved with it, we are now seeing many more use cases for managing unstructured data strategically and with intelligence. Recently, Gartner published a report on the need for data storage management services (DSMS) to optimize storage, improve unstructured data governance, and reduce unstructured data risks.

The Gartner report notes that by 2028, large enterprises will triple their unstructured data capacity across their on-premises, edge and public cloud locations. Independent unstructured data management solutions are also imperative for developing and managing successful, sustainable enterprise AI initiatives.

Here are five top priorities for managing unstructured data that I believe CIOs, CTOs, VPs of infrastructure and storage directors should consider in the coming year.

1. Prepare your Data for the AI Tsunami.

Like it or not, AI is now part of every IT organization’s strategic plan. Even if your organization is not planning to launch any AI applications internally or externally for customers, chances are, the tools you’re using to run your business are enhanced with AI. Your employees are using GenAI tools to get work done. Therefore, you need to be prepared. That means developing an AI data governance plan. Get started by gathering deep intelligence on your unstructured data across storage silos. Consider an unstructured data management platform that indexes all your data so that you always know how much you have and how fast it is growing, what is valuable and what is not (based on access patterns), what data needs to be protected from AI and what data is obscure and can be tagged for additional classification. These efforts will make your data more usable, more searchable, and protect against unnecessary risks. Read more on metadata tagging with Komprise.


2. Ingest Data to AI with Automated Workflows.

We have entered the next phase of AI: from model training to retrieval augmented generation (RAG) and inferencing. Users need easy ways to search across corporate data stores, find the right data sets (which in turn requires data classification, per above), exclude sensitive data and move the right data to AI tools efficiently, without losing data or incurring lengthy delays. It’s critical to monitor and audit AI outcomes for accuracy and sensitive data leakage. Komprise Elastic Data Migration moves data 25x faster than common data migration tools with built-in risk assessment and troubleshooting capabilities. And Komprise Smart Data Workflows is a simple technology for setting up and auditing automated AI data pipelines.
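To make the selection step concrete, here is a minimal Python sketch of choosing a data set for an AI tool while holding back sensitive files and keeping an audit trail. It is an illustration only and not the Komprise API: the `FileRecord` type, the tag names and the `select_for_ai` function are all hypothetical, and it assumes files have already been classified with tags.

```python
from dataclasses import dataclass, field


@dataclass
class FileRecord:
    """Hypothetical index entry for one file; not a Komprise data structure."""
    path: str
    tags: set = field(default_factory=set)


def select_for_ai(records, required_tag, blocked_tags):
    """Split records into those safe to send to an AI tool and those held back.

    Returns (selected, audit_log). The audit log holds one decision per
    file so the pipeline can be reviewed later.
    """
    selected = []
    audit_log = []
    for rec in records:
        if required_tag not in rec.tags:
            audit_log.append((rec.path, "skipped: not in requested data set"))
        elif rec.tags & blocked_tags:
            audit_log.append((rec.path, "excluded: sensitive tag present"))
        else:
            selected.append(rec)
            audit_log.append((rec.path, "selected"))
    return selected, audit_log


# Example data with made-up paths and tags.
records = [
    FileRecord("/proj/a/report.docx", {"project-a"}),
    FileRecord("/proj/a/payroll.xlsx", {"project-a", "pii"}),
    FileRecord("/proj/b/notes.txt", {"project-b"}),
]
selected, audit_log = select_for_ai(records, "project-a", {"pii", "confidential"})
for path, decision in audit_log:
    print(f"{path}: {decision}")
```

Recording a decision for every file, including the ones that were skipped, is what makes the workflow auditable after the fact.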


3. Protect Against Ransomware and Sensitive Data Leakage.

IT managers in charge of data storage know that they must work closely with security teams to protect data for the organization. Increasingly, security is integrated into the entire infrastructure stack including data storage technologies. Storage managers can do more by reducing the large attack surface of their file data, which makes it highly vulnerable to ransomware attacks. By moving cold, inactive data to immutable object storage where it cannot be modified, you reduce the ransomware attack surface by 80% or more – while saving that much or more on your overall storage and anti-ransomware budget. Read the Katten Law case study. Furthermore, you can use Komprise Smart Data Workflows to connect an AI data classification tool which can quickly locate and tag sensitive data and segregate it from AI workflows. Too often, IT leaders discover hidden, sensitive data sets residing on noncompliant storage where it is not adequately protected from cyber-attack. Read the blog for more tips on protecting unstructured data from ransomware.

4. Optimize Data Management for Costs and Sustainability. 

Since the early days, Komprise has delivered a strong value proposition for our customers across all industries: you can use our solution to save 70% or more on your annual data storage and backup spending. Within minutes of installing Komprise in your environment, our familiar “data donut” dashboard shows you all kinds of useful insights on your data estate – such as how much data you have, common file types, top owners, departments with the highest spend, and so on. See metrics such as last access time to determine how much data is more than a year old (or whatever parameter you set) and then set a policy to continuously and transparently tier data as it ages to lower-cost secondary storage. You can also use Komprise to dig deeper into your data: finding duplicate data, orphaned data, zombie data, and any other data which can be archived to the cloud or even deleted. These tactics cut storage and backup costs while reducing your carbon footprint. Read more about our non-disruptive approach to cold data tiering using our patented Transparent Move Technology.
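The idea of measuring cold data by last access time can be sketched in a few lines of Python using only the standard library. This is a simplified illustration, not how Komprise works internally: the `summarize_by_age` function and its 365-day default are assumptions, and access times can be unreliable on file systems mounted with options such as `noatime`.

```python
import os
import time
from collections import defaultdict

SECONDS_PER_DAY = 86400


def summarize_by_age(root, cold_after_days=365, now=None):
    """Walk `root` and total file sizes, split into hot and cold by last access time."""
    now = time.time() if now is None else now
    cutoff = now - cold_after_days * SECONDS_PER_DAY
    totals = {"hot_bytes": 0, "cold_bytes": 0}
    by_extension = defaultdict(int)
    cold_paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            ext = os.path.splitext(name)[1].lower() or "(none)"
            by_extension[ext] += info.st_size
            if info.st_atime < cutoff:
                # Not accessed since the cutoff: a candidate for tiering.
                totals["cold_bytes"] += info.st_size
                cold_paths.append(path)
            else:
                totals["hot_bytes"] += info.st_size
    return totals, dict(by_extension), cold_paths


# Example: summarize the current directory.
totals, by_extension, cold_paths = summarize_by_age(".")
print(totals)
```

The list of cold paths is what a tiering policy would act on; moving those files to lower-cost storage is deliberately left out of this sketch.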

5. Deliver Governed Self-Service to Non-IT Users.

In the age of self-service, IT and departmental users alike benefit from tools which allow them to search and access data, reports and analytics without filing a help desk ticket. For instance, researchers, scientists, and engineers may want to tag their own data (such as by project name or keyword) so they can easily find it later. IT can still be in control by deciding where the data should live for cost and performance needs while the departments dictate how the data is classified and who has access to which data sets. It is a wonderful collaboration between IT and business units – and Komprise can make this happen. Komprise also has useful reports such as Showback, so that department heads can see how they are being billed in chargeback environments. Learn more about our shareable reports.
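As a rough illustration of what a showback report computes, the Python sketch below rolls up storage usage by department and applies a flat monthly rate. The rate, the department names and the `showback` function are hypothetical placeholders chosen for the example; they do not describe the Komprise report or real pricing.

```python
from collections import defaultdict

# Assumed flat rate for illustration only; real storage pricing varies by tier.
COST_PER_TB_MONTH = 20.0
BYTES_PER_TB = 10**12


def showback(usage_records, cost_per_tb_month=COST_PER_TB_MONTH):
    """Aggregate (department, bytes) pairs into a monthly cost per department."""
    bytes_by_dept = defaultdict(int)
    for department, size_bytes in usage_records:
        bytes_by_dept[department] += size_bytes
    return {
        dept: round(total / BYTES_PER_TB * cost_per_tb_month, 2)
        for dept, total in sorted(bytes_by_dept.items())
    }


# Made-up usage figures for two departments.
usage = [
    ("genomics", 40 * BYTES_PER_TB),
    ("engineering", 15 * BYTES_PER_TB),
    ("genomics", 10 * BYTES_PER_TB),
]
report = showback(usage)
for dept, cost in report.items():
    print(f"{dept}: ${cost:,.2f} per month")
```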

As I look toward 2025 and what’s next in our industry, I’d like to thank Komprise customers, partners and employees for a fantastic 2024. It is clear to me that unstructured data management is at the intersection of enterprise IT infrastructure optimization, data governance and AI innovation.
