Artificial Intelligence Needs Unstructured Data: Are You Ready?


I’m sure I am not alone in saying that we often learn about the latest hot things from our kids. Just the other day, my 16-year-old son showed me an app that blew me away. The software, ChatGPT by OpenAI, responds to natural language requests to quickly create articles or answer complex questions. Other apps on the market today generate art, write code or troubleshoot software bugs. These are projects and challenges that can take people hours, days or weeks, yet the apps complete them in minutes with astonishing accuracy and relevance.

This kind of technological innovation, while obviously still nascent and experimental, is mind-boggling to say the least. There is much that we have yet to understand about the potential for AI and its impact on not only work and economic output but our personal lives. Below I’m going to share my observations on the future of unstructured data management in response to rapid AI progress.

But first, let me summarize where we’ve been. Komprise Intelligent Data Management is an enterprise game changer because of the savings that we bring to customers. On average, customers save 70% on storage, backups and cloud spend by managing file and object data more efficiently across hybrid cloud storage; Komprise achieves this through analytics, smart data migration and cloud tiering. But this is only the beginning. The enormous opportunity at hand is to fully leverage unstructured data for use in AI and ML engines.

Enterprises need to be ready for this wave of change and it starts by getting unstructured data prepped, as this data is the critical ingredient for AI/ML. This entails new data management strategies which create automated ways to index, segment, curate, tag and move unstructured data continuously to feed AI and ML tools. Unforeseen changes to society, fueled by AI, are coming soon and you don’t want to be caught flat-footed.

Komprise is helping customers modernize their data management infrastructure and strategy to take advantage of the AI/ML innovation landscape. Here are some fundamental requirements for taking your unstructured data to the next level with AI:

1. If you aren’t indexing your unstructured data today, that’s a problem.

A major barrier to data analytics is finding the precise data you need to mine. Most people in “data” jobs (data analysts, data scientists, researchers, marketers) spend most of their time looking for the data that will fit a project’s requirements. One of our customers told us how researchers at one location used to call colleagues at another to find the data they needed for experiments. This doesn’t scale. Data indexing is a powerful way to categorize all the unstructured data across your enterprise and make it searchable by key metadata such as file size, file extension, date of file creation, date of last access, and custom (user-created) metadata such as experiment name or instrument ID. Komprise is unique in the unstructured data management sector because of our Global File Index, which is created as soon as you connect our solution to the file and object storage systems across your total data estate. This gives central IT, departmental IT teams and data researchers the equivalent of Google Search across your enterprise.
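To make the idea of a metadata index concrete, here is a minimal Python sketch. It is a generic illustration only, not how the Komprise Global File Index is built: the `FileRecord`, `build_index` and `search` names are hypothetical, it covers a single local directory tree rather than a whole data estate, and it records modification time because a true creation date is not available on every platform.

```python
import os
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class FileRecord:
    """One index entry per file (hypothetical structure for illustration)."""
    path: str
    size_bytes: int
    extension: str
    modified: datetime       # stand-in for creation date, which is not portable
    last_access: datetime
    tags: dict = field(default_factory=dict)  # custom, user-created metadata


def build_index(root):
    """Walk a directory tree and collect one metadata record per file."""
    records = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            info = os.stat(full_path)
            records.append(FileRecord(
                path=full_path,
                size_bytes=info.st_size,
                extension=os.path.splitext(name)[1].lower(),
                modified=datetime.fromtimestamp(info.st_mtime, tz=timezone.utc),
                last_access=datetime.fromtimestamp(info.st_atime, tz=timezone.utc),
            ))
    return records


def search(records, extension=None, min_size=0, **tags):
    """Return records matching an extension, a minimum size and any custom tags."""
    return [
        r for r in records
        if (extension is None or r.extension == extension)
        and r.size_bytes >= min_size
        and all(r.tags.get(key) == value for key, value in tags.items())
    ]
```

With an index like this, a researcher could tag a record with something like `record.tags["experiment"] = "exp-42"` and later call `search(index, extension=".csv", experiment="exp-42")` instead of phoning a colleague at another site.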

2. Make new uses of unstructured data while still being cost-efficient.

Now that your data is indexed, users can find precisely the data sets they need and create policies to automate the movement of data in a query to the location of choice, such as a cloud data lake for AI analysis. Our May 2022 announcement of Smart Data Workflows demonstrated our commitment to automation and ease of use by delivering a simple way to connect the dots and deliver the right data to the right place (and to the right people or applications) for action. Imagine creating custom workflows that enrich and optimize your data. For example, Komprise can tag instrument data and automatically tier it to low-cost cloud storage as it is created. Cloud AI and ML tools can then ingest the data for analysis. Once the analysis is complete, Komprise can automatically move the data to a colder, cheaper tier. All of this happens automatically and at significantly lower cost to IT.
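The instrument-data example can be sketched as a small policy in Python. This is a simplified illustration under stated assumptions, not the Smart Data Workflows implementation: the `DataSet` class, the tier names (`primary`, `cloud-standard`, `cloud-cold`), the `source` tag and both functions are hypothetical, and the code only updates records in memory rather than moving any data.

```python
from dataclasses import dataclass, field


@dataclass
class DataSet:
    """A tracked data set (hypothetical structure for illustration)."""
    name: str
    tier: str                                 # "primary", "cloud-standard" or "cloud-cold"
    tags: dict = field(default_factory=dict)  # custom metadata, e.g. {"source": "instrument"}
    analyzed: bool = False                    # set once the AI/ML analysis has finished


def plan_moves(datasets):
    """Return (data set name, target tier) pairs under a simple two-rule policy."""
    moves = []
    for ds in datasets:
        is_instrument = ds.tags.get("source") == "instrument"
        if is_instrument and ds.tier == "primary":
            # Rule 1: tagged instrument data goes to low-cost cloud storage,
            # where cloud AI and ML tools can ingest it.
            moves.append((ds.name, "cloud-standard"))
        elif ds.analyzed and ds.tier == "cloud-standard":
            # Rule 2: once analysis is complete, move to a colder, cheaper tier.
            moves.append((ds.name, "cloud-cold"))
    return moves


def apply_moves(datasets, moves):
    """Update tiers in place; a real system would copy and verify the data first."""
    by_name = {ds.name: ds for ds in datasets}
    for name, target in moves:
        by_name[name].tier = target
```

Separating `plan_moves` from `apply_moves` is a deliberate choice in this sketch: the plan can be reviewed or logged before anything is changed, which is useful when IT and departments agree on policies together.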

3. Collaborate with departments on unstructured data needs.

Another critical piece of the puzzle is giving users and departments more insight into their data assets so they can work with IT on creating the best data management policies to support ongoing and future analytics initiatives. In October 2022, we announced new self-service features whereby central IT can authorize departmental end users to interactively monitor usage metrics and data trends, tag and search data, and identify datasets for analytics, tiering and deletion. This not only bridges the gap between IT and departments on data management decisions but also benefits both parties: IT meets savings and governance goals while departments regain control over the data they need to protect and mine for future value.

As the year 2022 reaches its end, predictions for 2023 show a need for caution and smart spending in a roller-coaster economic environment. IT organizations will need to institute further cost controls to stem wasteful spending and they will need to think more about sustainability in all their practices to cope with a global energy and supply chain crisis. They will need to do all of this while keeping their eyes on the prize: getting their data and data infrastructure ready for the AI age, which is just around the corner.


