We launched Komprise AI Days, a new small-format networking event, on April 2 in Boston. It was a great opportunity to connect with customers, partners and prospects and discuss what enterprise IT leaders are thinking about now when it comes to unstructured data and preparing for AI.
A highlight of the afternoon was a customer and partner panel that featured IT infrastructure leaders from Mass General Brigham, Yale University and DC Consulting. Here are the key trends that came out of the day:
Get a handle on the massive unstructured data estate
With many enterprises now storing 20, 30 or even 50 PB of data, everyone in the room agreed that data cannot all be treated the same and it cannot all be active. Avoid unnecessary costs and ransomware risks by offloading cold data: set up policies that automate transparent tiering to lower-cost storage, and deploy a chargeback (or showback) model that gives departments and users reports showing how much data they’re using, how old it is, and who the largest consumers are.
Don’t buy more storage without getting analysis across silos to know how frequently data is being accessed and the costs for storing cold data.
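To make the showback idea concrete, here is a minimal sketch of a per-department report. It is an illustration only and not how Komprise produces its reports: the `FileRecord` fields, the `showback_report` function and the 365-day cold threshold are all assumptions chosen for the example.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class FileRecord:
    """Hypothetical file metadata gathered from a scan across storage silos."""
    department: str
    size_bytes: int
    last_accessed: date


def showback_report(records, today, cold_after_days=365):
    """Return total and cold bytes per department, largest consumers first.

    cold_after_days is an assumed policy threshold, not an industry standard.
    """
    cutoff = today - timedelta(days=cold_after_days)
    totals = defaultdict(lambda: {"total_bytes": 0, "cold_bytes": 0})
    for rec in records:
        row = totals[rec.department]
        row["total_bytes"] += rec.size_bytes
        # A file counts as cold if nobody has read it since the cutoff date.
        if rec.last_accessed < cutoff:
            row["cold_bytes"] += rec.size_bytes
    return sorted(
        ({"department": dept, **row} for dept, row in totals.items()),
        key=lambda r: r["total_bytes"],
        reverse=True,
    )


# Made-up sample data for illustration.
sample = [
    FileRecord("genomics", 100, date(2020, 1, 1)),
    FileRecord("genomics", 50, date(2025, 3, 1)),
    FileRecord("radiology", 30, date(2025, 3, 20)),
]
report = showback_report(sample, today=date(2025, 4, 2))
```

Each row in `report` shows how much a department stores and how much of that is cold, which is the kind of evidence worth having before a storage purchase.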
Storage pros are getting involved with AI
Search-oriented AI bots or agents have emerged as the first use case, and because AI doesn’t do anything without data, storage teams have been tasked with delivering the proper data to these new AI services. One requirement is data classification that restricts what the AI bot can access. As research teams build out their models, IT’s role is to segregate and deliver the data services they need. While data owners will remain responsible for their own data, storage teams are recognizing they will play a bigger role in providing the appropriate guardrails for responsible use of data with AI, such as by automating data classification and tagging sensitive data.
Researchers need help from IT to curate the right data for their projects.
Metadata tagging came up frequently, especially as a requirement when working with research teams. Tagging will be important to avoid repetitive research processes, manage unstructured data more efficiently, improve the AI experience and lower costs. For example, organizations may want to be able to tag data as internal, high risk, or for public consumption. Read how Komprise supports data tagging for AI.
Self-service data tagging will become more prevalent.
Given that data storage teams don’t know the data beyond attributes like access time, growth and cost, IT needs to provide tools for end-user organizations to tag their own data for classification and to avoid repetitive search and curation processes. Komprise COO Krishna Subramanian discussed the Deep Analytics role in Komprise Intelligent Data Management. IT can set up users who can search across only the data they have permission to access and then tag their own data based on, for example:
- Grant information
- Project information
- Whether a project is active or cold
Learn more about Komprise Deep Analytics.
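The self-service pattern described above can be sketched in a few lines. This is a toy, in-memory illustration and not the Komprise Deep Analytics API: `FileEntry`, `TagCatalog`, the group names and the tag keys are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class FileEntry:
    """Hypothetical index entry: a path, who may see it, and user-applied tags."""
    path: str
    allowed_groups: frozenset
    tags: dict = field(default_factory=dict)


class TagCatalog:
    """Toy metadata catalog in which users only see and tag data they can access."""

    def __init__(self, entries):
        self._entries = {e.path: e for e in entries}

    def search(self, user_groups, path_contains=""):
        """Return entries visible to any of the user's groups."""
        groups = set(user_groups)
        return [
            e for e in self._entries.values()
            if e.allowed_groups & groups and path_contains in e.path
        ]

    def tag(self, user_groups, path, **tags):
        """Apply tags to one entry, refusing paths the user cannot access.

        Unknown paths raise the same error so the catalog does not reveal
        whether a file exists outside the user's permissions.
        """
        entry = self._entries.get(path)
        if entry is None or not (entry.allowed_groups & set(user_groups)):
            raise PermissionError(f"no access to {path}")
        entry.tags.update(tags)
        return entry


# Made-up sample data for illustration.
catalog = TagCatalog([
    FileEntry("/labs/a/run1.bam", frozenset({"lab-a"})),
    FileEntry("/labs/b/run2.bam", frozenset({"lab-b"})),
])
visible = catalog.search({"lab-a"})
tagged = catalog.tag({"lab-a"}, "/labs/a/run1.bam",
                     grant="example-grant", project="example-project", status="cold")
```

The design point is that the permission check sits in front of both search and tagging, so researchers can classify their own data without IT having to understand its content.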
Meeting internal demands for AI with costs in mind.
Storage leaders said they are looking to maximize the use of high-performance infrastructure economically so researchers can balance resources and “so everything isn’t custom.” One IT leader mentioned standardization: “Genomics is a good example. Once a data set is created, multiple researchers and labs will create copies from it. How do you find that data and clean it up so it doesn’t consume expensive disk space? How does the IT infrastructure team provide them with a central source that they can process?”
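One simple way to start answering the "how do you find that data" question is to look for byte-identical copies. The sketch below is an assumption-laden illustration rather than anything the panelists described: `find_duplicates` and `file_digest` are hypothetical helpers, and the approach only catches exact copies, not data sets that have been modified or reformatted.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Return groups of paths under root whose contents are byte-identical."""
    # Group by size first: files of different sizes cannot be identical,
    # so only same-size candidates need to be hashed.
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    by_digest = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        for path in paths:
            by_digest[file_digest(path)].append(path)

    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]
```

Each returned group is a candidate for consolidation into a single central copy, subject to review with the data owners.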
AI requires flexibility, rapid data mobility.
“This is our first foray of moving semi-live data to the cloud, and we need to figure out what characteristics to maintain as if on-premises, as we waterfall data from one tier to another tier,” said an IT director. “Everyone is talking about a deep archive. We instead chose a cool storage layer so that the user experience wouldn’t change.” Having the data near the compute is a top priority along with moving data quickly between hybrid cloud storage environments and across teams.
Communicate frequently with departments during times of rapid change.
Start with your “friendlies.” Leverage grassroots relationships in the organization, people with whom you already have good working relationships, when introducing change or new strategies. Initiate annual introductions of the IT infrastructure and storage team to departments and let them know what’s new and what’s on the roadmap. Make sure they know what you have to offer and understand what their workload will be to avoid surprises. You don’t want to risk a ransomware incident when a department director puts a server under their desk.
Managing shadow IT and shadow AI is the work of diplomacy.
“We tried for years to consolidate between central IT and shadow IT groups,” a CTO said. “Those groups were set up to meet specialized needs and it’s hard to let them go. We consolidated cloud services but many teams run their own backups and maintain smaller storage systems.” Attendees agreed that a consolidation strategy is only successful when it is part of the mission and comes from the CIO, rather than being a grassroots initiative.
IT can further its case by being competitive internally. “Enterprise IT needs to prove that they are the gold standard in terms of technology and service. Ultimately the goal is to free up time and budget so departments, divisions and lines of business can focus on what they do best. Let’s give them our subject matter experts to do it for them so they think of us first before going out and doing it on their own.”
Our next AI Days: Unstructured Ignition event was in Houston on May 13th. Thanks to all of our customers, partners and prospective customers for making these events great. You can find a few posts from the Houston event here and here.