“Moving data to the cloud can help you optimize your infrastructure, but the bigger value is in leveraging the compute power and data services in the cloud,” remarked Komprise co-founder and COO Krishna Subramanian in our second Cloud Field Day 14 presentation.
Subramanian defines unstructured data as any data we can access as a file or as an object that does not fit neatly into database rows and columns. These data sets are piling up in the data center, at the edge and in the cloud. A smart strategy for modernizing your data storage practice is an unstructured data management solution that looks across all your sources of unstructured data, provides an analytical view of that data, mobilizes it, and lets users index, search and deliver only what is needed to data consumers.
In this session, Subramanian briefly introduced the Komprise SaaS platform and our latest product update: Smart Data Workflows. Here’s an overview of what her session covered.
With Smart Data Workflows, IT users can create automated workflows for all the steps required to find the right unstructured data across storage assets, tag and enrich the data and send it to external tools for analysis. This eliminates manual effort in unstructured data management and helps organizations speed time to value from new cloud-native tools.
With Smart Data Workflows, you can deliver only the right file and object data into a data lake, preventing the dreaded data swamp.
Does Komprise alter the data? No. Data remains in native format. When a file is moved to an object store, Komprise does not “munge it up.” We call it file-object duality. Read about it in this post: Why Cloud Native Data Access Matters.
The Power of Global Unstructured Data Visibility
Krishna introduced the Global File Index, which provides a unified view of your data without moving it. Today, enterprise IT organizations are flying blind. They don’t know what data is sitting where, who is using the data, how the data is growing and what the data is costing them. End users can’t find the data they need when they need it. With today’s data volumes, organizations must have full visibility to make good decisions. Once you have this data visibility, Komprise makes it actionable. This is where the magic happens.
With billions of files and objects, analytics plus continuous mobilization is essential because data has a lifecycle and data management is not a one-time thing.
Smart Data Workflow Use Cases
Before the demonstration, Krishna reviewed a series of Smart Data Workflow use cases.
Smart Data Workflow Demonstration
Komprise CTO Mike Peercy delivered a demo related to autonomous vehicle data, because let’s face it, none of us will be driving in 10 years. The topic was also discussed in this recent webinar with AWS: A Modern Data Strategy for the Automotive Industry.
Here is the flow:
- ENABLE SHARES with autonomous vehicles data for processing
- Show Deep Analytics (which is the UI for the GFI – Global File Index)
- Show query for crash reports in 2019
- Show query with TAG: Stopped in traffic – there will be NONE
- Create Plan to analyze contents of 2019 files using LOCAL FUNCTION to tag matching files
- Activate the Plan to analyze and tag files
- Show query with TAG: Stopped in traffic – there will be MANY
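The flow above can be sketched in a few lines of Python. This is only an illustration of the query, plan and tag sequence, not the Komprise product or its API: the `IndexEntry` schema, the `query` and `run_plan` helpers, the file paths and the `CONTENTS` stand-in for file contents are all hypothetical names invented for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Set


@dataclass
class IndexEntry:
    """One record in a toy metadata index (hypothetical schema)."""
    path: str
    year: int
    kind: str
    tags: Set[str] = field(default_factory=set)


def query(index: List[IndexEntry], year: Optional[int] = None,
          kind: Optional[str] = None, tag: Optional[str] = None) -> List[IndexEntry]:
    """Return the entries that match every filter that was supplied."""
    return [
        e for e in index
        if (year is None or e.year == year)
        and (kind is None or e.kind == kind)
        and (tag is None or tag in e.tags)
    ]


def run_plan(selection: List[IndexEntry],
             local_function: Callable[[IndexEntry], bool], tag: str) -> int:
    """Run local_function on each selected entry and tag the ones it flags."""
    tagged = 0
    for entry in selection:
        if local_function(entry):
            entry.tags.add(tag)
            tagged += 1
    return tagged


# Made-up sample data: the index holds metadata only; CONTENTS stands in
# for the file bodies that only the customer's function reads.
index = [
    IndexEntry("/av/2019/crash_001.log", 2019, "crash_report"),
    IndexEntry("/av/2019/crash_002.log", 2019, "crash_report"),
    IndexEntry("/av/2019/drive_114.log", 2019, "telemetry"),
    IndexEntry("/av/2020/crash_003.log", 2020, "crash_report"),
]
CONTENTS: Dict[str, str] = {
    "/av/2019/crash_001.log": "speed=0 state=stopped_in_traffic impact=rear",
    "/av/2019/crash_002.log": "speed=42 state=lane_change impact=side",
    "/av/2019/drive_114.log": "speed=0 state=stopped_in_traffic",
    "/av/2020/crash_003.log": "speed=0 state=stopped_in_traffic impact=rear",
}


def stopped_in_traffic(entry: IndexEntry) -> bool:
    """Hypothetical local function: inspects content outside the index."""
    return "stopped_in_traffic" in CONTENTS[entry.path]


TAG = "Stopped in traffic"
crash_reports_2019 = query(index, year=2019, kind="crash_report")
before = query(index, tag=TAG)                    # nothing is tagged yet
tagged_count = run_plan(query(index, year=2019), stopped_in_traffic, TAG)
after = query(index, tag=TAG)                     # now returns the tagged files
```

In this sketch the plan only touches the 2019 selection, so the 2020 file stays untagged even though its content would match.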
Questions from Cloud Field Day Delegates
What options do customers have to scale up/out performance?
Mike walked through the scale-out architecture of Komprise Observers in his Chalk Talk session. Observers are like virtual machines that reside next to the storage, whether in the cloud or on-premises, and they scale out into a fault-tolerant grid. Learn more here.
Can you add locations as you grow and easily manage that in an automated way?
Yes. Mike explained the Komprise multisite capabilities and discussed a central hub approach for Observers for multi-edge site deployments.
How do you deal with encrypted content?
The content of files is invisible to Komprise. So how do you classify a file if you don’t know what it is? Komprise only looks at the metadata that the storage systems show. Komprise does not look inside files, but can trigger a mechanism via the API for the customer to look inside the file and then tag and mobilize the data as needed.
The tag that is being applied is now in the Komprise Global File Index. The actual file itself doesn’t change, correct?
Correct. The tags are within Komprise and Komprise is not in the hot data path. The interaction between Komprise tags and cloud tags is on the roadmap. Pretty cool stuff. Learn more about automated unstructured data tagging.
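As a rough illustration of the idea that tags live in an index rather than in the file, the Python sketch below keeps tags in a separate dictionary and checks that the file's bytes are identical before and after tagging. It makes no claim about how Komprise stores tags internally; `tag_index`, `sha256_of` and the sample file are assumptions made up for this example.

```python
import hashlib
import os
import tempfile


def sha256_of(path: str) -> str:
    """Hash the file's bytes so any change to its content would be detected."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    # Create a small sample file (made-up content).
    path = os.path.join(tmp, "crash_001.log")
    with open(path, "wb") as f:
        f.write(b"state=stopped_in_traffic\n")

    # The toy index records only what the file system reports, plus tags.
    stat = os.stat(path)
    tag_index = {path: {"size": stat.st_size, "mtime": stat.st_mtime, "tags": set()}}

    digest_before = sha256_of(path)
    tag_index[path]["tags"].add("Stopped in traffic")  # tag goes in the index only
    digest_after = sha256_of(path)

    file_unchanged = digest_before == digest_after
    recorded_tags = set(tag_index[path]["tags"])
    recorded_size = tag_index[path]["size"]
```

Because the tag is written to the side index, the file itself is never opened for writing after it is created.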
Metadata in Focus
This led to an interesting discussion about metadata, initiated by @datachick Karen Lopez.
A little #CFD14 light reading: Metadata. Everyone knows the basic definition… “data that describes data.”
But what does that mean, and how does it help systems and programs understand your data? Read on for a nice summary with real life examples https://t.co/hPqbCVBUlf
— Chris Hayner (@hayner80) June 24, 2022
Subramanian clarified that Komprise does metadata-level find, search, curate, enrich, mobilize and lifecycle management. Komprise is enabling the workflow that calls an external function. Komprise calls an application or some cognitive service (as determined by the customer) via the API that does the processing.
Read about the first Cloud Field Day presentation here in the blog.