Smart Data Workflows Architecture: Cloud Field Day 14


When Komprise co-founder and CTO Mike Peercy goes to the whiteboard, people pay attention. That has been my experience, and I certainly do. Mike once again got the colored pens out at Cloud Field Day 14 to walk through the architecture behind Smart Data Workflows.

It all starts with data sources: cloud, edge and data center. These can reside in different sites, locations and networks, which is why Komprise supports multi-site deployments. When it comes to the cloud, the focus of the session, customers may have buckets, containers and object stores, along with cloud services such as natural language processing (NLP) and other analytics services.

So the question is: how do we get the right data from customer sources to these cloud services? That’s where Komprise comes in.

Mike briefly explains the components of the Komprise architecture: the Director, the Observers and the Global File Index (GFI), a high-performance indexing service. The GFI is essentially a picture of all the data: the universe of the customer’s data, which is typically made up of a few billion files. We compile the metadata exposed by NFS, SMB and object protocols and place it in the GFI.
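To make the idea of a metadata index concrete, here is a toy sketch in Python. It is not the GFI or any Komprise API: `FileRecord` and `build_index` are hypothetical names, the index is a plain in-memory list, and it only walks a local directory rather than NFS, SMB or object sources. It simply shows the kind of per-file metadata (path, size, modified time, extension) that an index like this collects.

```python
from __future__ import annotations

import os
from dataclasses import dataclass


@dataclass(frozen=True)
class FileRecord:
    """Hypothetical per-file metadata record (illustration only)."""
    path: str
    size_bytes: int
    mtime: float      # last-modified time, seconds since the epoch
    extension: str    # lower-cased, including the leading dot


def build_index(root: str) -> list[FileRecord]:
    """Walk `root` and collect one metadata record per file."""
    records = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            info = os.stat(full_path)
            records.append(
                FileRecord(
                    path=full_path,
                    size_bytes=info.st_size,
                    mtime=info.st_mtime,
                    extension=os.path.splitext(name)[1].lower(),
                )
            )
    return records
```

A real index at the scale of billions of files would need a distributed, persistent store; the point here is only that the index holds metadata, not the file contents.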


Mike then reviews 3 Smart Data Workflow examples:

1) Data Enrichment: A Smart Data Workflow builds a policy to copy data to a storage bucket in the cloud, where a cloud service consumes it and extracts additional custom metadata. He describes the autonomous vehicle use case demo and how you can filter and feed AI and big data tools with the right unstructured file and object data in an automated fashion. At around the 10-minute mark there are some good questions from @bknudtson and @CraigRodgersms to clarify the role of Komprise as well as other possible use cases.


2) Cloud Tiering: One of the main use cases for Komprise is cloud tiering with Transparent Move Technology (TMT). Large files that you don’t actively use, such as .DAT files, can move to the cloud for lower-cost storage. In this example, Mike creates a custom Deep Analytics query with specific parameters and sets up a data management policy to take action when these parameters are met. Here we get into a Komprise architecture discussion and review the power of the file-object duality that Dynamic Links deliver as part of Komprise TMT.


3) Smart Data Migration: We encourage an approach where you know your data first, tier it to the right location and then migrate the hot data. This is done through the Komprise Director, with the Observers doing the work. The cost model is summarized in the Cloud Tiering Done Right session earlier in the day.
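The query-then-policy pattern behind these workflows can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Komprise’s Deep Analytics or policy API: `TieringQuery`, `plan_tiering` and the thresholds below are all hypothetical. The query names the parameters (extension, minimum size, minimum age), and the plan is the list of files that meet them.

```python
from __future__ import annotations

from dataclasses import dataclass

SECONDS_PER_DAY = 86_400


@dataclass(frozen=True)
class FileRecord:
    """Hypothetical metadata record for one file."""
    path: str
    size_bytes: int
    mtime: float  # last-modified time, seconds since the epoch


@dataclass(frozen=True)
class TieringQuery:
    """Hypothetical query: which files qualify for tiering?"""
    extension: str
    min_size_bytes: int
    min_age_days: int

    def matches(self, record: FileRecord, now: float) -> bool:
        age_days = (now - record.mtime) / SECONDS_PER_DAY
        return (
            record.path.lower().endswith(self.extension.lower())
            and record.size_bytes >= self.min_size_bytes
            and age_days >= self.min_age_days
        )


def plan_tiering(records: list[FileRecord], query: TieringQuery, now: float) -> list[str]:
    """Return the paths a policy would act on; no data is moved here."""
    return [r.path for r in records if query.matches(r, now)]


# Example: large .DAT files untouched for at least a year (assumed thresholds).
now = 1_700_000_000.0
GIB = 1024 ** 3
records = [
    FileRecord("/data/big.DAT", 2 * GIB, now - 400 * SECONDS_PER_DAY),
    FileRecord("/data/small.dat", 1024, now - 400 * SECONDS_PER_DAY),
    FileRecord("/data/recent.dat", 2 * GIB, now - 5 * SECONDS_PER_DAY),
    FileRecord("/data/report.pdf", 2 * GIB, now - 400 * SECONDS_PER_DAY),
]
query = TieringQuery(extension=".dat", min_size_bytes=GIB, min_age_days=365)
print(plan_tiering(records, query, now))  # ['/data/big.DAT']
```

Separating the query from the action mirrors the flow Mike describes: first decide which data qualifies, then let a policy act on it whenever the parameters are met.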


The session wraps up with a lively discussion and questions about the Komprise business model, pricing and how to get started with a custom demo.

You can watch Mike Peercy’s Chalk Talk video below:

