Komprise recently announced Smart Data Workflows, a systematic process for discovering relevant file and object data across cloud, edge and on-premises datacenters and then acting on it, for example by feeding data in native format to AI and machine learning (ML) tools and data lakes. At the core of this new capability are Komprise Deep Analytics and Deep Analytics Actions, which provide a searchable Global File Index (GFI) of all unstructured data. Customers can query the index to find just the data they need and then use Komprise data management policies to systematically operate on the specific data of interest.
Increasingly, our customers want to give departmental IT groups and power users the ability to access and manage unstructured data. Here are some common Deep Analytics use cases we’ve seen storage teams adopt:
Business unit metrics with interactive dashboards
In many organizations, a central infrastructure/cloud team provides services to various departments. There is a need to provide insight on departmental data usage and growth, especially in a “showback” or “chargeback” model. Instead of IT teams manually generating and distributing reports to business units, with Komprise they can simply set up views for authorized departmental users to monitor and understand their storage spending directly in the Komprise dashboard.
With easy analytics and reporting on how much data they are using, what that data costs the company and who is consuming it, departmental users are better informed when they work with central IT on cost optimization.
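The arithmetic behind a showback view can be sketched in a few lines of Python. This is only an illustration of the idea, not the Komprise dashboard or API: the record fields (`department`, `owner`, `size_bytes`) and the flat rate of $20 per TiB per month are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical file records; the field names are illustrative only
# and do not represent a Komprise schema.
FILES = [
    {"department": "genomics", "owner": "alice", "size_bytes": 4 * 1024**4},
    {"department": "genomics", "owner": "bob", "size_bytes": 1 * 1024**4},
    {"department": "imaging", "owner": "carol", "size_bytes": 2 * 1024**4},
]

COST_PER_TIB_MONTH = 20.0  # assumed flat storage rate in dollars


def showback(files, rate_per_tib):
    """Summarize capacity, monthly cost and data owners per department."""
    usage = defaultdict(lambda: {"bytes": 0, "owners": set()})
    for f in files:
        entry = usage[f["department"]]
        entry["bytes"] += f["size_bytes"]
        entry["owners"].add(f["owner"])

    report = {}
    for dept, entry in usage.items():
        tib = entry["bytes"] / 1024**4
        report[dept] = {
            "tib": tib,
            "monthly_cost": tib * rate_per_tib,
            "owners": sorted(entry["owners"]),
        }
    return report


if __name__ == "__main__":
    for dept, row in showback(FILES, COST_PER_TIB_MONTH).items():
        print(dept, row)
```

A real chargeback model would likely apply different rates per storage tier; the single rate here keeps the example short.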
Business-unit data tiering, retention and deletion
An organization may have a global retention policy stating that data should generally not be kept for longer than five years. However, some data warrants a shorter time frame because of its nature; for instance, research data such as images or interim genomics files may only be valid for weeks or months. Instead of keeping this temporary data around for the same five years as all other data, departments can now create Komprise Deep Analytics queries to identify the data sets that are exceptions, for example by tag, data type or project name.
This data can have a different Komprise data management policy that tiers it more aggressively. Once the query and policy are set, Komprise will continuously find data that fits the criteria and act on it; neither the departmental users nor storage IT teams need to babysit each project individually. Conversely, some data may need to be retained for 20 years to meet regulatory requirements. Again, Komprise Deep Analytics queries can identify such data in partnership with the business users who have better insights into what data fits the criteria.
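The logic of a default retention period with tag-based exceptions can be sketched as follows. This is a generic illustration, not Komprise policy syntax: the tag names (`regulatory`, `interim`), the 90-day figure for interim data and the record layout are all assumptions chosen to mirror the examples above.

```python
from datetime import date, timedelta

DEFAULT_RETENTION_DAYS = 5 * 365  # the global five-year policy

# Hypothetical exception rules, evaluated in order; the first match wins.
# Tag names and the 90-day window are assumptions for illustration.
EXCEPTIONS = [
    {"tag": "regulatory", "retention_days": 20 * 365},
    {"tag": "interim", "retention_days": 90},
]


def retention_days(record):
    """Return the retention period that applies to one file record."""
    for rule in EXCEPTIONS:
        if rule["tag"] in record["tags"]:
            return rule["retention_days"]
    return DEFAULT_RETENTION_DAYS


def is_expired(record, today):
    """True when the record is older than its applicable retention period."""
    age = today - record["last_modified"]
    return age > timedelta(days=retention_days(record))


if __name__ == "__main__":
    today = date(2024, 1, 1)
    interim = {"tags": {"interim"}, "last_modified": date(2023, 6, 1)}
    untagged = {"tags": set(), "last_modified": date(2023, 6, 1)}
    print(is_expired(interim, today), is_expired(untagged, today))
```

Placing the regulatory rule first means a file carrying both tags is kept for the longer period, which is the safer default when rules conflict.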
Identifying and deleting duplicates
Research data often ends up with multiple copies in different places. For instance, a dataset is generated by instruments and then copied to five different labs for analysis. The labs may copy the data further for multiple runs. When the project is complete, you can use Komprise to identify suspected duplicate files for review and possible deletion. This capability is helpful as a regular clean-up task to find and delete large, suspected duplicate files, which frees up valuable primary storage and cuts storage costs.
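The post does not describe how Komprise identifies suspected duplicates, so the sketch below shows one common, generic approach rather than the product's method: group files by size first, then compare SHA-256 content hashes only within each size group. The function names and the `min_size` parameter are hypothetical.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_suspected_duplicates(root, min_size=0):
    """Return lists of paths under root that share size and content hash."""
    # Pass 1: bucket files by size, which needs only a stat() call per file.
    by_size = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file():
            size = path.stat().st_size
            if size >= min_size:
                by_size[size].append(path)

    # Pass 2: hash only the files whose size matches at least one other file.
    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        by_hash = defaultdict(list)
        for p in paths:
            by_hash[file_digest(p)].append(p)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups
```

Setting `min_size` to a large value limits the scan to the big files that matter most for reclaiming primary storage. The result is a list of candidates for review, not a deletion list.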
Mobilizing specific data sets for third-party tools
Enterprises use third-party tools or services for specialized data processing or analysis, but those tools may not be licensed for use in multiple locations or may only be available in the cloud. Finding the right data to feed these tools can also be a challenge. Data owners can leverage Deep Analytics queries to find specific data sets and then either tag the data sets or simply save the query for central IT to configure a policy to copy the data to wherever the tool or service can operate on it. Komprise automatically handles the file-to-object translation if the processing tool lives in the cloud, ensuring that the objects created are in native format and directly consumable by the cloud service.
Using data tags from on-premises sources in the cloud
Enterprises often don’t have a good system for searching historical unstructured data. Data tags are system-specific, so while users might have fastidiously tagged data in their electronic lab notebook or another source application, this tagging information is lost once the data is tiered to the cloud or to other file storage. Komprise addresses this limitation with a universal tagging approach that works across file storage and clouds. You can tag data directly in Komprise or ingest tags via API. Komprise retains the tags along with all the standard file metadata even as you move the data to cloud storage or across clouds and different storage architectures. This ensures that users can search for the same data regardless of where the data lives throughout its lifecycle. Read more about how Komprise data tagging works in this blog.
Our customers often share feedback after they dig deeper into Komprise Deep Analytics and Smart Data Workflows:
“Deep Analytics allows us to do fine-grained searches and get surgical about what we can find and archive. Komprise empowers our end users to archive data the right way via tagging.”
“In higher education, we have people within shadow IT groups at our professional schools who want to use Komprise for research purposes. We don’t have time in central IT to do searches for them.”
“The most value we can provide to our users is to empower them to leverage Komprise to get their data to the cloud.”
Deep Analytics Actions and Smart Data Workflows differentiate Komprise from any other unstructured data management platform on the market today, delivering data storage cost savings and greater data value extraction. In the coming weeks we’ll be announcing new functionality and interesting use cases to help our customers and partners understand what’s possible with a smarter approach to data migration, data tiering, data mobility and data management.