Komprise June Update: Product, People and News
The landscape of data storage, unstructured data management and cloud is constantly in flux. Komprise sits right in the middle of all this change! Read on for definitions of a few top terms in the space, as well as updates on Komprise product news, events and media mentions. Also, check out our Data Management Glossary for more definitions!
What is NFS versus SMB?
NFS and SMB are network file-sharing protocols, as defined here in our data management glossary:
Network File System (NFS): The NFS protocol is one of several distributed file system standards for network-attached storage (NAS). It was originally developed in the 1980s by Sun Microsystems and is now managed by the Internet Engineering Task Force (IETF).
Server Message Block (SMB): SMB is a network communication protocol for providing shared access to files, printers, and serial ports between nodes on a network. SMB is also known as Common Internet File System (CIFS).
Microsoft partner Cloud Infrastructure Services delivers these comparisons: “NFS is unbeatable when it comes to medium sized or small files. For larger files, the performance of both protocols is similar. NFS is more appropriate for Linux users, while SMB is more appropriate for Windows users.”
What is NAS?
Network Attached Storage (NAS) is a storage device connected to a network that allows authorized network users and heterogeneous clients to store and retrieve data from a centralized location. These devices generally consist of an engine that implements the file services (the NAS device) and one or more devices on which data is stored (the NAS drives). The purpose of a NAS system is to provide a local area network (LAN) with file-based, shared storage in the form of an appliance optimized for quick data storage and retrieval. NAS is a relatively expensive storage option, so it is best reserved for hot data that is frequently accessed.
What is Metadata?
Metadata is data that describes other data, such as author, date created, date modified and file size. Metadata can be created manually or through automation, and it is useful in managing unstructured data because it provides a common framework to identify and classify a variety of data, including video, audio, genomics data, seismic data, user data, documents and logs. TechTarget describes several different types of metadata in this article.
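To make the idea concrete, here is a minimal Python sketch that reads a few of the metadata fields mentioned above (file size and modified date) from the file system. The function name describe_file and the sample file name are illustrative assumptions, not part of any Komprise product.

```python
import os
import tempfile
from datetime import datetime, timezone


def describe_file(path):
    """Return a small dict of file system metadata for one file.

    Hypothetical helper for illustration only.
    """
    info = os.stat(path)
    return {
        "name": os.path.basename(path),
        "size_bytes": info.st_size,
        # Timestamps are converted to timezone-aware UTC datetimes.
        "modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc),
        "accessed": datetime.fromtimestamp(info.st_atime, tz=timezone.utc),
    }


if __name__ == "__main__":
    # Create a throwaway file so the example runs anywhere.
    with tempfile.TemporaryDirectory() as tmp:
        sample = os.path.join(tmp, "report.txt")
        with open(sample, "w") as handle:
            handle.write("hello")
        print(describe_file(sample))
```

Fields such as author are usually stored inside the file format itself or in a separate catalog, so they would need a different source than os.stat.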
What is Cloud Tiering?
Cloud tiering extends your current storage infrastructure to include cloud data storage resources. It lets storage administrators set policies to move infrequently accessed data to lower-cost storage. The result is better use of expensive primary storage, with capacity used only by “hot data” requiring faster access. Cloud tiering can also save significantly on storage spending and protect cold data for disaster recovery, auditing and research needs.
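A tiering policy of the kind described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the FileRecord type, the select_cold_files and bytes_freed functions, and the one-year threshold are all hypothetical, and a real product would apply such a rule across live storage rather than an in-memory list.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class FileRecord:
    """Hypothetical inventory entry for one file on primary storage."""
    path: str
    size_bytes: int
    last_accessed: datetime


def select_cold_files(files, now, max_idle_days=365):
    """Return files not accessed within max_idle_days (the 'cold' data)."""
    cutoff = now - timedelta(days=max_idle_days)
    return [f for f in files if f.last_accessed < cutoff]


def bytes_freed(cold_files):
    """Primary storage capacity reclaimed if the cold files are tiered."""
    return sum(f.size_bytes for f in cold_files)


if __name__ == "__main__":
    now = datetime(2022, 6, 1)
    inventory = [
        FileRecord("/projects/old_scan.dat", 4_000_000, datetime(2020, 1, 15)),
        FileRecord("/projects/current.docx", 250_000, datetime(2022, 5, 20)),
    ]
    cold = select_cold_files(inventory, now)
    print([f.path for f in cold], bytes_freed(cold))
```

Keeping the policy as a pure function of the inventory and the current time makes it easy to preview what would move before anything is actually tiered.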
What is Data Tagging?
Data tagging is the process of adding metadata to your file data in the form of key-value pairs. These values give context to your data, so that others can easily find it in search and execute actions on it, such as moving it to confinement or to a cloud-based data lake. Data tagging is valuable for research queries and analytics projects or to comply with regulations and policies. To learn more about data tagging with Komprise, read this blog.
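As a rough illustration of key-value tagging and tag-based search, the Python sketch below keeps tags in a plain dictionary keyed by file path. The add_tags and find_by_tags functions, the tag names and the paths are all assumptions made for the example; they do not represent how Komprise stores or queries tags.

```python
def add_tags(catalog, path, **tags):
    """Attach key-value tags to a file path in an in-memory catalog."""
    catalog.setdefault(path, {}).update(tags)


def find_by_tags(catalog, **wanted):
    """Return sorted paths whose tags match every requested key-value pair."""
    return sorted(
        path
        for path, tags in catalog.items()
        if all(tags.get(key) == value for key, value in wanted.items())
    )


if __name__ == "__main__":
    catalog = {}
    add_tags(catalog, "/lab/run_001.bam", project="genomics", status="archived")
    add_tags(catalog, "/lab/run_002.bam", project="genomics", status="active")
    add_tags(catalog, "/legal/contract.pdf", project="legal", status="archived")

    # Find every archived genomics file, for example to feed a data lake.
    print(find_by_tags(catalog, project="genomics", status="archived"))
```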
What is Unstructured Data Management?
Unstructured Data Management is a category of software that has emerged to address the explosive growth of unstructured data in the enterprise and the modern reality of hybrid cloud storage. As further explained in ITProToday: “Unstructured data is more difficult to manage than structured data as it doesn’t have a uniform format, even if the data source is the same. Indeed, managing it in the way structured data is managed is something of a novel idea, as it’s only been feasible to mine it for information since big data analytics and AI have taken off.” Komprise Intelligent Data Management delivers a unique approach to this market segment, with a comprehensive platform that includes data insight through analytics, data mobility, open standards, cloud-native access and a non-disruptive user experience with Transparent Move Technology.
What is a Data Lake?
A data lake is data stored in its natural state. The term typically refers to unstructured data that is sitting on different storage environments and clouds. The data lake supports data of all types – for example, you may have videos, blogs, log files, seismic files and genomics data in a single data lake. Komprise COO Krishna Subramanian discussed tactics for cloud data lakes in RTInsights: “In the past few years, cloud-based data lake platforms have matured and are now ready for prime time. Cloud providers’ cheaper scale-out object storage delivers a platform for massive, petabyte scale projects that simply isn’t viable on-premises.”
What is a Data Lakehouse?
Since data lakes can become swampy and unusable with piles of raw unstructured data, a new type of architecture came to the forefront in 2021: the data lakehouse. As defined by Bernard Marr in Forbes: “Data lakehouses enable structure and schema like those used in a data warehouse to be applied to the unstructured data of the type that would typically be stored in a data lake. This means that data users can access the information more quickly and start putting it to work.” Not surprisingly, Databricks does a nice job of explaining the data lakehouse.
What is a Data Fabric?
“Data fabric is a solution that allows organizations to manage their data—whether it’s in different types of apps, platforms, or regions—to address complex data issues and use cases,” according to Dataconomy. “Data fabric aims to make an organization’s data as useful as possible – and as quickly and safely as possible – by establishing standard data management and governance processes for optimization, making it visible, and providing insights to numerous business users.” Datanami adds to the definition: “Conceptually, a big data fabric is essentially a metadata-driven way of connecting a disparate collection of data tools that address key pain points in big data projects in a cohesive and self-service manner. Specifically, data fabric solutions deliver capabilities in the areas of data access, discovery, transformation, integration, security, governance, lineage, and orchestration.”
Komprise Smart Data Workflows
In May, Komprise announced new capabilities for its Intelligent Data Management solution. Komprise Smart Data Workflows is a systematic process to discover relevant file and object data across cloud, edge and on-premises datacenters and feed data in native format to AI and machine learning (ML) tools and data lakes.
Komprise in the Media
Komprise's director of product marketing wrote about data tagging and how it works in unstructured data management.
Komprise President & COO Krishna Subramanian outlines a 5-step maturity model for unstructured data management. This is a must-read for the modern storage team!
The life sciences industry has been in the spotlight since the onset of Covid-19. In this article, Krishna Subramanian discusses the intersection of cloud maturity and unstructured data analytics: a perfect tipping point for innovation in life sciences.