Why Unstructured Data Management Matters: An Industry View

The world of data is ever-changing in terms of its types, volumes, uses and risks. Understanding these differences is critical so that IT leaders, data analysts, data scientists and other data stakeholders can manage and use data effectively for new initiatives. The first distinction is unstructured data versus structured data. Structured data is presented in rows and columns, and typically stored in a database.

Structured and semi-structured data can include transaction data, customer relationship data, back-office data, point of sale details, financial and claims data, or click-stream data from a website that is typically fed into a data warehouse and accessed, analyzed and shared via reporting and business intelligence tools. Unstructured data, which comprises at least 80% of all data in the world, does not follow a standard, identifiable structure. IT teams can’t easily store it in a relational database. And it is growing exponentially.

There is an estimated 120 ZB of data in the world today, according to Statista. IDC expects data to grow to 175 ZB by 2025. To put that in perspective, this Cisco blog offers an analogy:

“If each Terabyte in a Zettabyte were a kilometer, it would be equivalent to 1,300 round trips to the moon and back.”
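
As a quick sanity check of that analogy, here is a minimal sketch of the arithmetic. It assumes decimal (SI) units, so one zettabyte is 10^21 bytes and one terabyte is 10^12 bytes, and an average Earth-Moon distance of roughly 384,400 km. Neither assumption is stated in the Cisco blog, and the function name is purely illustrative.

```python
# Assumption: SI (decimal) units, not binary units such as TiB or ZiB.
BYTES_PER_TB = 10**12
BYTES_PER_ZB = 10**21
TB_PER_ZB = BYTES_PER_ZB // BYTES_PER_TB  # one billion terabytes

# Assumption: average Earth-Moon distance in kilometers.
EARTH_MOON_KM = 384_400


def round_trips_to_moon(kilometers: float) -> float:
    """Return how many Earth-Moon round trips fit into the given distance."""
    return kilometers / (2 * EARTH_MOON_KM)


# Treat each terabyte in a zettabyte as one kilometer.
trips = round_trips_to_moon(TB_PER_ZB)
print(f"{trips:,.0f} round trips")
```

Under these assumptions the result lands at roughly 1,300 round trips, which agrees with the quoted figure.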

Unstructured data can include emails, documents, web files, audio and video files, genomics files, CAD files, images, and instrument and research data. While unstructured data is harder to manage and costly to store, it is fueling the next generation of AI and ML technologies, which are reshaping society as we know it.

Unstructured Versus Structured Data

This unstructured data is growing particularly fast in certain industries. Here are some examples of common unstructured data types:

Life Sciences: Imaging, genome sequencing, research
Healthcare: Imaging, PACS, digital pathology
Media & Entertainment: Post-production, animation, VFX, content delivery
Government: CAD/CAM, GIS, bodycam surveillance
Oil and Gas: Seismic data, compliance
Transportation: Autonomous vehicles
Financial Services: Claims data, call center recordings
Legal Services: Contracts, court filings, transcripts, video files

In this series, we look at several industry examples of unstructured data types, growth and data management challenges as well as the potential value this data can bring to its sector.

The first post focuses on the Life Sciences sector, an $8-10 billion global industry with leaders including Eli Lilly, Pfizer, Johnson & Johnson, Merck and AbbVie.

Here’s a teaser:
Pharma and biotechs have been at the center of global innovation, with rampant revenue growth and demand fueled by the Covid-19 pandemic. Investments in cloud, AI and digital technologies have intensified, delivering groundbreaking changes in how companies develop, test and deliver products to market.

Common file types in life sciences include clinical images, genome sequencing and other instrument data, and research documents. These data types don’t work well with traditional data analytics tools; life sciences companies are increasingly moving research data to the cloud to take advantage of the affordable, scalable processing and analytics services offered by the large cloud service providers (CSPs).

Here is a Pfizer case study on cold data tiering to AWS.

In this unstructured data management by industry series we’ll cover:

Common data management challenges, including data silos, poor visibility into data, cost optimization needs, continual change in regulations, and too much time spent on data preparation and deployment. We’ll also look at ransomware protection and the growing AI-driven need to automate data workflows, right-place data, and put the right data governance strategy in place, all of which are part of a comprehensive, storage-agnostic unstructured data management strategy.

Read the Post: Life Sciences and Unstructured Data Management

Read the Post: Healthcare and Unstructured Data Management

Read the Post: Automakers’ Data Management Needs Span Safety, Performance and AI

Read the Post: Unstructured Data Management in Government

Read the Post: The Data Equation to Higher Ed’s Tectonic Shifts

Read the Post: Making A Case for Legal Unstructured Data Management
