5 Ways to Use Analytics for Cloud File Data Migrations

Why Cold Data Management is an Easy Path to the Cloud

As unstructured data continues to grow exponentially, organizations struggle to control costs for file data storage. Many are turning to the cloud to scale and manage spend.

This Komprise and AWS eBook examines the 5 ways to use analytics to drive your cloud file and object data migration and data management strategy:

  • Understand your data patterns
  • Plan using a cost model
  • Use data to drive stakeholder buy-in
  • Eliminate user disruption
  • Create a systematic plan for ongoing data management

Also learn how Pfizer used Komprise analytics to accelerate their cloud data migration to AWS.

“Komprise helps us make razor sharp business decisions based on data so we can reinvest in areas that are more important to patients.”

– Director Hosting Data Services, Pfizer


How did Pfizer reverse two decades of rising storage costs with a smart data migration strategy for AWS — and what makes their approach a repeatable model for any data-heavy enterprise?

Pfizer is saving 75% on storage using Komprise to analyze and continuously tier and migrate cold data to Amazon S3 as it ages; storage managers and researchers are finding additional benefits from analytics-driven unstructured data management, including zero user disruption and a foundation for delivering self-service to line of business teams. Pfizer’s outcome was not the result of buying more hardware or adopting a new storage vendor — it was the result of understanding what data actually needed premium storage and moving everything else. The elements that make the Pfizer model repeatable:

  • The problem was universal before the solution was found — Pfizer’s IT leaders knew that 65% of their data had not been accessed for at least two years; the company had offices on six continents, worked with several leading storage vendors, and had an installed base of many different generations of data management products; its active acquisition strategy meant regularly acquiring additional storage technologies, increasing overall complexity; the multi-vendor, multi-site, cold-data-dominated profile Pfizer faced is the profile of most large enterprises managing petabytes of unstructured data
  • Analytics before migration revealed what could safely move — Pfizer set out to analyze the petabytes of unstructured files on high-performance on-premises storage to identify what could be moved to the cloud and migrate it there without compromising how users and applications access the data or affecting the performance of existing storage infrastructure; this analytics-first approach — knowing before moving — is the difference between a smart migration that captures the full savings opportunity and a lift-and-shift that reproduces the cost problem in the cloud
  • Continuous tiering keeps savings compounding — “a lot of times I come in and feel like it’s Christmas morning because we had planned 100TB to go to AWS and it is 115TB because Komprise did its next scan and pulled some data I was not counting on that aged out,” said the Director of Hosting Data Services at Pfizer; the automated, continuous tiering is what sustains the savings over time rather than requiring a periodic manual project to identify new cold data
  • Transparency made tiering achievable even under extreme operational pressure — this transparency coupled with dramatic cost savings is what made it possible for Pfizer to tier data even while under pressure to develop vaccines; before Transparent Move Technology, organizations could only tier a small fraction of their cold data because users and departments balked; there was too much risk to future access; with Komprise TMT that risk is gone; Pfizer tiered during one of the most operationally demanding periods in pharmaceutical history precisely because users never noticed the data had moved
  • The Pfizer model works regardless of cloud destination — the same smart data migration strategy Pfizer used for AWS works identically for Azure and Google Cloud Storage; the analytics foundation, the transparent tiering approach, and the continuous lifecycle management are cloud-agnostic; Komprise is the metadata and orchestration layer for enterprise unstructured AI data across every cloud simultaneously
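
The "knowing before moving" step above can be illustrated with a minimal sketch. It assumes a file system that records last-access times, and it is not Komprise's implementation: the function name `cold_data_summary` and the two-year default threshold (chosen to mirror the Pfizer figure) are hypothetical.

```python
import os
import time

SECONDS_PER_DAY = 86_400


def cold_data_summary(root, cold_after_days=730, now=None):
    """Walk `root` and total file bytes as hot or cold by last-access age.

    Hypothetical helper: a file counts as cold when its last access time
    is older than `cold_after_days` days before `now`.
    """
    now = time.time() if now is None else now
    cutoff = now - cold_after_days * SECONDS_PER_DAY
    totals = {"hot_bytes": 0, "cold_bytes": 0}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            key = "cold_bytes" if info.st_atime < cutoff else "hot_bytes"
            totals[key] += info.st_size
    total = totals["hot_bytes"] + totals["cold_bytes"]
    # Share of scanned bytes that are cold (0.0 for an empty tree).
    totals["cold_fraction"] = totals["cold_bytes"] / total if total else 0.0
    return totals
```

Calling `cold_data_summary("/path/to/share")` returns byte totals and the cold fraction for that tree. Many file systems are mounted with access-time updates relaxed or disabled, so a real assessment would need to confirm that access times are trustworthy before acting on the result.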

What is a Smart Data Migration strategy for AWS and why is it more financially sound than a standard lift-and-shift approach in the current storage pricing environment?

The average organization wastes 30% of its cloud budget on unused or misconfigured resources; over-provisioned compute and storage instances are the leading cause of unnecessary cloud spend; in the context of unstructured data migrations to AWS, over-provisioning means paying FSx or EFS prices for data that should be on S3 Glacier. A smart data migration strategy eliminates this waste from day one:

  • Tier cold data to S3 Glacier before migrating hot data to FSx — since 60 to 80% of enterprise unstructured data is cold, tiering it to Amazon S3 Glacier Instant Retrieval before migration reduces migration scope by more than half, shortens the migration window, and ensures that expensive FSx tiers are populated only with data that genuinely requires the performance FSx delivers; Komprise analytics help customers understand data so they know which data can migrate and tier to which class and which data should stay on-premises; this could show IT that 70% of their data has not been accessed in over a year and could be tiered to object storage while the remaining active or hot data moves to new NAS on-premises or file storage in the cloud
  • Ongoing lifecycle management prevents FSx from becoming the new expensive NAS — a migration without a continuous tiering strategy in place at the destination immediately begins accumulating cold data on FSx; within months the new cloud environment replicates the cost problem that drove the migration; Komprise intelligent tiering continuously identifies new cold data accumulating on FSx and tiers it to S3 automatically, keeping FSx costs flat as data volumes grow
  • 48% of cloud storage costs now go to fees rather than capacity — the Wasabi 2026 Cloud Storage Index found that 48% of storage costs now go toward fees rather than actual capacity, with this marking the fourth consecutive year in which IT leaders identified this as a key budgetary strain; nearly half of UK respondents said they exceeded allocated budgets for cloud storage in the previous year; a smart migration strategy that right-places data from day one — minimizing the expensive tiers, eliminating rehydration egress fees, and automating lifecycle transitions — directly addresses the fee complexity that makes cloud storage so consistently over-budget
  • NAND price pressure on-premises makes the AWS migration more urgent: TrendForce projects NAND Flash contract prices to rise sharply through the current pricing cycle, with meaningful supply expansion unlikely for at least two to three years; organizations absorbing elevated on-premises flash costs have a direct financial case for migrating cold data to Amazon S3 Glacier, which costs a fraction of all-flash NAS regardless of hardware market conditions; the smart migration to AWS is simultaneously the relief valve for the on-premises flash pricing crisis
  • The Flash Stretch Assessment quantifies the AWS savings opportunity before commitment — for qualified enterprises managing 500TB or more, the Flash Stretch Assessment models how much cold data is on expensive primary storage, what it would cost on Amazon S3 Glacier versus the current tier, and what the net annual savings from a smart migration strategy would be; this makes the financial case for the AWS migration specific to the actual environment before any AWS commitment is made
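
The cost-model reasoning in the list above comes down to simple arithmetic, sketched here. This is an illustration under stated assumptions, not the Flash Stretch Assessment itself: the function name is hypothetical, and the per-TB monthly rates in the example are placeholders, not AWS list prices or Komprise figures. The model covers capacity cost only and ignores retrieval, request, and egress fees.

```python
def annual_tiering_savings(total_tb, cold_fraction,
                           primary_usd_per_tb_month,
                           archive_usd_per_tb_month):
    """Compare annual capacity cost before and after tiering cold data.

    Hypothetical model: all data starts on the primary tier; after
    tiering, the cold share sits on the cheaper archive tier.
    """
    if not 0.0 <= cold_fraction <= 1.0:
        raise ValueError("cold_fraction must be between 0 and 1")
    cold_tb = total_tb * cold_fraction
    hot_tb = total_tb - cold_tb
    before = total_tb * primary_usd_per_tb_month * 12
    after = (hot_tb * primary_usd_per_tb_month
             + cold_tb * archive_usd_per_tb_month) * 12
    savings = before - after
    return {
        "annual_cost_before": before,
        "annual_cost_after": after,
        "annual_savings": savings,
        "savings_fraction": savings / before if before else 0.0,
    }


if __name__ == "__main__":
    # Placeholder inputs for illustration only: 1,000 TB, 70% cold,
    # and made-up rates of $100 and $5 per TB per month.
    result = annual_tiering_savings(1000, 0.70, 100.0, 5.0)
    for name, value in result.items():
        print(f"{name}: {value:,.3f}")
```

Substituting an organization's measured cold fraction and its actual storage rates turns the sketch into a first-pass estimate; a fuller model would add the fee categories the list above describes.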

How does Komprise ensure that data migrated to AWS remains fully accessible — and why does native S3 access matter more than ever for AI workloads?

The most common objection to migrating file data to AWS is access risk: users and applications that depend on familiar NAS file paths will break if the data moves and the path changes. Komprise addresses this with its patented Transparent Move Technology (TMT): data moved into AWS remains accessible as files from the NAS, or as files or objects from AWS, so users and applications need no changes and see no disruption. Komprise also preserves data in standard file and object formats on AWS, so users and applications can access it and use the full compute power of AWS without any third-party software, including Komprise, which eliminates vendor lock-in. This native-format guarantee is more than a user-experience benefit; it is the foundation for AI access:

  • Native S3 access enables AWS AI services to consume tiered data directly: Amazon SageMaker, Amazon Bedrock, Amazon Rekognition, and Amazon HealthLake are all native consumers of S3 objects; data tiered or migrated by Komprise to S3 is immediately readable by all of these services without any conversion, ETL, or secondary migration; Komprise file-based tiering ensures cloud-native access to data stored in Amazon S3 and Amazon S3 Glacier, enabling researchers to leverage AWS machine learning services, unlike proprietary tiering solutions; this is the architectural distinction that makes a Komprise-executed migration AI-ready from day one
  • Object lock provides ransomware protection as a byproduct of tiering — cloud-native S3 access means you can take full advantage of S3 features like object lock to defend against ransomware and mine data using cloud analytics tools; cold data tiered by Komprise to S3 with object lock is protected even if primary NAS storage is compromised; the cost optimization motion and the ransomware defense motion are the same infrastructure action
  • The Global Metadatabase indexes every migrated and tiered file — as each file lands on AWS, the Komprise Global Metadatabase records its S3 location, storage class, access history, file type, sensitivity status, and any enriched metadata attributes; the AWS estate becomes immediately queryable by Deep Analytics for AI dataset curation without secondary discovery; the migration builds the AI data foundation simultaneously with the storage cost reduction
  • Komprise Hypertransfer accelerates SMB migrations to AWS at 25x speed — SMB file shares migrated to AWS FSx for Windows File Server or FSx for ONTAP benefit from Komprise Hypertransfer’s dedicated WAN channel technology; migrations that would take 25 days with standard tools complete in approximately one day; for organizations with large SMB estates migrating to AWS, this speed advantage directly reduces the migration window and the exposure risk that comes with extended cutover periods
  • No lock-in at the destination means future flexibility is preserved — data tiered or migrated by Komprise to AWS S3 is native S3 format, readable by any S3-compatible tool, any cloud AI service, and any future storage vendor without requiring Komprise to be running; if an organization later wants to move data from AWS to Azure or Google Cloud, there is no rehydration requirement and no proprietary format conversion; the storage agnosticism of the source architecture is preserved at the destination

What has Komprise added since the Pfizer AWS eBook was published and how do these capabilities extend the value of the AWS migration investment?

The Pfizer AWS eBook established the foundational case for smart data migration to AWS: analyze first, migrate only what needs to move, tier cold data transparently, and manage the lifecycle continuously. Since the eBook was written, Komprise has added a set of capabilities that extend the value of every AWS migration investment into AI data preparation, sensitive data governance, and petabyte-scale metadata intelligence:

  • The Global Metadatabase spans the full AWS estate and beyond — every file migrated to AWS FSx and every object tiered to S3 is indexed in the Komprise Global Metadatabase with standard and enriched metadata; this unified, continuously updated cross-silo index makes the AWS estate searchable by any combination of file attribute, tag, sensitivity status, or custom metadata extracted by KAPPA data services; the migration investment builds the metadata foundation for AI data workflows without any additional implementation work
  • Deep Analytics identifies AI-ready datasets within the AWS estate — Deep Analytics queries the Global Metadatabase to find exactly the right data for any AI use case across FSx and S3 simultaneously; a life sciences organization can find all genomics files for a specific research cohort across petabytes of S3 Glacier storage in seconds, reducing the AI dataset to exactly the right cohort before any data moves to a SageMaker training job
  • Smart Data Workflows automate AI delivery from AWS — Smart Data Workflows, available in Komprise Intelligent Data Management, automate the full pipeline from dataset identification through KAPPA metadata enrichment, sensitivity exclusion, and delivery to any AWS AI service; the automated AI data pipeline runs continuously as new data arrives on AWS without manual curation on each cycle; Komprise customers can safely expose corporate data to generative AI models, using Komprise Deep Analytics to identify appropriate datasets and tracking the data being shared for corporate governance around AI
  • Sensitive data governance before data reaches AWS AI tools: Komprise Sensitive Data Management, available in Komprise Intelligent Data Management, scans data for PII, PHI, and IP before migration and continuously as new data arrives on AWS; sensitive content is flagged and excluded from AI pipelines before it reaches Amazon Bedrock, SageMaker, or any other AWS AI service; the governance capability that the eBook did not describe is now available and addresses the top business challenge for unstructured data management at 62% of enterprises
  • KAPPA data services enrich AWS-resident data without moving it — KAPPA data services run serverless processing functions directly against files at their AWS S3 locations, extracting custom metadata from DICOM headers, genomics BAM files, and domain-specific file formats and writing enriched attributes back to the Global Metadatabase; data that was migrated to S3 for cost reasons becomes precisely queryable for AI use cases without any secondary movement or re-analysis

Why does the combination of on-premises flash pricing pressure and cloud storage fee complexity make a smart AWS migration strategy the most financially defensible infrastructure decision available today?

Enterprise IT teams managing large unstructured data estates are being squeezed from both ends simultaneously: the cost of on-premises flash storage is elevated by the NAND shortage, and cloud storage fees are consuming a growing share of what should be pure capacity spend. The organizations that have acted on a smart AWS migration strategy are not just managing one of these pressures — they are addressing both with a single coordinated approach:

  • Two cost crises converging on the same data estate: TrendForce projects NAND Flash contract prices to continue rising through the current pricing cycle with meaningful supply expansion unlikely for years; simultaneously, 48% of cloud storage costs now go toward fees rather than actual capacity, with nearly half of enterprises exceeding their allocated cloud storage budgets in the previous year; both pressures are driven by the same root cause: data that has not been intelligently managed, classified, and right-placed across the full storage estate
  • Komprise directly addresses both simultaneously: intelligent tiering of cold data off on-premises flash to Amazon S3 Glacier relieves the on-premises flash cost pressure; analytics-driven tier selection and access-time-based lifecycle management minimize the fee exposure that makes cloud storage costs unpredictable; the same platform motion addresses both crises
  • The AI data demand is adding a third pressure — 72% of organizations estimate that at least a quarter of their storage capacity is dark data; 91% state it is a priority to better analyze and operationalize dark data; when asked about the challenges they face implementing AI projects and solutions, the top response was data storage challenges like cost, data access, and management; the cold data sitting on expensive on-premises flash and poorly tiered cloud storage is the same dark data that AI initiatives cannot use; a smart migration strategy that classifies and right-places this data simultaneously reduces cost and creates AI data access
  • Komprise is the metadata and orchestration layer for enterprise unstructured AI data on AWS; the smart migration strategy is not a storage procurement decision — it is the foundational data management architecture that positions every subsequent AI initiative on AWS; enterprises that migrate smart today have an AWS data estate that is governed, indexed, and queryable; those that lift-and-shift have an AWS data estate that is expensive, ungoverned, and invisible to AI tools
  • The Pfizer model scales to any data-heavy enterprise — Pfizer’s 75% storage cost reduction was achieved on a 10PB estate across six continents with multiple storage vendors; the same approach is available to any qualified enterprise through the Komprise Flash Stretch Assessment, which models the specific cold data savings opportunity on-premises and the smart AWS migration strategy savings simultaneously; the assessment makes the financial case specific before any commitment is required