How to Manage Medical Imaging Data Growth and Costs

Medical images contain untold value for healthcare organizations, yet they’re exceeding the limits of on-premises storage. It’s time to take control with a new data management strategy.

Digital PACS, digital pathology and VNA systems are all generating and now storing petabytes of medical imaging data: lab slides, X-rays, MRIs, CT scans and more. These ever-expanding datasets are pushing the limitations of storage systems and challenging IT departments’ ability to effectively manage data.

To get more flexibility and cost savings from storage, healthcare organizations are adopting unstructured data management software to tier cold medical imaging data out of expensive storage to cost-effective environments such as the cloud.

This white paper examines the benefits of augmenting your medical imaging solution with data management software that transparently tiers cold data from your storage and backups, explains what you should look for in a solution, and includes a case study of a healthcare provider that did this successfully.

It’s Time for Medical Imaging Data Management

“Internal storage for large image files is expensive—costing millions a year for some organizations on Porsche-grade NAS devices. The data must be secured, replicated and backed up. Meanwhile, in most cases, imaging data is rarely accessed after a few days.”

Read the blog post: Unstructured Data in Healthcare


Why is medical imaging data storage so expensive and why is the cost problem getting worse in 2026?

Healthcare organizations have shifted to digital media for medical imaging. Digital pathology, digital PACS and VNA systems are all generating and now storing petabytes of medical imaging data — lab slides, X-rays, MRIs, CT scans and more. These ever-expanding datasets are pushing the limitations of storage systems and challenging IT departments’ ability to effectively manage data. In 2026 every dimension of that challenge has intensified:

  • Data growth is structural and accelerating — healthcare organizations are experiencing explosive growth in unstructured data, often 35 to 40% annually, across DICOM imaging, digital pathology, genomics, and EHR systems distributed across hybrid storage environments; a health system generating 600,000 whole-slide pathology images annually at 1TB per day cannot solve this with procurement alone
  • Hardware prices have compounded the cost — IDC describes the current memory shortage as a potentially permanent reallocation of global silicon wafer capacity, with 2026 NAND and DRAM supply growth expected to remain below historical norms; internal storage for large image files is expensive, costing millions a year for some organizations on Porsche-grade NAS devices; those same NAS devices now cost significantly more to expand
  • The backup multiplier is invisible but significant — NAS data must also be secured, replicated and backed up, which typically triples the costs; meanwhile in most cases imaging data is rarely accessed after a few days or weeks; a healthcare organization paying for 1PB of PACS storage is effectively paying for 3 to 4PB of total infrastructure for data that stopped being accessed 90 days after acquisition
  • Retention requirements prevent deletion — with increasing regulations, healthcare providers typically must retain medical imaging files for many years; clinical researchers may also need access to the data indefinitely; the archive only grows
  • The Flash Stretch Assessment quantifies the opportunity — for qualified healthcare organizations managing 500TB or more of imaging and clinical data, the Komprise Flash Stretch Assessment identifies exactly how much cold imaging data is consuming expensive primary storage and models what transparent tiering to lower-cost destinations would save before any commitment is made
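
The arithmetic behind the backup multiplier and growth figures above can be sketched in a few lines of Python. This is a rough model under stated assumptions, not the Flash Stretch Assessment itself: the function names are hypothetical, the 3x multiplier reflects the "typically triples" figure, and the model assumes that data tiered off primary storage no longer counts against replication and backup copies.

```python
def total_footprint_pb(primary_pb: float, multiplier: float = 3.0) -> float:
    """Capacity effectively paid for once replication and backup copies are counted."""
    return primary_pb * multiplier


def projected_primary_pb(primary_pb: float, annual_growth: float, years: int) -> float:
    """Primary capacity after compounding a fixed annual growth rate."""
    return primary_pb * (1.0 + annual_growth) ** years


def footprint_after_tiering_pb(primary_pb: float, cold_fraction: float,
                               multiplier: float = 3.0) -> float:
    """Footprint left on primary storage and its copies after cold data is tiered.

    Assumption: tiered data is no longer replicated or backed up from primary.
    """
    hot_pb = primary_pb * (1.0 - cold_fraction)
    return hot_pb * multiplier


if __name__ == "__main__":
    # 1 PB of PACS storage with a 3x multiplier
    print(total_footprint_pb(1.0))
    # The same 1 PB after three years at 35% annual growth
    print(round(projected_primary_pb(1.0, 0.35, 3), 2))
    # Footprint if an assumed 80% of the data is cold and gets tiered
    print(round(footprint_after_tiering_pb(1.0, 0.80), 2))
```

The cold fraction is the input that matters most, and it varies by organization, so it should come from measurement rather than from a default.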

How does Komprise tier cold medical imaging data without disrupting clinical workflows or PACS systems?

The reason most healthcare IT teams have not already tiered their cold imaging data is not lack of awareness — it is fear of disruption. Health systems are generally risk-averse — they are handling sensitive patient information after all — and tolerance for downtime is usually quite low. The Komprise intelligent tiering approach addresses this directly through Transparent Move Technology, which tiers data without touching PACS or VNA applications:

  • Transparent access from day one — using Komprise, medical imaging files that have gone cold are transparently moved from the NAS to the cloud; clinicians, technicians, and other healthcare employees can still access these imaging files exactly as they did before; there are no stubs, no agents on PACS systems, and no change to any clinical application or workflow
  • Policy-driven by age and type — Komprise provides a tightly integrated solution with NAS devices to automatically move older images to the cloud based on policy for significantly cheaper storage and without affecting user experience; typical policies tier DICOM studies older than 90 days while keeping recent and active studies on high-performance NAS for fast clinical retrieval
  • No PACS modification required — KAPPA data services operate as a layer above the storage environment, not inside the application stack; this process must happen without disrupting clinical workflows, modifying PACS or VNA systems, or creating persistent duplicate copies of data that increase storage costs
  • File and object duality preserves cloud-native access — data tiered by Komprise to AWS S3, Azure Blob, or Google Cloud Storage is accessible both as a file from the original PACS path and as a native S3 object from the cloud destination; cloud AI services, analytics platforms, and research tools can access tiered imaging data directly without routing through the source PACS
  • Proven at scale — Komprise is used by large hospitals throughout the nation; auto ingesting, caching, and copying just the right data improves AI accuracy by 120%+ and cuts AI costs by 96%+; NewYork-Presbyterian achieved 10x faster AI data ingestion and 96% lower cloud costs for its digital pathology AI program using this approach
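
An age-based policy like the 90-day example above amounts to a filter over file timestamps. The sketch below only selects candidate files; it does not move data and does not represent Komprise's Transparent Move Technology or any Komprise API. The function name, the `.dcm` suffix, and the use of last-modified time (rather than last-access time) are assumptions made for illustration.

```python
import time
from pathlib import Path
from typing import Iterator, Optional

SECONDS_PER_DAY = 86_400


def cold_files(root: Path, min_age_days: int = 90, suffix: str = ".dcm",
               now: Optional[float] = None) -> Iterator[Path]:
    """Yield files under root whose last-modified time is older than min_age_days."""
    now = time.time() if now is None else now
    cutoff = now - min_age_days * SECONDS_PER_DAY
    for path in root.rglob(f"*{suffix}"):
        # Compare the file's modification time against the policy cutoff.
        if path.is_file() and path.stat().st_mtime < cutoff:
            yield path
```

Passing `now` explicitly keeps the selection reproducible, which is useful when modeling what a policy would tier before applying it.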

What is KAPPA data services and why does it transform DICOM data from stored files into AI-ready assets?

The white paper was written before KAPPA data services existed. It identified the right problem — that cold imaging data sitting in PACS archives has enormous clinical and research value — but lacked the tooling to unlock it. KAPPA data services is the answer to that problem, and it changes the economics of medical imaging AI entirely:

  • The fundamental mismatch — healthcare organizations are sitting on billions of imaging files that hold the keys to more effective treatments and personalized care, yet this data is largely unusable; these files contain extraordinarily rich clinical information ripe for mining, but when it comes to powering clinical AI, that wealth of information is effectively locked away; the reason is a fundamental mismatch between how DICOM data is stored and what AI pipelines need
  • What KAPPA does — KAPPA data services offer a fundamentally different approach; KAPPA is a serverless compute framework for unstructured data that allows organizations to run custom functions directly on files in place; KAPPA handles the processing and infrastructure at scale, without touching PACS or VNA applications and without relying on brittle plugins that are hard to set up and expensive to maintain
  • DICOM header extraction at petabyte scale — KAPPA data services extract custom attributes from DICOM headers including patient demographics, modality type, body region, diagnosis code, study date, and institution-specific custom tags using a few lines of Python; these attributes are written back to the Komprise Global Metadatabase, making the full PACS archive searchable and queryable by clinical and research criteria without moving a single file
  • Chaining workflows for end-to-end AI preparation — with KAPPA you can chain a DICOM extraction service with a service that converts images to JPEG for faster AI ingest to accomplish the end-to-end workflow; the result is that imaging data that was previously opaque becomes AI-ready
  • Available in Komprise Intelligent Data Management — KAPPA data services require the full Komprise Intelligent Data Management platform; healthcare organizations that begin with Komprise Analysis or Elastic Data Migration to address storage costs can upgrade to unlock KAPPA, the Global Metadatabase, Deep Analytics, Smart Data Workflows, Sensitive Data Management, and Intelligent AI Ingest
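
To make the header-extraction idea concrete, here is a minimal, standard-library-only sketch of reading a few attributes from a DICOM file. It is not the KAPPA API, whose interface this paper does not describe; `extract_header_tags` and the `WANTED` table are hypothetical names. The parser assumes Explicit VR Little Endian encoding and stops at pixel data or at the first undefined-length element, so a maintained library such as pydicom is the more robust choice for real archives.

```python
import struct
from typing import BinaryIO, Dict

# Explicit-VR value representations that use 2 reserved bytes plus a 4-byte length.
LONG_VRS = {b"OB", b"OD", b"OF", b"OL", b"OV", b"OW", b"SQ",
            b"SV", b"UC", b"UN", b"UR", b"UT", b"UV"}

# Standard DICOM tags (group, element) mapped to attribute names.
WANTED = {
    (0x0008, 0x0020): "StudyDate",
    (0x0008, 0x0060): "Modality",
    (0x0010, 0x0040): "PatientSex",
    (0x0018, 0x0015): "BodyPartExamined",
}
PIXEL_DATA = (0x7FE0, 0x0010)


def extract_header_tags(stream: BinaryIO) -> Dict[str, str]:
    """Return selected string attributes from a DICOM Part 10 stream.

    Assumes Explicit VR Little Endian; stops at pixel data or at the first
    undefined-length element.
    """
    stream.seek(128)  # skip the 128-byte preamble
    if stream.read(4) != b"DICM":
        raise ValueError("not a DICOM Part 10 file")
    found: Dict[str, str] = {}
    while len(found) < len(WANTED):
        head = stream.read(8)
        if len(head) < 8:
            break  # end of stream
        group, element, vr = struct.unpack("<HH2s", head[:6])
        if vr in LONG_VRS:
            raw = stream.read(4)
            if len(raw) < 4:
                break
            (length,) = struct.unpack("<I", raw)
        else:
            (length,) = struct.unpack("<H", head[6:8])
        if (group, element) == PIXEL_DATA or length == 0xFFFFFFFF:
            break
        value = stream.read(length)
        if (group, element) in WANTED:
            # String values are space- or NUL-padded to an even length.
            found[WANTED[(group, element)]] = (
                value.decode("ascii", "replace").strip(" \x00"))
    return found
```

The returned dictionary is the kind of per-file record that could be written to a metadata index; chaining a second step, such as image format conversion, would consume the same file after this one.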

How does Komprise protect PHI in medical imaging archives and reduce ransomware risk for healthcare IT teams?

Healthcare IT teams carry a dual burden that no other industry faces at the same intensity: HIPAA compliance on one side and an escalating ransomware threat on the other. As imaging data accumulates across NAS, VNA, and cloud silos, it drives up storage and backup costs while becoming increasingly difficult to manage, secure, and leverage; at the same time, strict HIPAA requirements, long-term retention mandates, and rising ransomware threats demand greater control and visibility. Komprise addresses both:

  • PHI detection across the full imaging estate — Komprise Sensitive Data Management uses built-in PHI scanners, custom regex, and KAPPA-powered extraction from DICOM headers to detect protected health information across petabyte-scale imaging archives; flagged studies can be automatically moved to protected storage tiers, excluded from research AI workflows, or confined by policy without modifying the underlying PACS or VNA
  • Audit trails for HIPAA compliance — every data classification, movement, and access event is logged with complete lineage; when a HIPAA audit or breach investigation requires demonstrating what data was accessed, when, by which workflow, and from which source, Komprise provides the complete record without manual reconstruction
  • Ransomware defense as a byproduct of tiering — cold imaging studies tiered to immutable object storage destinations — AWS S3 Object Lock, Azure Blob with versioning — are protected even if primary PACS storage is compromised; Komprise identifies and tags sensitive data such as PHI and PII across unstructured data stores without moving files, reducing ransomware risk by shrinking the footprint of unmanaged and exposed data
  • The attack surface shrinks as tiering progresses — by transparently moving cold DICOM studies off primary NAS to immutable object storage, Komprise reduces the volume of imaging data exposed on the primary attack surface by up to 80%; a ransomware event that encrypts primary storage cannot reach the tiered cold archive
  • Komprise is the metadata and orchestration layer for enterprise unstructured AI data — in healthcare this means governing PHI, curating AI-ready imaging datasets, and protecting clinical archives as simultaneous outcomes of a single intelligent data management deployment
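
As an illustration of the custom-regex approach to PHI detection mentioned above, the sketch below scans already-extracted header attributes for values that look like identifiers. The patterns, the `find_phi` name, and the MRN format are hypothetical and deliberately narrow; they do not reflect Komprise's built-in scanners, and real PHI detection needs much broader coverage and validation.

```python
import re
from typing import Dict, List, Mapping

# Hypothetical patterns for illustration only.
PHI_PATTERNS: Dict[str, "re.Pattern[str]"] = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "mrn": re.compile(r"\bMRN[:\s]*\d{6,10}\b", re.IGNORECASE),
    "us_phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}


def find_phi(attributes: Mapping[str, str]) -> Dict[str, List[str]]:
    """Map each attribute name to the labels of the patterns its value matches."""
    hits: Dict[str, List[str]] = {}
    for name, value in attributes.items():
        labels = [label for label, pattern in PHI_PATTERNS.items()
                  if pattern.search(value)]
        if labels:
            hits[name] = labels
    return hits
```

A study with any hit could then be tagged for exclusion from research workflows, while studies with no hits still warrant review, since free-text fields can carry PHI that no fixed pattern anticipates.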

How can healthcare IT teams build a clinical AI data pipeline from existing PACS archives without a secondary data migration or separate AI infrastructure project?

The most common barrier to clinical AI adoption is not the model — it is the data. Ninety percent of healthcare organizations report at least partial implementation of AI tools for medical imaging; clinical AI depends on access to data with precise clinical context; without it the risk of garbage in, garbage out is very real. Komprise eliminates the need for a separate AI data infrastructure project by treating the existing PACS archive as an AI asset rather than a cost burden:

  • The Global Metadatabase spans the full imaging estate — Komprise Intelligent Data Management continuously indexes all DICOM files across PACS, VNA, NAS, and cloud storage, enabling researchers to self-tag and discover relevant datasets for AI, digital pathology, and analytics without IT manually curating data; this unified, queryable metadata layer is the foundation for every subsequent clinical AI workflow
  • Deep Analytics identifies AI-ready cohorts — Deep Analytics queries the Global Metadatabase to find exactly the right imaging studies for any AI use case; a clinical AI team can identify all chest CT scans for male patients over 50 with a specific diagnosis code across petabytes of PACS storage in seconds, reducing millions of files to a precise, governed cohort before any data moves
  • Smart Data Workflows automate end-to-end delivery — once the right dataset is identified, Smart Data Workflows orchestrate KAPPA-based DICOM metadata extraction, sensitive data exclusion, format conversion, and delivery to the AI pipeline automatically; imaging data that was previously opaque becomes AI-ready without manual IT curation on each project
  • Storage cost savings fund AI infrastructure — Komprise enables IT teams to reduce data storage costs while ensuring compliance and preparing high-value datasets for AI through automated classification and workflow-driven data curation; the capacity reclaimed from cold PACS data through intelligent tiering directly funds the AI infrastructure investment that clinical teams need; storage optimization and AI readiness are the same motion
  • No secondary migration required — data tiered by Komprise to cloud object storage is immediately accessible to cloud AI services as native S3 objects; a clinical AI team using AWS SageMaker, Azure AI, or Google Vertex can access tiered PACS data directly at the cloud destination without a secondary migration, conversion, or ETL project; the path from PACS archive to AI pipeline is a single, governed, automated workflow
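
The cohort example above (chest CT scans for male patients over 50 with a specific diagnosis code) is, at its core, a filter over indexed metadata. The sketch below shows that filter over in-memory records. It is not the Deep Analytics query interface; `StudyRecord`, its field names, and `select_cohort` are assumptions made for illustration, and the diagnosis code in the usage note is only a sample value.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class StudyRecord:
    """One indexed imaging study (hypothetical schema)."""
    path: str
    modality: str
    body_part: str
    sex: str
    age_years: int
    diagnosis_code: str


def select_cohort(records: Iterable[StudyRecord], modality: str, body_part: str,
                  sex: str, older_than: int, diagnosis_code: str) -> List[StudyRecord]:
    """Return the records that match every criterion; age is strictly greater than older_than."""
    return [
        r for r in records
        if r.modality == modality
        and r.body_part == body_part
        and r.sex == sex
        and r.age_years > older_than
        and r.diagnosis_code == diagnosis_code
    ]
```

For example, `select_cohort(index, "CT", "CHEST", "M", 50, "J18.9")` would return only the matching studies, so that just those paths are handed to the delivery step instead of the whole archive.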

Learn more about Komprise for Hospitals and Healthcare Systems