Komprise Transparent File Tables
Query-Ready Unstructured Data as Iceberg tables. Zero Files Moved.
Why Komprise for Unstructured Data in AI and Analytics?
| | Without Komprise | With Komprise |
| --- | --- | --- |
| Unstructured Data Sources | Cloud Only | NAS & Cloud: NAS, object, SaaS, cloud at exabyte scale |
| Access Method | Non-Standard: Custom connectors and APIs per tool | Standard: Open Apache Iceberg |
| Data Discovery | None: Manual, complex, costly, per silo | Automated: Query one Global Metadatabase across silos |
| Schema | None: Just raw data | Rich Schema: Komprise enriched |
| Data Movement | Heavy: All raw data ingested | Zero: No raw data moved |
| Data Quality | Poor: Raw files with noise, duplicates, no schema and gaps | High: Curated, classified, enriched metadata |
| Sensitive Data Tagging | None: No standard way to detect or handle | Automated: Built-in PII and RegEx detection and handling |
| Governance | Manual: Fragmented across tools | Automated: Access control and auditing |
| Time to Insight | Slow: Weeks to months to transfer and prepare | Immediate: Query data where it lives |
Query Unstructured Data Without Moving a File
Expose all your NAS and cloud data to Snowflake, Databricks and other lakehouses as Apache Iceberg tables without moving any data:
- Export Komprise Transparent File Tables via Deep Analytics and choose the entire Global Metadatabase or curated subsets.
- Komprise exports just the schema and metadata – no raw files moved.
- Data teams see Komprise tables in Databricks, Snowflake, and other tools. No Komprise expertise needed.
- Patented Transparent Move Technology enables ingest of raw data only when needed.
Deliver Enriched, AI-Ready Data to Every Pipeline
Komprise enriches unstructured data schema with rich metadata, so AI and analytics have high-quality data:
- Komprise automatically discovers and classifies data across silos into the Komprise Global Metadatabase, creating a single searchable source of truth.
- Enrich files with content scanning, header extraction, and sensitive data tagging using KAPPA data services and Smart Data Workflows.
- Curate the subsets you want to expose to AI tools and data lakehouses using Deep Analytics queries.
Govern Access Across Every Data Source
Transparent File Tables carry Komprise governance policies wherever the data goes, so access controls and audit trails follow it into every analytics environment.
- Enforce user access permissions when serving unstructured data to Snowflake, Databricks, or any analytics tool.
- Detect and handle sensitive data including PII and PHI before it reaches any AI or analytics pipeline, supporting HIPAA, GDPR, and enterprise compliance requirements.
- Maintain complete audit trails of who queried what data, when, and from which tool.
Dig Deeper
- Press Release: Deliver the Right Data to the Data Lakehouse. How Transparent File Tables will cut costs and increase accuracy for AI workflows.
- Blog: Introducing Transparent File Tables. Query-ready unstructured data to AI and data lakehouses without moving a single file.
- Video: Data on the Move Discussion
Frequently Asked Questions
What is Komprise Transparent File Tables?
Komprise Transparent File Tables is a capability within Komprise Intelligent Data Management that exposes enterprise unstructured data, stored across NAS, object storage, and cloud, as Apache Iceberg tables queryable directly in Snowflake, Databricks, and other leading analytics and AI platforms. No raw files are moved. Instead, Komprise exports a structured schema built from enriched file metadata, along with a dynamic pointer to each file via patented Transparent Move Technology. Data engineers and analysts work with the Iceberg table as they would any structured dataset, using familiar tools and standard SQL, without any Komprise expertise, new APIs, or changes to existing workflows. If a full file is needed by an AI pipeline, Komprise fetches it on demand at the moment it is required.
The result is that 99% of enterprise unstructured data that has historically been dark to AI and analytics becomes queryable, joinable, and actionable without the cost or complexity of moving petabytes to a destination platform first.
How does Komprise expose unstructured data to Snowflake and Databricks without moving it?
Komprise Transparent File Tables separates the metadata from the raw files and uses that separation to eliminate unnecessary data movement.
The process works in three steps. First, Komprise indexes all file and object data across every storage environment into the Komprise Global Metadatabase. That index captures system metadata (file type, size, age, owner, access patterns) alongside enriched metadata extracted from file content and headers using KAPPA data services and Smart Data Workflows. This turns every file into a structured, searchable record.
Second, a data engineer or IT administrator uses Deep Analytics to query the Global Metadatabase and define exactly which subset of files to expose. That curated query result is then exported as an Apache Iceberg table containing the enriched metadata schema and a dynamic file pointer for each row. No raw files are copied or moved at this stage.
Third, the Iceberg table is registered in Snowflake, Databricks, or any Iceberg-compatible platform. Data teams query it using standard SQL, join it with structured datasets from ERP systems, CRM tools, or other databases, and build dashboards, models, and AI pipelines on top of it. When an AI pipeline calls for the actual file content, Komprise delivers the file on demand from wherever it lives in the original storage environment.
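To make the third step concrete, here is a minimal sketch of what querying a metadata table and joining it with a structured dataset could look like. It is an illustration only: an in-memory SQLite database stands in for an Iceberg-compatible engine, and the table names (`file_metadata`, `projects`), column names, and file pointers are all hypothetical, not the actual Transparent File Tables schema.

```python
import sqlite3

# In-memory SQLite is a stand-in for an Iceberg-compatible query engine.
# It is NOT how Snowflake or Databricks are actually accessed.
conn = sqlite3.connect(":memory:")

# Hypothetical metadata table: one row per file, holding a pointer
# to the file rather than the file content itself.
conn.execute(
    """
    CREATE TABLE file_metadata (
        file_pointer TEXT PRIMARY KEY,
        file_type    TEXT,
        size_bytes   INTEGER,
        owner        TEXT,
        project_id   TEXT
    )
    """
)
conn.executemany(
    "INSERT INTO file_metadata VALUES (?, ?, ?, ?, ?)",
    [
        ("nas://share1/scans/a.dcm", "dcm", 52_000_000, "alice", "P-100"),
        ("nas://share1/scans/b.dcm", "dcm", 48_000_000, "bob", "P-200"),
        ("s3://bucket/reports/c.pdf", "pdf", 1_200_000, "alice", "P-100"),
    ],
)

# Hypothetical structured table, such as one sourced from a CRM or ERP system.
conn.execute("CREATE TABLE projects (project_id TEXT PRIMARY KEY, customer TEXT)")
conn.executemany(
    "INSERT INTO projects VALUES (?, ?)",
    [("P-100", "Acme"), ("P-200", "Globex")],
)


def files_for_customer(customer, file_type):
    """Return pointers of files of one type that belong to one customer."""
    rows = conn.execute(
        """
        SELECT m.file_pointer
        FROM file_metadata AS m
        JOIN projects AS p ON p.project_id = m.project_id
        WHERE p.customer = ? AND m.file_type = ?
        ORDER BY m.file_pointer
        """,
        (customer, file_type),
    ).fetchall()
    return [row[0] for row in rows]


acme_scans = files_for_customer("Acme", "dcm")
```

The query touches only metadata rows. In this sketch, the returned pointers are what a pipeline would later use to request file content.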
What is Apache Iceberg and why does Komprise use it?
Apache Iceberg is an open-source, vendor-neutral table format originally developed at Netflix and now one of the most widely adopted open standards for large-scale analytics. It brings database-level reliability and query performance to data stored in cloud object storage and data lakes, supporting ACID transactions, schema evolution, and time travel queries. Snowflake, Databricks, AWS, Google Cloud, and most major analytics platforms support Apache Iceberg natively.
Komprise uses Apache Iceberg because it is the most interoperable and open choice available for connecting unstructured data to the analytics and AI platforms where enterprise data teams already work. By exporting Transparent File Tables in Iceberg format, Komprise gives data engineers standard, engine-agnostic access to unstructured data without requiring proprietary connectors, new APIs, or vendor lock-in. The same table can be queried from Snowflake, Databricks, or any other Iceberg-compatible tool without modification.
Source: Apache Iceberg documentation, Apache Software Foundation
How does Komprise Transparent File Tables differ from traditional ETL for unstructured data?
Traditional ETL approaches were designed for structured data. When applied to unstructured data, they require the entire raw dataset to be copied to a destination platform before any analysis can begin. At petabyte scale, that means weeks or months of transfer time, significant egress and destination storage costs, and a copy that is immediately stale and requires ongoing synchronization. The result is a process that is expensive, slow, and fundamentally mismatched to the size and diversity of enterprise unstructured data.
Komprise Transparent File Tables inverts that model. Rather than moving raw files to where the analytics tools are, Komprise exports only the metadata schema as a structured Iceberg table and uses dynamic file pointers to make the original files accessible on demand from where they already live. There is no bulk data movement at export time. There is no destination storage cost for petabytes of raw files. There is no ETL pipeline to maintain per data source, because the Global Metadatabase provides a single consolidated view across every NAS and cloud storage environment.
The practical difference for data teams is the ability to query, join, and analyze unstructured data in Snowflake or Databricks within hours rather than months, and to fetch only the specific files an AI pipeline actually needs rather than ingesting everything upfront.
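The contrast between bulk ingestion and on-demand retrieval can be sketched with a toy model. Everything here is an assumption made for illustration: `SourceStorage`, `bulk_ingest`, and `fetch_on_demand` are hypothetical names, the file sizes are invented, and the code does not represent how Transparent Move Technology is implemented.

```python
from dataclasses import dataclass


@dataclass
class SourceStorage:
    """Toy stand-in for the original NAS or cloud storage (hypothetical)."""

    files: dict  # maps file pointer -> raw bytes
    bytes_transferred: int = 0

    def fetch(self, pointer: str) -> bytes:
        # Count every byte that leaves the source system.
        data = self.files[pointer]
        self.bytes_transferred += len(data)
        return data


def bulk_ingest(storage):
    """ETL-style approach: copy every raw file before any analysis."""
    return {pointer: storage.fetch(pointer) for pointer in storage.files}


def fetch_on_demand(storage, metadata_rows, predicate):
    """Metadata-first approach: filter on metadata, then fetch only matches."""
    return {
        row["file_pointer"]: storage.fetch(row["file_pointer"])
        for row in metadata_rows
        if predicate(row)
    }


# Invented sample data: three files of 1,000, 2,000 and 4,000 bytes.
files = {
    "nas://share/a.txt": b"x" * 1000,
    "nas://share/b.txt": b"y" * 2000,
    "nas://share/c.log": b"z" * 4000,
}
metadata_rows = [
    {"file_pointer": pointer, "file_type": pointer.rsplit(".", 1)[-1]}
    for pointer in files
]

bulk_store = SourceStorage(dict(files))
bulk_ingest(bulk_store)

lazy_store = SourceStorage(dict(files))
needed = fetch_on_demand(
    lazy_store, metadata_rows, lambda row: row["file_type"] == "log"
)
```

In this toy run the bulk path transfers all 7,000 bytes, while the on-demand path transfers only the 4,000 bytes of the single file the filter selects.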
What governance controls apply when querying data through Komprise Transparent File Tables?
Governance is built into Komprise Transparent File Tables from the point of export, not added afterward.
Access permissions are enforced at the Komprise level before any table is created. Data teams only see files they are authorized to access based on existing storage permissions, which are reflected in the Iceberg table and carried through into Snowflake, Databricks, and any other downstream platform.
Sensitive data handling is addressed before export. Smart Data Workflows, combined with the 68 pre-built content scanners in Komprise Sensitive Data Management, detect files containing PII, PHI, and other regulated content across the unstructured data estate. Sensitive files can be excluded from Transparent File Tables by policy, tagged and quarantined, or routed to governed storage before any table is exported. This supports HIPAA, GDPR, and enterprise compliance requirements without requiring manual review at petabyte scale.
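As a rough illustration of regex-based detection followed by a policy decision, the sketch below tags records and keeps tagged ones out of the export set. The two patterns, the `tag_sensitive` and `apply_policy` helpers, and the record layout are hypothetical simplifications. They are not the scanners in Komprise Sensitive Data Management, and real PII detection needs far more robust rules.

```python
import re

# Hypothetical, deliberately simple patterns for illustration only.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}


def tag_sensitive(text):
    """Return the sorted names of all patterns found in the text."""
    return sorted(name for name, pattern in PATTERNS.items() if pattern.search(text))


def apply_policy(records):
    """Split records into those safe to export and those to quarantine."""
    exported, quarantined = [], []
    for record in records:
        tags = tag_sensitive(record["content"])
        row = {"file_pointer": record["file_pointer"], "sensitive_tags": tags}
        # Policy assumed here: any sensitive tag keeps the file out of the export.
        (quarantined if tags else exported).append(row)
    return exported, quarantined


# Invented sample records.
records = [
    {"file_pointer": "nas://hr/form.txt", "content": "SSN 123-45-6789, mail jane@example.com"},
    {"file_pointer": "nas://fin/summary.txt", "content": "Quarterly revenue summary"},
]
exported, quarantined = apply_policy(records)
```

Excluding by policy is only one of the options the text lists. The same tags could instead drive routing to governed storage.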
Audit trails are maintained throughout. Every query, export, and file retrieval via Transparent Move Technology is logged, giving IT and compliance teams full visibility into who accessed what data, from which tool, and when.
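One way to picture an audit trail that answers "who accessed what, from which tool, and when" is an append-only list of event records. The `AuditEvent` and `AuditLog` classes below are hypothetical and only show the shape such records might take. They do not describe Komprise's actual logging format.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEvent:
    """One immutable audit record (hypothetical layout)."""

    user: str
    action: str  # for example "query", "export", or "retrieve"
    target: str  # table name or file pointer
    tool: str
    at: datetime


class AuditLog:
    """Append-only, in-memory log used purely for illustration."""

    def __init__(self):
        self._events = []

    def record(self, user, action, target, tool, at=None):
        # Default to the current UTC time when no timestamp is supplied.
        event = AuditEvent(user, action, target, tool, at or datetime.now(timezone.utc))
        self._events.append(event)
        return event

    def by_user(self, user):
        """Return all events for one user, in the order they were recorded."""
        return [event for event in self._events if event.user == user]


# Invented sample events.
log = AuditLog()
log.record("alice", "query", "file_metadata", "Snowflake")
log.record("alice", "retrieve", "nas://share1/scans/a.dcm", "Databricks")
log.record("bob", "export", "curated_subset", "Deep Analytics")
alice_events = log.by_user("alice")
```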
This governance model means that regulated industries, including pharmaceuticals, life sciences, genomics, financial services, and healthcare, can use Transparent File Tables for AI and analytics workflows with the controls their compliance teams require already in place.