Amazon Glacier (AWS Glacier)


What is Amazon S3 Glacier (AWS Glacier)?

Amazon S3 Glacier, also known as AWS Glacier, is a class of cloud storage available through Amazon Web Services (AWS). It is a lower-cost storage tier designed for data archiving and long-term backup on public cloud infrastructure.

Amazon S3 Glacier was created to house data that doesn’t need to be accessed frequently or quickly. This makes it ideal for use as a cold storage service, hence the inspiration for its name.

Amazon S3 Glacier retrieval times range from a few minutes to a few hours with three different speed options available: Expedited (1-5 minutes), Standard (3-5 hours), and Bulk (5-12 hours).

Amazon S3 Glacier Deep Archive offers 12-48-hour retrieval times. The faster retrieval options are significantly more expensive, so having your data organized into the correct tier within AWS cloud storage is an important aspect of keeping storage costs down.
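As a rough illustration (not part of the AWS material above), the retrieval speed is simply a parameter on the restore request. The sketch below uses boto3 with hypothetical bucket and key names and assumes the object already sits in the S3 Glacier Flexible Retrieval storage class:

```python
import boto3

s3 = boto3.client("s3")

# Initiate a temporary restore of an archived object. The 'Tier' value selects
# the speed/cost trade-off described above:
#   'Expedited' (~1-5 minutes), 'Standard' (~3-5 hours), 'Bulk' (~5-12 hours).
s3.restore_object(
    Bucket="example-archive-bucket",          # hypothetical bucket name
    Key="projects/2021/results.tar.gz",       # hypothetical object key
    RestoreRequest={
        "Days": 7,                            # keep the restored copy for 7 days
        "GlacierJobParameters": {"Tier": "Standard"},
    },
)

# Poll the object's restore status; the 'Restore' header reports
# ongoing-request="true" while the retrieval job is still running.
head = s3.head_object(Bucket="example-archive-bucket",
                      Key="projects/2021/results.tar.gz")
print(head.get("Restore", "no restore in progress"))
```

Expedited retrievals carry the highest per-GB cost, which is why Standard or Bulk retrievals are usually preferred for planned, large recalls.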

Other Glacier features:
  • The ability to store an unlimited number of objects and data
  • Data stored in S3 Glacier is dispersed across multiple geographically separated Availability Zones within the AWS region
  • An average annual durability of 99.999999999%
  • Checksums on uploads to validate data integrity
  • REST-based web service
  • Vault, Archive, and Job data models (see the sketch after this list)
  • A limit of 1,000 vaults per AWS account per Region
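To make the vault, archive, and checksum bullets concrete, here is a minimal, hedged sketch using the boto3 Glacier (vault) client. The vault name and file are hypothetical, and it relies on boto3's default behavior of computing the SHA-256 tree hash when no checksum is passed:

```python
import boto3

# The original S3 Glacier (vault) API organizes data as vaults, archives and jobs.
glacier = boto3.client("glacier")

# Create a vault (subject to the per-account, per-Region vault limit noted above).
glacier.create_vault(vaultName="example-backup-vault")   # hypothetical name

# Upload an archive. If no checksum is supplied, boto3 computes the SHA-256
# tree hash required by the service and sends it with the request, so the
# upload is rejected if the payload was corrupted in transit.
with open("backup-2021-11.tar.gz", "rb") as body:        # hypothetical file
    response = glacier.upload_archive(
        vaultName="example-backup-vault",
        archiveDescription="Monthly backup, November 2021",
        body=body,
    )

print("Archive ID:", response["archiveId"])
```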

Main Applications for Amazon S3 Glacier Storage

There are several scenarios where Glacier is an ideal solution for companies needing a large volume of cloud storage.

  1. Huge data sets. Many companies that perform trend or scientific analysis need a huge amount of storage to be able to house their training, input, and output data for future use.
  2. Replacing legacy storage infrastructure. With the many advantages that cloud-based storage environments have over traditional storage infrastructure, many corporations are opting to use AWS storage to get more out of their data storage systems. AWS Glacier is often used as a replacement for long-term tape archives.
  3. Healthcare facilities’ patient data. Patient data needs to be kept for regulatory or compliance requirements. Glacier and Glacier Deep Archive are ideal archiving platforms for data that will rarely need to be accessed.
  4. Cold data with long retention times. Finance, research, genomics, electronic design automation (EDA), and media and entertainment are examples of industries where cold data and inactive projects may need to be retained for long periods of time even though they are not actively used. AWS Glacier storage classes are a good fit for these types of data. Project data will need to be recalled before it is actively used to minimize retrieval delays and costs.

Amazon S3 Glacier vs S3 Standard

Amazon’s S3 Standard storage and S3 Glacier are different classes of storage designed to handle workloads on the AWS cloud storage platform.

  • S3 Glacier is best for cold data that’s rarely or never accessed
  • Amazon S3 Standard storage is intended for hot and warm data that needs to be accessed daily and quickly

The speed and accessibility of S3 Standard storage come at a much higher cost compared to S3 Glacier and the even more economical S3 Glacier Deep Archive storage tiers. Having the right data management solution is critical to help you identify and organize your hot and cold data into the correct storage tiers, saving a substantial amount on storage costs.

Benefits of a Data Management System to Optimize Amazon S3 Glacier

A comprehensive suite of unstructured data management and unstructured data migration capabilities allows organizations to reduce their data storage footprint and substantially cut their storage costs. These are a few of the benefits of integrating an analytics-driven data management solution like Komprise Intelligent Data Management with your AWS storage:

Get full visibility of your AWS and other storage data

Across AWS and other cloud platforms, understand how much NAS data is accruing and whether it’s hot or cold, so you can make better data storage investment and data mobility decisions.

Intelligent tiering and life cycle management for AWS storage

Optimize and improve how you manage files and objects across Amazon EFS, Amazon FSx, S3 Standard and S3 Glacier storage classes based on access patterns.

Intelligent AWS data retrievals

Don’t get hit with unexpected data retrieval fees on S3 Glacier – Komprise enables intelligent recalls based on access patterns so if an object on Glacier becomes active again, Komprise will move it up to an S3 storage class.

Bulk retrievals for improved AWS user performance

Improve performance across entire projects from S3 Glacier storage classes – if an archived project is going to become active, you can prefetch and retrieve the entire project from S3 Glacier using Komprise so users don’t have to face long latencies to get access to the data they need.

Minimize AWS storage costs

With analytics-driven cloud data management that monitors retrieval, egress and other costs and minimizes them by intelligently promoting and recalling data to more active storage classes.

Access AWS data natively

Access data that has been moved across AWS as objects from Amazon S3 storage classes or as files from File and NAS storage classes without the need for additional stubs or agents.

Reduce AWS cloud storage complexity

Reduce the complexity of your cloud storage and NAS environment and manage your data more easily through an intuitive dashboard.

Optimize the AWS storage savings

Komprise Intelligent Data Management allows you to better manage all the complex data storage, retrieval, egress and other costs. Know first. Move smart. Take control.

Easy, on-demand scalability

Komprise provides you with the capacity to add and manage petabytes without limits or the need for dedicated infrastructure.

Integrate data lifecycle management

Integrate easily with an AWS Advanced Tier partner such as Komprise for lifecycle management or other use cases.

Move data transparently to any tier within AWS

Your users won’t experience any difference in terms of data access. You’ll notice a huge difference in cost savings and unstructured data value with Komprise.

Create automated data management policies and data workflows

Continuously manage the lifecycle of the moved data for maximum savings. Build Smart Data Workflows to deliver the right data to the right teams, applications, cloud services, AI/ML engines, etc. at the right time.

Streamline Amazon S3 Glacier Operations with Komprise Intelligent Data Management

Komprise’s Intelligent Data Management allows you to seamlessly analyze and manage data across all of your AWS cloud storage classes so you can move data across file, S3 Standard and S3 Glacier storage classes at the right time for the best price/performance. Because it’s vendor agnostic, its standards-driven analytics and data management work with the largest storage providers in the industry and have helped companies save up to 50% on their cloud storage costs.

If you’re looking to get more out of your AWS storage, contact a data management expert at Komprise today and see how much you could save on data storage costs. Read the white paper: Smart Data Migration for AWS.



Amazon S3 Glacier Instant Retrieval

Amazon S3 Glacier Instant Retrieval is an archive storage class that was introduced in November 2021. According to Amazon, it delivers the lowest-cost archive storage with millisecond retrieval for rarely accessed data.
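As a small illustration (hypothetical bucket, key and file names), an object can be written directly into the Glacier Instant Retrieval class at upload time using the S3 API:

```python
import boto3

s3 = boto3.client("s3")

# Write a rarely accessed object directly into the Glacier Instant Retrieval
# class; it stays retrievable in milliseconds, unlike Flexible Retrieval or
# Deep Archive, but at a lower storage price than S3 Standard.
with open("scan-0042.dcm", "rb") as body:      # hypothetical local file
    s3.put_object(
        Bucket="example-archive-bucket",       # hypothetical bucket
        Key="imaging/scan-0042.dcm",           # hypothetical key
        Body=body,
        StorageClass="GLACIER_IR",
    )
```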

Komprise works closely with AWS to ensure enterprise customers have visibility into data across storage environments. With analytics-driven unstructured data management, Komprise places data in the right storage class: hot data on high-performance managed file services in AWS and cold data on lower-cost object storage classes such as Amazon S3 Glacier Instant Retrieval and Amazon S3 Standard-Infrequent Access.

Learn more about Amazon S3 Storage Classes.

Learn more about Komprise for AWS.


Amazon Tiering

What is Amazon Tiering?

Amazon Web Services (AWS) offers several storage services that support data tiering based on different storage classes. These data storage classes allow customers to optimize their storage costs and performance by choosing the most suitable option for their data based on its access patterns and durability requirements.

Learn more about Komprise file and object data migration, data tiering and ongoing data management.

AWS Storage Tiering Options

Amazon S3 Storage Classes: Amazon Simple Storage Service (S3) provides multiple storage classes to accommodate different data access patterns and cost requirements:

  • Standard: This is the default storage class for S3 and offers high durability, availability, and performance for frequently accessed data.
  • Intelligent-Tiering: This storage class automatically moves objects between two access tiers (frequent access and infrequent access) based on their usage patterns. It optimizes costs by automatically transitioning objects to the most cost-effective tier.
  • Standard-IA (Infrequent Access): This storage class is suitable for data that is accessed less frequently but still requires rapid access when needed. It offers lower storage costs compared to the Standard class.
  • One Zone-IA: Similar to Standard-IA, but the data is stored in a single Availability Zone, which provides a lower-cost option for customers who don’t require data redundancy across multiple zones.
  • S3 Glacier Instant Retrieval, S3 Glacier Flexible Retrieval, and S3 Glacier Deep Archive: These storage classes are designed for long-term archival and data retention. Data stored in S3 Glacier Flexible Retrieval is accessible within minutes to hours, while Glacier Deep Archive is for data with retrieval times of 12 hours or more (a lifecycle-policy sketch for moving data between classes follows this list).
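The sketch below shows one way transitions between these S3 classes can be automated with an S3 lifecycle policy via boto3. The bucket name, prefix and day thresholds are illustrative assumptions, not recommendations:

```python
import boto3

s3 = boto3.client("s3")

# A hypothetical lifecycle policy that tiers objects down as they age:
# Standard -> Standard-IA after 30 days, -> Glacier Flexible Retrieval after
# 90 days, -> Deep Archive after 365 days, matching the classes listed above.
s3.put_bucket_lifecycle_configuration(
    Bucket="example-archive-bucket",               # hypothetical bucket
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "tier-down-cold-data",
                "Status": "Enabled",
                "Filter": {"Prefix": "projects/"},  # hypothetical prefix
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 90, "StorageClass": "GLACIER"},
                    {"Days": 365, "StorageClass": "DEEP_ARCHIVE"},
                ],
            }
        ]
    },
)
```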

Amazon EBS Volume Types: Amazon Elastic Block Store (EBS) provides different volume types for block storage in AWS. While not strictly tiering, these volume types offer varying performance characteristics and costs (see the sketch after this list):

  • General Purpose SSD (gp2/gp3): These volume types provide a balance of price and performance for a wide range of workloads; gp3 is the current generation.
  • Provisioned IOPS SSD (io1/io2): These volume types are designed for applications that require high I/O performance and consistent low-latency access to data.
  • Throughput Optimized HDD (st1): This volume type offers low-cost storage optimized for large, sequential workloads that require high throughput.
  • Cold HDD (sc1): This volume type provides the lowest-cost storage for infrequently accessed workloads with large amounts of data.
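As referenced in the list above, the volume type is chosen when a volume is created. The boto3 sketch below uses hypothetical sizes, Availability Zones and performance values:

```python
import boto3

ec2 = boto3.client("ec2")

# A Throughput Optimized HDD volume for large, sequential workloads.
volume = ec2.create_volume(
    AvailabilityZone="us-east-1a",   # hypothetical zone
    Size=500,                        # GiB
    VolumeType="st1",
)
print("Created volume:", volume["VolumeId"])

# A gp3 volume with explicitly provisioned performance.
fast = ec2.create_volume(
    AvailabilityZone="us-east-1a",
    Size=100,
    VolumeType="gp3",
    Iops=6000,                       # provisioned IOPS (gp3/io1/io2 only)
    Throughput=500,                  # MiB/s (gp3 only)
)
print("Created volume:", fast["VolumeId"])
```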

Amazon S3 Glacier and Glacier Deep Archive: These are the storage classes within Amazon S3 designed specifically for long-term data archival and retention. The retrieval times are longer compared to other storage classes, but they offer significantly lower storage costs for data that is rarely accessed.

Amazon tiering options are designed to help AWS customers effectively manage their data storage costs and performance based on the specific requirements of their workloads and data access patterns.

Komprise Intelligent Data Management for AWS

Komprise is an AWS Migration and Modernization Competency partner, working closely with AWS teams to follow best practices and support cloud data storage services including Amazon EFS, Amazon FSx and Amazon S3 (including the Amazon S3 Glacier Flexible Retrieval and Glacier Instant Retrieval storage classes). The Komprise analytics-driven SaaS platform lets customers analyze, mobilize and manage their file and object data in AWS, enabling enterprise customers to:

  • Understand AWS NAS & Object Data Usage and Growth
  • Estimate ROI of AWS Data Storage
  • Migrate Smarter to Amazon FSx for NetApp ONTAP
  • Easily Integrate AWS Data Lifecycle Management
  • Access Moved Data as Files Without Stubs or Agents
  • Gain Native Data Access in the AWS Cloud Without Storage Vendor Lock-In
  • Rapidly Migrate Object Data Into AWS Storage
  • Reduce AWS Unstructured Data Complexity
  • Scale On-Demand with Modern, SaaS Architecture


Learn more about Komprise Intelligent Data Management for AWS Storage.


AWS Storage

What is AWS Cloud Storage?

The AWS cloud service has a full range of options for individuals and enterprises to store, access and analyze data. AWS offers options across all three types of cloud data storage: object storage, file storage and block storage.

Here are the AWS storage choices:

  • Amazon Simple Storage Service (S3): S3 is a popular AWS service that provides scalable and highly durable object storage in the cloud.
  • AWS Glacier: Glacier provides low-cost highly durable archive storage in the cloud. It’s best for cold data as access times can be slow.
  • Amazon Elastic File System (Amazon EFS): EFS provides scalable network file storage for Amazon EC2 instances.
  • Amazon Elastic Block Store (Amazon EBS): This service provides low-latency block storage volumes for Amazon EC2 instances.
  • Amazon EC2 Instance Storage: An instance store is ideal for temporary storage of information that changes frequently, such as buffers, caches and scratch data, and consists of one or more instance store volumes exposed as block devices.
  • AWS Storage Gateway: This is a hybrid storage option that integrates on-premises storage with cloud storage. It can be hosted on a physical or virtual server.
  • AWS Snowball: This data migration service transports large amounts of data to and from the cloud and includes an appliance that’s installed in the on-premises data center.


Each of these AWS storage services has several tiers or classes at different price points – so it is important to put the right data in the right storage class at the right time to optimize price and performance.

Komprise Intelligent Data Management for AWS Storage

Komprise helps organizations get more value from their AWS storage investments while protecting data assets for future use through analysis and intelligent data migration and cloud data tiering.


Learn more at Komprise for AWS.


Cloud Cost Optimization

Cloud cost optimization is a process to reduce operating costs in the cloud while maintaining or improving the quality of cloud services. It involves identifying and addressing areas to reduce the use of cloud resources, select more cost-effective cloud services, or deploy better management practices, including data management.

The cloud is highly flexible and scalable, but it also involves ongoing and sometimes hidden costs, including usage fees, egress fees, storage costs, and network fees. If not managed properly, these costs can quickly become a significant burden for organizations.

In one of our 2023 data management predictions posts, we noted:

Managing the cost and complexity of cloud infrastructure will be Job No. 1 for enterprise IT in 2023. Cloud spending will continue, although at perhaps a more measured pace during uncertain economic times. What will be paramount is to have the best data possible on cloud assets to make sound decisions on where to move data and how to manage it for cost efficiency, performance, and analytics projects. Data insights will also be important for migration planning, spend management (FinOps), and to meet governance requirements for unstructured data management. These are the trends we’re tracking for cloud data management, which will give IT directors precise guidance to maximize data value and minimize cloud waste.

Source: ITPro-Today

Steps to Optimize Cloud Costs

To optimize cloud costs, organizations can take several steps, including:

  • Right-sizing: Choose the correct size and configuration of cloud resources to meet the needs of the application, avoiding overprovisioning or underprovisioning.
  • Resource utilization: Monitor the use of cloud resources to reduce waste and improve cost efficiency.
  • Cost allocation: Implement cost allocation and tracking practices to better understand cloud costs and improve accountability.
  • Reserved instances: Use reserved instances to reduce costs by committing to a certain level of usage for a longer term.
  • Cost optimization tools: These tools identify areas for savings and help manage cloud expenses (see the sketch after this list).
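As an example of the kind of visibility a cost optimization effort starts from, the AWS Cost Explorer API can break monthly spend down by service. The date range below is a placeholder:

```python
import boto3

# Break one month of spend down by service to spot storage-heavy line items.
ce = boto3.client("ce")

response = ce.get_cost_and_usage(
    TimePeriod={"Start": "2023-01-01", "End": "2023-02-01"},  # placeholder dates
    Granularity="MONTHLY",
    Metrics=["UnblendedCost"],
    GroupBy=[{"Type": "DIMENSION", "Key": "SERVICE"}],
)

for group in response["ResultsByTime"][0]["Groups"]:
    service = group["Keys"][0]
    amount = float(group["Metrics"]["UnblendedCost"]["Amount"])
    if amount > 0:
        print(f"{service}: ${amount:,.2f}")
```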

The Challenge of Managing Cloud Data

Managing cloud data costs takes significant manual effort, multiple tools, and constant monitoring. As a result, companies are using less than 20% of the cloud cost-saving options available to them. “Bucket sprawl” makes matters worse, as users easily create accounts and buckets and fill them with data—some of which is never accessed again.

When trying to optimize cloud data, cloud administrators contend with poor visibility and complexity of data management:

  • How can you know your cloud data?
  • How fast is cloud data growing and who’s using it?
  • How much is active vs. how much is cold?
  • How can you dig deeper to optimize across object sizes and storage classes?

How can you make managing data and costs easier?

  • It’s hard to decipher complicated cost structures.
  • You need more information to manage data better, e.g., when was an object last accessed?
  • Factoring in multiple billable dimensions and costs is extremely complex: storage, access, retrievals, API calls, transitions, initial transfer, and minimum storage-duration costs.
  • There are unexpected costs when moving data across different storage classes (e.g., Amazon S3 Standard to S3 Glacier). If access isn’t continually monitored and data is not moved back up when it gets hot, you will face expensive retrieval fees.

These issues are further compounded as enterprises move toward a multicloud approach and require a single set of tools, policies, and workflows to optimize and manage data residing within and across clouds.

Komprise Cloud Data Management

Reduce cloud storage costs by more than 50% with Komprise.

Cloud providers offer a range of storage services. Generally, there are storage classes with higher performance and costs for hot and warm data, such as Amazon S3 Standard and S3 Standard-IA, and there are storage classes with much lower performance and costs that are appropriate for cold data, such as S3 Glacier and S3 Glacier Deep Archive. Data access and retrieval fees for the lower-cost storage classes are much higher than those of the higher-performance, higher-cost storage classes. To maximize savings, you need an automated unstructured data management solution that takes data access patterns into account to dynamically and cost-optimally move data across storage classes (e.g., Amazon S3 Standard to S3 Standard-IA, or S3 Standard-IA to S3 Glacier) and across multi-vendor storage services (e.g., NetApp Cloud Volumes ONTAP to Amazon S3 Standard to S3 Standard-IA to S3 Glacier to S3 Glacier Deep Archive). While cloud providers offer some limited, manual data movement through Object Lifecycle Management policies based on modified times, or through intelligent tiering, these approaches offer limited savings and involve hidden costs.

Komprise automates full lifecycle management across multi-vendor cloud storage classes using intelligence from data usage patterns to maximize your savings without heavy lifting. Read the white paper to see how you can save 50% or more on cloud storage costs.

Watch the video: How to save costs and manage your multi-cloud data


Cloud Data Management


What is Cloud Data Management?

Cloud data management is a way to manage data across cloud platforms, either alongside or instead of on-premises storage. A popular form of data storage management, its goal is to curb rising cloud data storage costs, but it can be a complicated pursuit, which is why many businesses employ an external company offering cloud data management services, with the primary goal being cloud cost optimization.

Cloud data management is emerging as an alternative to data management using traditional on-premises software. The benefit of employing a top cloud data management company is that instead of buying on-premises data storage resources and managing them, resources are bought on demand in the cloud. This cloud data management services model for cloud data storage allows organizations to receive dedicated data management resources on an as-needed basis. Cloud data management also involves finding the right data in on-premises storage and moving this data to the cloud through data archiving, data tiering, data replication and data protection, or data migration.

Advantages of Cloud Data Management

How do you manage cloud storage? According to two 2023 surveys, 94% of respondents say they’re wasting money in the cloud, 69% say that data storage accounts for over one quarter of their company’s cloud costs, and 94% say that cloud storage costs are rising. Optimal unstructured data management in the cloud provides four key capabilities that help with managing cloud storage and reduce your cloud data storage costs:

  1. Gain Accurate Visibility Across Cloud Accounts into Actual Usage
  2. Forecast Savings and Plan Data Management Strategies for Cloud Cost Optimization
  3. Cloud Tiering and Archiving Based on Actual Data Usage to Avoid Surprises
    • For example, using last-accessed time rather than last-modified time provides a more reliable prediction of which objects will be accessed in the future, which avoids costly archiving errors (see the sketch after this list).
  4. Radically Simplify Cloud Migrations
    • Easily pick your source and destination
    • Run dozens or hundreds of migrations in parallel
    • Reduce the babysitting
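The sketch below, referenced from item 3 above, illustrates why modified time alone is a weak signal: an S3 listing only exposes LastModified, so a purely modified-time scan (shown here with boto3 and a hypothetical bucket) can misclassify frequently read objects as cold:

```python
import boto3
from datetime import datetime, timezone, timedelta

s3 = boto3.client("s3")
cutoff = datetime.now(timezone.utc) - timedelta(days=180)

# S3 listings expose only LastModified, not last-accessed time, which is the
# limitation described above: an object written once long ago but read daily
# still looks "cold" here. Access-based tiering needs access telemetry (e.g.,
# S3 server access logs) or an external data management layer.
cold_bytes = 0
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket="example-archive-bucket"):   # hypothetical
    for obj in page.get("Contents", []):
        if obj["LastModified"] < cutoff:
            cold_bytes += obj["Size"]

print(f"Data not modified in 180 days: {cold_bytes / 1e12:.2f} TB")
```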


The many benefits of cloud data management services include speeding up technology deployment and reducing system maintenance costs; they can also provide increased flexibility to help meet changing business requirements.

Challenges Faced with Enterprise Cloud Data Management

But, like other cloud computing technologies, enterprise cloud data management services can introduce challenges – for example, data security concerns related to sending sensitive business data outside the corporate firewall for storage. Another challenge is the disruption to existing users and applications that may be using file-based applications on premises, since the cloud is predominantly object-based.

Cloud data management service solutions should provide you with options to eliminate this disruption by transparently moving and managing data across common formats such as file and object.

Komprise Intelligent Data Management

Features of a Cloud Data Management Services Platform

Some common features and capabilities cloud data management solutions should deliver:

  • Data Analytics: Can you get a view of all your cloud data, how it’s being used, and how much it’s costing you? Can you get visibility into on-premises data that you wish to migrate to the cloud? Can you understand where your costs are so you know what to do about them?
  • Planning and Forecasting: Can you set policies for how data should get moved, either from one cloud storage class to another or from on-premises storage to the cloud? Can you project your savings? Does this account for hidden fees like retrieval and egress costs?
  • Policy-based data archiving, data replication, and data management: How much babysitting do you have to do to move and manage data? Do you have to tell the system every time something needs to be moved, or does it have policy-based intelligent automation?
  • Fast, Reliable Cloud Data Migration: Does the system support migrating on-premises data to the cloud? Does it handle going over a Wide Area Network? Does it handle your permissions and access controls and preserve the security of data both while it’s moving the data and in the cloud?
  • Intelligent Cloud Archiving, Intelligent Tiering and Data Lifecycle Management: Does the solution enable you to manage the ongoing data lifecycle in the cloud? Does it support the different cloud storage classes (e.g., high-performance options like file and cloud NAS and cost-efficient options like Amazon S3 and S3 Glacier)?

In practice, the design and architecture of a cloud varies among cloud providers. Service Level Agreements (SLAs) represent the contracts that capture the agreed-upon guarantees between a service provider and its customers.

It is important to consider that cloud administrators are responsible for factoring in:

  • Multiple billable dimensions and costs: storage, access, retrievals, API calls, transitions, initial transfer, and minimum storage-duration costs
  • Unexpected costs of moving data across different storage classes. Unless access is continually monitored and data is moved back up when it gets hot, you’ll face expensive retrieval fees.

This complexity is the reason why fewer than 20% of organizations are leveraging the cost-saving options available to them in the cloud.

How do Cloud Data Management Services Tools work?

As more enterprise data runs on public cloud infrastructure, many different types of tools and approaches to cloud data management have emerged. The initial focus has been on migrating and managing structured data in the cloud. Cloud data integration, ETL (extraction, transformation and loading), and iPaaS (integration platform as a service) tools are designed to move and manage enterprise applications and databases in the cloud. These tools typically move and manage bulk or batch data or real time data.

Cloud-based analytics and cloud data warehousing have emerged for analyzing and managing hybrid and multi-cloud structured and semi-structured data, such as Snowflake and Databricks.

In the world of unstructured data storage and backup technologies, cloud data management has been driven by the need for cost visibility, cost reduction, cloud cost optimization and optimizing cloud data. As file-level tiering has emerged as a critical component of an intelligent data management strategy and more file data migrates to the cloud, cloud data management is evolving from cost management to automation and orchestration, governance and compliance, performance monitoring, and security. Even so, spend management continues to be a top priority for any enterprise IT organization migrating application and data workloads to the cloud.

What are the challenges faced with Cloud Data Management security?

Most of the cloud data management security concerns are related to general cloud computing security questions organizations face. It’s important to evaluate the strengths and security certifications of your cloud data management vendor as part of your overall cloud strategy.

Is adoption of Cloud Data Management services growing?

As enterprise IT organizations are increasingly running hybrid, multi-cloud, and edge computing infrastructure, cloud data management services have emerged as a critical requirement. Look for solutions that are open, cross-platform, and ensure you always have native access to your data. Visibility across silos has become a critical need in the enterprise, but it’s equally important to ensure data does not get locked into a proprietary solution that will disrupt users, applications, and customers. The need for cloud native data access and data mobility should not be underestimated. In addition to visibility and access, cloud data management services must enable organizations to take the right action in order to move data to the right place and the right time. The right cloud data management solution will reduce storage, backup and cloud costs as well as ensure a maximum return on the potential value from all enterprise data.

How is Enterprise Cloud Data Management Different from Consumer Systems?

While consumers need to manage cloud storage, it is usually a matter of capacity across personal storage and devices. Enterprise cloud data management involves IT organizations working closely with departments to build strategies and plans that will ensure unstructured data growth is managed and data is accessible and available to the right people at the right time.

Enterprise IT organizations are increasingly adopting cloud data management solutions to understand how cloud (typically multi-cloud) data is growing and manage its lifecycle efficiently across all of their cloud file and object storage options.

Analyzing and Managing Cloud Storage with Komprise

  • Get accurate analytics across clouds with a single view across all your users’ cloud accounts and buckets and save on storage costs with an analytics-driven approach.
  • Forecast cloud cost optimization by setting different data lifecycle policies based on your own cloud costs.
  • Establish policy-based multi-cloud lifecycle management by continuously moving objects by policy across storage classes transparently (e.g., Amazon S3 Standard, Standard-IA, Glacier, Glacier Deep Archive).
  • Accelerate cloud data migrations with fast, efficient data migrations across clouds (e.g., AWS, Azure, Google and Wasabi) and even on-premises (ECS, IBM COS, Pure FlashBlade).
  • Deliver powerful cloud-to-cloud data replication by running, monitoring, and managing hundreds of migrations faster than ever at a fraction of the cost with Elastic Data Migration.
  • Keep your users happy with no retrieval fee surprises and no disruption to users and applications from making poor data movement decisions based on when the data was created.

An analytics-driven cloud data management platform like Komprise, named a Gartner Peer Insights Awards leader, can help you save 50% or more on your cloud storage costs.


Learn more about your options for migrating file workloads to the cloud: The Easy, Fast, No Lock-In Path to the Cloud.

What is Cloud Data Management?

Cloud Data Management is a way to analyze, manage, secure, monitor and move data across public clouds. It works either with, or instead of on-premises applications, databases, and data storage and typically offers a run-anywhere platform.

Cloud Data Management Services

Cloud data management is typically overseen by a vendor that specializes in data integration, database, data warehouse or data storage technologies. Ideally the cloud data management solution is data agnostic, meaning it is independent from the data sources and targets it is monitoring, managing and moving. Benefits of an enterprise cloud data management solution include ensuring security, large savings, backup and disaster recovery, data quality, automated updates and a strategic approach to analyzing, managing and migrating data.

Cloud Data Management platform

Cloud data management platforms are cloud-based hubs that analyze and offer visibility and insights into an enterprise’s data, whether the data is structured, semi-structured or unstructured.


Cloud Data Storage

Cloud data storage is a service for individuals or organizations to store data through a cloud computing provider such as AWS, Azure, Google Cloud, IBM or Wasabi. Storing data in a cloud service eliminates the need to purchase and maintain data storage infrastructure, since infrastructure resides within the data centers of the cloud IaaS provider and is owned/managed by the provider. Many organizations are increasing data storage investments in the cloud for a variety of purposes including: backup, data replication and data protection, data tiering and archiving, data lakes for artificial intelligence (AI) and business intelligence (BI) projects, and to reduce their physical data center footprint. As with on-premises storage, you have different levels of data storage available in the cloud. You can segment data based on access tiers: for instance, hot and cold data storage.


Types of Cloud Data Storage

Cloud data storage can either be designed for personal data and collaboration or for enterprise data storage in the cloud. Examples of personal cloud data storage are Google Drive, Box and Dropbox.

Increasingly, corporate data storage in the cloud is gaining prominence – particularly around taking enterprise file data that was traditionally stored on Network Attached Storage (NAS) and moving that to the cloud.

Cloud file storage and object storage are gaining adoption as they can store petabytes of unstructured data for enterprises cost-effectively.

Enterprise Cloud Data Storage for Unstructured Data

(Cloud File Data Storage and Cloud Object Data Storage)

Enterprise unstructured data growth is exploding – whether it’s genomics data, video and media content, log files or IoT data. Unstructured data can be stored as files on file data storage or as objects on cost-efficient object storage. Cloud storage providers now offer a variety of file and object storage classes at different price points to accommodate unstructured data. Amazon EFS, Amazon FSx and Azure Files are examples of cloud data storage for enterprise file data, while Amazon S3, Azure Blob and Amazon S3 Glacier are examples of object storage.

Advantages of Cloud Data Storage

There are many benefits of investing in cloud data storage, particularly for unstructured data in the enterprise. Organizations gain access to unlimited resources, so they can scale data volumes as needed and decommission instances at the end of a project or when data is deleted or moved to another storage resource. Enterprise IT teams can also reduce dependence on hardware and have a more predictable storage budget. However, without proper cloud data management, cloud egress costs and other cloud costs are often cited as challenges.

In summary, cloud data storage allows organizations to:
  • Reduce capital expenses (CAPEX) for data center hardware, along with savings in energy, facility space and staff hours spent maintaining and installing hardware.
  • Deliver vastly improved agility and scalability to support rapidly changing business needs and initiatives.
  • Develop an enterprise-wide data lake strategy that would otherwise be unaffordable.
  • Lower risks from storing important data on aging physical hardware.
  • Leverage cheaper cloud storage for archiving and tiering purposes, which can also reduce backup costs.

Challenges and Considerations
  • Cloud data storage can be costly if you need to frequently access the data for use outside of the cloud, due to egress fees charged by cloud storage providers.
  • Using cloud tiering methodologies from on-premises storage vendors may result in unexpected costs, due to the need for restoring data back to the storage appliance prior to use. Read the white paper Cloud Tiering: Storage-Based vs. Gateways vs. File-Based
  • Moving data between clouds is often difficult, because of data translation and data mobility issues with file objects. Each cloud provider uses different standards and formats for data storage.
  • Security can be a concern, especially in some highly regulated sectors such as healthcare, financial services and e-commerce. IT organizations will need to fully understand the risks and methods of storing and protecting data in the cloud.
  • The cloud creates another data silo for enterprise IT. When adding cloud storage to an organization’s storage ecosystem, IT will need to determine how to attain a central, holistic view of all storage and data assets.

For these reasons, cloud optimization and cloud data management are essential components of an enterprise cloud data storage strategy and an overall data storage cost savings strategy. Komprise has strategic alliance partnerships with hybrid and cloud data storage technology leaders.

Learn more about your options for migrating file workloads to the cloud: The Easy, Fast, No Lock-In Path to the Cloud.


Cloud Migration


Cloud migration refers to the movement of data, processes, and applications from on-premises data storage or legacy infrastructure to cloud-based infrastructure for storage, application processing, data archiving and ongoing data lifecycle management. Komprise offers an analytics-driven cloud migration software solution – Elastic Data Migration – that integrates with most leading cloud service providers, such as AWS, Microsoft Azure, Google Cloud, Wasabi, IBM Cloud and more.

Benefits of Cloud Migration

Migrating to the cloud can offer many advantages – lower operational costs, greater elasticity, and flexibility. Migrating data to the cloud in a native format also ensures you can leverage the computational capabilities of the cloud and not just use it as a cheap storage tier. When migrating to the cloud, you need to consider both the application as well as its data. While application footprints are generally small and relatively easier to migrate, cloud file data migrations need careful planning and execution as data footprints can be large. Cloud migration of file data workloads with Komprise allows you to:

  • Plan a data migration strategy using analytics before migration. A pre-migration analysis helps you identify which files need to be migrated and plan how to organize the data to maximize the efficiency of the migration process. It’s important to know how data is used and to determine how large and how old files are throughout the storage system. Since data footprints often reach billions of files, planning a migration is critical.
  • Improve scalability with Elastic Data Migration. Data migrations can be time consuming as they involve moving hundreds of terabytes to petabytes of data. Since the storage that data is migrating from is usually still in use during the migration, the data migration solution needs to move data as fast as possible without slowing down user access to the source storage. This requires a scalable architecture that can leverage the inherent parallelism of the data sets to migrate multiple data streams in parallel without overburdening any single source storage system (see the sketch after this list). Komprise uses a patented elastic data migration architecture that maximizes parallelism while throttling back as needed to preserve source data storage performance.
  • Shrink cloud migration time. When compared to generic tools used across heterogeneous cloud and physical storage, Komprise cloud data migration is nearly 30x faster. Performance is maximized at every level with the auto-parallelize feature, minimizing network usage and making migration over a WAN more efficient.


  • Reduce ongoing cloud data storage costs with smart migration, intelligent tiering and data lifecycle management in the cloud. Migrating to the cloud can reduce the amount spent on IT needs, storage maintenance, and hardware upgrades, as these are typically handled by the cloud provider. Most clouds provide multiple storage classes at different price points – Komprise intelligently moves data to the right storage class in the cloud based on your policy and performs ongoing data lifecycle management in the cloud to reduce storage costs. For example, for AWS, unlike cloud intelligent tiering classes, Komprise tiers across both S3 and Glacier storage classes so you get the best cost savings.
  • Simplify storage management. With a Komprise cloud migration, you can use a single solution across your multivendor storage and multicloud architectures. All you have to do is connect via open standards – pick the SMB, NFS, and S3 sources along with the appropriate destinations and Komprise handles the rest. You also get a dashboard to monitor and manage all of your migrations from one place. No more sunk costs of point migration tools, because Komprise provides ongoing data lifecycle management beyond the data migration.
  • Greater resource availability. Moving your data to the cloud allows it to be accessed from wherever users may be, making it easier for international businesses to store and access their data from around the world. Komprise delivers native data access so you can directly access objects and files in the cloud without getting locked in to your NAS vendor—or even to Komprise.
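As referenced from the scalability bullet above, the core idea behind parallel file migration can be sketched in a few lines of Python. This is a toy example with hypothetical paths and a hypothetical bucket, not Komprise’s implementation; a real migration tool also handles retries, permissions, verification, and source throttling:

```python
import concurrent.futures
import pathlib
import boto3

s3 = boto3.client("s3")
BUCKET = "example-migration-target"         # hypothetical destination bucket
SOURCE = pathlib.Path("/mnt/nas/projects")  # hypothetical mounted NAS share

def upload(path: pathlib.Path) -> str:
    key = str(path.relative_to(SOURCE))
    s3.upload_file(str(path), BUCKET, key)
    return key

# Small files dominate most file shares, so per-file overhead, not bandwidth,
# is usually the bottleneck; a thread pool keeps many transfers in flight.
# The worker count is a placeholder and should be throttled so the source NAS
# is not overloaded while users are still on it.
files = [p for p in SOURCE.rglob("*") if p.is_file()]
with concurrent.futures.ThreadPoolExecutor(max_workers=32) as pool:
    for key in pool.map(upload, files):
        pass  # a real tool would log progress and retry failures here

print(f"Copied {len(files)} files")
```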

Cloud Migration Process

The cloud data migration process can differ widely based on a company’s storage needs, business model, environment of current storage, and goals for the new cloud-based system. Below are the main steps involved in migrating to the cloud.

Step 1 – Analyze Current Storage Environment and Create Migration Strategy

A smooth migration to the cloud requires proper planning to ensure that all bases are covered before the migration begins. It’s important to understand why the move is beneficial and how to get the most out of the new cloud-based features before the process continues.

Step 2 – Choose Your Cloud Deployment Environment

After taking a thorough look at the current resource requirements across your storage system, you can choose your cloud storage provider(s). At this stage, you decide which services and configurations the system will use, whether it will be a single-cloud or multi-cloud solution, and whether the cloud deployment will be public or private.

Step 3 – Migrate Data and Applications to the Cloud

Application workload migration to the cloud can be done with generic tools. However, since data migration involves moving petabytes of data and billions of files, you need a data management software solution that can migrate data efficiently in a number of ways, including over a public internet connection or a private connection (LAN or WAN).

Step 4 – Validate Data After Migration

Once the migration is complete, the data in the cloud can be validated and production access to the storage system can be swapped from on-premises to the cloud. Data validation often requires an MD5 checksum on every file to ensure the integrity of the data is intact after migration.
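A hedged sketch of this validation step, assuming a non-multipart upload so the S3 ETag equals the object’s MD5; bucket, key and file names are hypothetical:

```python
import hashlib
import boto3

s3 = boto3.client("s3")

def local_md5(path: str) -> str:
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            md5.update(chunk)
    return md5.hexdigest()

# Compare a local file's MD5 against the migrated object's ETag. Caveat: the
# ETag equals the MD5 only for non-multipart, unencrypted uploads; multipart
# uploads need a per-part checksum comparison instead.
path, bucket, key = "results.tar.gz", "example-migration-target", "results.tar.gz"
head = s3.head_object(Bucket=bucket, Key=key)
if head["ETag"].strip('"') == local_md5(path):
    print("checksum match")
else:
    print("checksum MISMATCH - investigate before cutting over")
```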

Komprise Cloud Data Migration

With Elastic Data Migration from Komprise, you can affordably run and manage hundreds of migrations across many different platforms simultaneously. Gain access to a full suite of high-speed cloud migration tools from a single dashboard that takes on the heavy lifting of migrations, and moves your data nearly 30x faster than traditional available services—all without any access disruption to users or apps.

Our team of cloud migration professionals, with over two decades of experience developing efficient IT solutions, has helped businesses around the world achieve faster and smoother data migrations with total confidence and none of the headaches. Contact us to learn more about our cloud data migration solution or sign up for a free trial to see the benefits beyond data migration with our analytics-driven Intelligent Data Management solution.

Learn more about your options for migrating file workloads to the cloud: The Easy, Fast, No Lock-In Path to the Cloud.



Data Tiering

Data Tiering refers to a technique of moving less frequently used data, also known as cold data, to cheaper levels of storage or tiers. The term “data tiering” arose from moving data around different tiers or classes of storage within a storage system, but has expanded now to mean tiering or archiving data from a storage system to other clouds and storage systems. See also cloud tiering and choices for cloud data tiering.


Data Tiering Cuts Costs Because 70%+ of Data is Cold

As data grows, storage costs are escalating. It is easy to think the solution is more efficient storage. But the real cause of storage costs is poor data management. Over 70% of data is cold and has not been accessed in months, yet it sits on expensive storage and consumes the same backup resources as hot data. As a result, data storage costs are rising, backups are slow, recovery is unreliable, and the sheer bulk of this data makes it difficult to leverage new options like Flash and Cloud.

Data Tiering Was Initially Used within a Storage Array

Data Tiering was initially a technique used by storage systems to reduce the cost of data storage by tiering cold data within the storage array to cheaper but less performant options – for example, moving data that has not been touched in a year or more from an expensive Flash tier to a low-cost SATA disk tier.

Typical storage tiers within a storage array include:
  • Flash or SSD: A high-performance storage class but also very expensive. Flash is usually used on smaller data sets that are being actively used and require the highest performance.
  • SAS Disks: Usually the workhorse of a storage system, they are moderately good at performance but more expensive than SATA disks.
  • SATA Disks: Usually the lowest price-point for disks but not as performant as SAS disks.
  • Secondary Storage, often Object Storage: Usually a good choice for capacity storage – to store large volumes of cool data that is not as frequently accessed, at a much lower cost.


Cloud Data Tiering is now Popular

Increasingly, customers are looking at another option – tiering or archiving data to a public cloud.

  • Public Cloud Storage: Public clouds currently have a mix of object and file storage options. The object storage classes such as Amazon S3 and Azure Blob (Azure Storage) provide tremendous cost efficiency and all the benefits of object storage without the headaches of setup and management.

Tiering and archiving less frequently used data or cold data to public cloud storage classes is now more popular. This is because customers can keep cold data in the lower-cost storage classes within the cloud and promote it to the higher-cost storage classes when needed. For example, data can be archived or tiered from on-premises NAS to Amazon S3 Standard-Infrequent Access or Amazon S3 Glacier for low ongoing costs, and then promoted to Amazon EFS or Amazon FSx when you want to operate on it and need performance.
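Within S3 itself, one generic promotion pattern (not specific to any vendor’s tooling, with hypothetical names) is to restore the archived object and then copy it over itself into a warmer storage class:

```python
import boto3

s3 = boto3.client("s3")
bucket, key = "example-archive-bucket", "projects/2021/results.tar.gz"  # hypothetical

# 1. Ask S3 Glacier Flexible Retrieval for a temporary restored copy.
s3.restore_object(
    Bucket=bucket, Key=key,
    RestoreRequest={"Days": 2, "GlacierJobParameters": {"Tier": "Standard"}},
)

# 2. Once the Restore header shows ongoing-request="false", copy the object
#    over itself into a warmer class so the promotion becomes permanent.
head = s3.head_object(Bucket=bucket, Key=key)
if 'ongoing-request="false"' in head.get("Restore", ""):
    s3.copy_object(
        Bucket=bucket, Key=key,
        CopySource={"Bucket": bucket, "Key": key},
        StorageClass="STANDARD",
        MetadataDirective="COPY",
    )
```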

But in order to get this level of flexibility, and to ensure you’re not treating the cloud as just a cheap storage locker, data that is tiered to the cloud needs to be accessible natively in the cloud without requiring third-party software. This requires file-tiering, not block-tiering.

Block Tiering Creates Unnecessary Costs and Lock-In

Block-level tiering was first introduced as a technique within a storage array to make the storage box more efficient by leveraging a mix of technologies such as more expensive SAS disks as well as cheaper SATA disks.

Block tiering breaks a file into various blocks – metadata blocks that contain information about the file, and data blocks that are chunks of the original file. Block-tiering or Block-level tiering moves less used cold blocks to lower, less expensive tiers, while hot blocks and metadata are typically retained in the higher, faster, and more expensive storage tiers.

Block tiering is a technique used within the storage operating system or filesystem and is proprietary. Storage vendors offer block tiering as a way to reduce the cost of their storage environment. Many storage vendors are now expanding block tiering to move data to the public cloud or on-premises object storage.

But since block tiering (examples include NetApp FabricPool and Dell EMC Isilon CloudPools) is done inside the storage operating system as a proprietary solution, it has several limitations when it comes to efficiency of reuse and efficiency of storage savings. Firstly, with block tiering, the proprietary storage filesystem must be involved in all data access since it retains the metadata and has the “map” for putting the file together from the various blocks. This also means that the cold blocks moved to a lower tier or the cloud cannot be directly accessed from the new location without involving the proprietary filesystem, because the cloud does not have the metadata map, the other data blocks, or the file context and attributes needed to put the file together. So block tiering is a proprietary approach that often results in unnecessary rehydration of the data and treats the cloud as a cheap storage locker rather than as a powerful way to use data when needed.

The only way to access data in the cloud is to run the proprietary storage filesystem in the cloud, which adds to costs. Also, many third-party applications, such as backup software, that operate at a file level require the cold blocks to be brought back or rehydrated, which defeats the purpose of tiering to lower-cost storage and erodes the potential savings. For more details, read the white paper: Block vs. File-Level Tiering and Archiving.

Know Your Cloud Tiering Choices


File Tiering Maximizes Savings and Eliminates Lock-In

File-tiering is an advanced, modern technology that uses standard protocols to move the entire file along with its metadata in a non-proprietary fashion to the secondary tier or cloud. File tiering is harder to build but better for customers because it eliminates vendor lock-in and maximizes savings. Whether files have POSIX-based Access Control Lists (ACLs) or NTFS extended attributes, all this metadata along with the file itself is fully tiered or archived to the secondary tier and stored in a non-proprietary format. This ensures that the entire file can be brought back when needed. File tiering does not just move the file; it also moves the attributes, security permissions and ACLs along with the file and maintains full file fidelity even when you are moving a file to a different storage architecture such as object storage or cloud. This ensures that applications and users can use the moved file from the original location, and they can directly open the file natively in the secondary location or cloud without requiring any third-party software or storage operating system.

Since file tiering maintains full file fidelity and native access based on standards at every tier, it also means that third-party applications can access the moved data without requiring any agents or proprietary software. This ensures that savings are maximized, since backup software and other third-party applications can access moved data without rehydrating or bringing the file back to the original location. It also ensures that the cloud can be used to run valuable applications such as compliance search or big data analytics on the trove of tiered and archived data without requiring any third-party software or additional costs.

File-tiering is an advanced technique for archiving and cloud tiering that maximizes savings and breaks vendor lock-in.

Data Tiering Can Cut 70%+ Storage and Backup Costs When Done Right

In summary, data tiering is an efficient solution to cut storage and backup costs because it tiers or archives cold, unused files to a lower-cost storage class, either on-premises or in the cloud. However, to maximize the savings, data tiering needs to be done at the file level, not block level. Block-level tiering creates lock-in and erodes much of the cost savings because it requires unnecessary rehydration of the data. File tiering maximizes savings and preserves flexibility by enabling data to be used directly in the cloud without lock-in.

Why Komprise is the easy, fast, no lock-in path to the cloud for file and object data.



Network Attached Storage (NAS)


What is Network Attached Storage?

Network Attached Storage (NAS) definition: A NAS system is a storage device connected to a network that allows storage and retrieval of data from a centralized location for authorized network users and heterogeneous clients. These devices generally consist of an engine that implements the file services (NAS device), and one or more devices on which data is stored (NAS drives).

The purpose of a NAS system is to provide a local area network (LAN) with file-based, shared storage in the form of an appliance optimized for quick data storage and retrieval. NAS is a relatively expensive storage option, so it should only be used for hot data that is accessed most frequently. Many enterprise IT organizations today are looking to migrate NAS and object data to the cloud to reduce costs and improve agility and efficiency.

NAS Storage Benefits

Network attached storage devices are used to remove the responsibility of file serving from other servers on a network and allow for a convenient way to share files among multiple computers. Benefits of dedicated network attached storage include:

  • Faster data access
  • Easy to scale up and expand upon
  • Remote data accessibility
  • Easier administration
  • OS-agnostic compatibility (works with Windows and Apple-based devices)
  • Built-in data security with compatibility for redundant storage arrays
  • Simple configuration and management (typically does not require an IT pro to operate)

NAS File Access Protocols

Network attached storage devices are often capable of communicating in a number of different file access protocols, such as:

  • NFS (Network File System), commonly used by Linux and UNIX clients
  • SMB/CIFS (Server Message Block/Common Internet File System), commonly used by Windows clients

Most NAS devices have a flexible range of data storage systems that they’re compatible with, but you should always ensure that your intended device will work with your specific data storage system.

Enterprise NAS Storage Applications

In an enterprise, a NAS array can be used as primary storage for storing unstructured data and as backup for data archiving or disaster recovery (DR). It can also function as an email, media database or print server for a small business. Higher-end NAS devices can hold enough disks to support RAID, a storage technology that combines multiple hard disks into one logical unit to provide better performance, redundancy, and high availability.

Data on NAS systems (aka NAS device) is often mirrored (replicated) to another NAS system, and backups or snapshots of the footprint are kept on the NAS for weeks or months. This leads to at least three or more copies of the data being kept on expensive NAS storage. A NAS storage solution does not need to be used for disaster recovery and backup copies as this can be very costly. By finding and data tiering (or data archiving) cold data from NAS, you can eliminate the extra copies of cold data and cut cold data storage costs by over 70%.

Check out our video on NAS storage savings to get a more detailed explanation of how this concept works in practice.

Network Attached Storage (NAS) Data Tiering and Data Archiving

Since NAS storage is typically designed for higher performance and can be expensive, data on NAS is often tiered, archived and moved to less expensive storage classes. NAS vendors offer some basic data tiering at the block-level to provide limited savings on storage costs, but not on backup and DR costs. Unlike the proprietary block-level tiering, file-level tiering or archiving provides a standards-based, non-proprietary solution to maximize savings by moving cold data to cheaper storage solutions. This can be done transparently so users and applications do not see any difference when cold files are archived. Read this white paper to learn more about the differences between file tiering and block tiering.

NAS Migration to the Cloud

Cloud NAS is growing in popularity, but the right approach to migrating unstructured data to the cloud is essential. Unstructured data is everywhere. From genomics and medical imaging to streaming video, electric cars, and IoT products, all sectors generate unstructured file data. Data-heavy enterprises typically have petabytes of file data, which can consist of billions of files scattered across different storage vendors, architectures and locations. And while file data growth is exploding, IT budgets are not. That’s why enterprise IT organizations are looking to migrate file workloads to the cloud. However, they face many barriers, which can cause migrations to take weeks to months and require significant manual effort.

Cloud NAS Migration Challenges

Common unstructured data migration challenges include:

  • Billions of files, mostly small: Unstructured data migrations often require moving billions of files, the vast majority of which are small files that have tremendous overhead, causing data transfers to be slow.
  • Chatty protocols: Server message block (SMB) protocol workloads—which can be user data, electronic design automation (EDA) and other multimedia files or corporate shares—are often a challenge since the protocol requires many back-and-forth handshakes which increase traffic over the network.
  • Large WAN latency: Network file protocols are extremely sensitive to high-latency network connections, which are essentially unavoidable in wide area network (WAN) migrations.
  • Limited network bandwidth: Bandwidth is often limited or not always available, causing data transfers to become slow, unreliable and difficult to manage.
Learn more about Komprise Smart Data Migration.
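
To illustrate one way to mitigate small-file overhead, the sketch below copies files in parallel threads; the source and destination paths are placeholders, and real migration tools also handle retries, checksums, permissions, and restartability.

  # Sketch: copying many small files in parallel to amortize per-file overhead.
  # Source/destination paths are placeholders.
  import shutil
  from concurrent.futures import ThreadPoolExecutor
  from pathlib import Path

  SRC = Path("/mnt/nas/projects")
  DST = Path("/mnt/cloud_nas/projects")

  def copy_one(src):
      dst = DST / src.relative_to(SRC)
      dst.parent.mkdir(parents=True, exist_ok=True)
      shutil.copy2(src, dst)  # copies data and timestamps

  files = [p for p in SRC.rglob("*") if p.is_file()]
  with ThreadPoolExecutor(max_workers=32) as pool:
      list(pool.map(copy_one, files))
  print(f"Copied {len(files)} files")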

Network Attached Storage FAQ

These are some of the most commonly asked questions we get about network attached storage systems.

How are NAS drives different than typical data storage hardware?

NAS drives are specifically designed for constant 24×7 use with high reliability, built-in vibration mitigation, and optimized for use in RAID setups. Network attached storage systems also benefit from an abundance of health management systems designed to keep them running smoothly for longer than a standard hard drive would.

Which features are the most important ones to have in a NAS device?

The ideal NAS devices have multiple (2+) drive bays, hardware-level encryption acceleration, support for widely used platforms such as AWS Glacier and S3, and a moderately powerful multicore CPU paired with at least 2GB of RAM. If you’re looking for these types of features, Seagate and Western Digital are some of the most reputable brands in the NAS industry.

Are there any downsides to using NAS storage?

NAS storage systems can be quite expensive when they’re not optimized to contain the right data, but this can be remedied with analytics-driven NAS data management software, like Komprise Intelligent Data Management.

Using NAS Data Management Tools to Substantially Reduce Storage Costs

One of the biggest issues organizations face with NAS systems is understanding which data they should be storing on their NAS drives and which should be offloaded to more affordable types of storage. To keep data storage costs lower, an analytics-based NAS data management system can be implemented to give your organization more insight into your NAS data and where it should be optimally stored.

Of the thousands of data-centric companies we’ve worked with, most needed less than 20% of their total data stored on high-performance NAS drives. With a more thorough understanding of their NAS data, organizations often realize that their NAS storage needs are much lower than they originally thought, leading to substantial long-term storage savings, often greater than 50%.

Komprise makes it possible for customers to know their NAS and S3 data usage and growth before buying more storage. Explore your storage scenarios to get a forecast of how much could be saved with the right data management tools.

This is what Komprise Dynamic Data Analytics provides.

NAS Fast Facts:

  • Network-attached storage (NAS) is a type of dedicated file storage device that provides a local-area network with file-based shared storage. This typically comes in the form of a manufactured computer appliance specialized for this purpose, containing one or more storage devices.
  • Network attached storage devices remove the responsibility of file serving from other servers on a network and provide a convenient way to share files among multiple computers. Benefits of dedicated network attached storage include faster data access, easier administration, and simple configuration.
  • In an enterprise, a network attached storage array can be used as primary storage for storing unstructured data, and as backup for archiving or disaster recovery. It can also function as an email, media database or print server for a small business. Higher-end network attached storage devices can hold enough disks to support RAID, a storage technology that combines multiple hard disks into one logical unit to provide better performance, redundancy, and high availability.
  • Data on NAS systems is often mirrored (replicated) to another NAS system, and backups or snapshots of the footprint are kept on the NAS for weeks or months. This leads to at least three copies of the data being kept on expensive NAS devices.

Read the white paper: How to Accelerate NAS Migrations and Cloud Data Migrations 

Know the difference between NAS and Cloud Data Migration vs. Tiering and Archiving



S3 Intelligent Tiering

S3 Intelligent Tiering is an Amazon cloud storage class. Amazon S3 offers a range of storage classes for different uses. S3 Intelligent Tiering is a storage class aimed at data with unknown or unpredictable data access patterns. It was introduced in 2018 by AWS as a solution for customers who want to optimize storage costs automatically when their data access patterns change.

Instead of utilizing the other Amazon S3 storage classes and moving data across them based on the needs of the data, Amazon S3 Intelligent Tiering is a distinct storage class that has embedded tiers within it and data can automatically move across the four access tiers when access patterns change.

To fully understand what S3 Intelligent Tiering offers it is important to have an overview of all the classes available through S3:

Classes of AWS S3 Storage

  1. Standard (S3) – Used for frequently accessed data (hot data)
  2. Standard-Infrequent Access (S3-IA) – Used for infrequently accessed, long-lived data that needs to be retained but is not being actively used
  3. One Zone Infrequent Access – Used for infrequently accessed data that’s long-lived but not critical enough to be covered by storage redundancies across multiple locations
  4. Intelligent Tiering – Used for data with changing access patterns or uncertain need of access
  5. Glacier – Used to archive infrequently accessed, long-lived data (cold data); retrieval latency is typically a few hours
  6. Glacier Deep Archive – Used for data that is hardly ever or never accessed and for digital preservation purposes for regulatory compliance
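
For illustration, a storage class can be chosen explicitly when uploading an object; the short boto3 sketch below uses placeholder bucket, key, and file names.

  # Sketch: choosing an S3 storage class explicitly at upload time with boto3.
  # Bucket, key, and file names are placeholders.
  import boto3

  s3 = boto3.client("s3")
  with open("results.csv", "rb") as f:
      s3.put_object(
          Bucket="my-archive-bucket",
          Key="projects/2020/results.csv",
          Body=f,
          StorageClass="GLACIER",  # other values: STANDARD, STANDARD_IA,
                                   # ONEZONE_IA, INTELLIGENT_TIERING, DEEP_ARCHIVE
      )
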
Also be sure to read the blog post about Komprise data migration with AWS Snowball

Accelerating Petabyte-Scale Cloud Migrations with Komprise and AWS Snowball


What is S3 Intelligent Tiering?

S3 Intelligent Tiering is a storage class that has multiple tiers embedded within it, each with its own access latencies and costs. It is an automated service that monitors your data access behavior and then moves your data on a per-object basis to the appropriate tier within the S3 Intelligent Tiering storage class. If an object has not been accessed for 30 consecutive days, it automatically moves to the Infrequent Access tier within S3 Intelligent Tiering; after 90 consecutive days without access it moves to the Archive Access tier, and after 180 consecutive days to the Deep Archive Access tier. Retrieval from the Archive Access tier can take 3 to 5 hours, and from the Deep Archive Access tier up to 12 hours. If an archived object is subsequently accessed, it moves back into the Frequent Access tier.
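
As an illustration, the optional archive tiers can be enabled programmatically; the sketch below uses the AWS boto3 SDK with placeholder bucket and configuration names, and then uploads an object directly into the Intelligent-Tiering storage class.

  # Sketch: enabling the optional Archive Access and Deep Archive Access tiers on a
  # bucket, then uploading an object directly into the Intelligent-Tiering class.
  # The bucket name and configuration Id are placeholders.
  import boto3

  s3 = boto3.client("s3")
  s3.put_bucket_intelligent_tiering_configuration(
      Bucket="my-intelligent-tiering-bucket",
      Id="archive-cold-objects",
      IntelligentTieringConfiguration={
          "Id": "archive-cold-objects",
          "Status": "Enabled",
          "Tierings": [
              {"Days": 90, "AccessTier": "ARCHIVE_ACCESS"},
              {"Days": 180, "AccessTier": "DEEP_ARCHIVE_ACCESS"},
          ],
      },
  )
  s3.put_object(
      Bucket="my-intelligent-tiering-bucket",
      Key="logs/2023/app.log",
      Body=b"example payload",
      StorageClass="INTELLIGENT_TIERING",
  )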

What are the costs of AWS S3 Intelligent Tiering?

You pay for monthly storage, requests and data transfer. When using Intelligent-Tiering you also pay a monthly per-object fee for monitoring and automation. While there is no retrieval fee in S3 Intelligent-Tiering and no fee for moving data between tiers, you do not manipulate each tier directly; objects move automatically through the tiers within the storage class. Objects in the Frequent Access tier are billed at the same rate as S3 Standard, objects in the Infrequent Access tier at the same rate as S3 Standard-Infrequent Access, objects in the Archive Access tier at the same rate as S3 Glacier, and objects in the Deep Archive Access tier at the same rate as S3 Glacier Deep Archive.
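
As a rough, illustrative calculation (the rates below are placeholders, not current AWS prices), a monthly Intelligent-Tiering bill combines a per-object monitoring fee with per-GB storage charges for whichever tier each object sits in:

  # Illustrative arithmetic only: all rates below are placeholders.
  NUM_OBJECTS = 2_000_000
  MONITORING_FEE_PER_1000_OBJECTS = 0.0025          # placeholder $/month per 1,000 objects
  GB_FREQUENT, GB_INFREQUENT, GB_ARCHIVE = 800, 3_000, 10_000           # GB in each tier
  RATE_FREQUENT, RATE_INFREQUENT, RATE_ARCHIVE = 0.023, 0.0125, 0.004   # placeholder $/GB-month

  monitoring = NUM_OBJECTS / 1000 * MONITORING_FEE_PER_1000_OBJECTS
  storage = (GB_FREQUENT * RATE_FREQUENT
             + GB_INFREQUENT * RATE_INFREQUENT
             + GB_ARCHIVE * RATE_ARCHIVE)
  print(f"Monitoring: ${monitoring:,.2f}/month, storage: ${storage:,.2f}/month")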

What are the advantages of S3 Intelligent tiering?

The advantages of S3 Intelligent Tiering are that savings can be made with no operational overhead and no retrieval costs. Objects can be assigned the storage class upon upload and then move between tiers based on access patterns. There is no impact on performance, and it is designed for 99.999999999% durability and 99.9% availability over a given year.

What are the disadvantages of S3 Intelligent tiering?

The main disadvantage of S3 Intelligent Tiering is that it acts as a black box – you move objects into it and cannot transparently access different tiers or set different versioning policies for the different tiers. You have to manage S3 Intelligent Tiering as a whole. For example, if you want to transition an object that has versioning enabled, then you have to transition all the versions. Also, when objects move to the archive tiers, the latency of access is much higher than in the access tiers, and not all applications can deal with that latency.

S3 Intelligent Tiering is not suitable for companies with predictable data access behavior or companies that want to control data access, versioning, etc. with transparency. Other disadvantages are that it is limited to objects and cannot tier from files to objects, the minimum storage duration is 30 days, objects smaller than 128KB are never moved from the Frequent Access tier, and, because it is an automated system, you cannot configure different policies for different groups of data.

S3 Data Management with Komprise

Komprise is an AWS Advanced Tier partner and offers intelligent data management with visibility, transparency and cost savings for AWS file and object data. How is this done? Komprise enables analytics-driven intelligent cloud tiering across EFS, FSx, S3 and Glacier storage classes in AWS so you can maximize price performance across all your data on Amazon. The Komprise mission is to radically simplify data management through intelligent automation.

Komprise helps organizations get more value from their AWS storage investments while protecting data assets for future use through analysis and intelligent data migration and cloud data tiering.


Learn more at Komprise for AWS.

What is S3 Intelligent Tiering?

S3 Intelligent Tiering is an Amazon cloud storage class that moves data to more cost-effective access tiers based on access frequency.

How AWS S3 intelligent tiering works

S3 Intelligent Tiering is a storage class with multiple tiers embedded within it; for a small monitoring fee, data is moved automatically between tiers to optimize costs. Each tier has its own access latencies and costs:

  • Frequent Access – data accessed within the last 30 days
  • Infrequent Access – data not accessed for 30-90 days
  • Archive Instant Access – data not accessed for more than 90 days
  • Deep Archive Access – data not accessed for 180 days or more (optional*)

* Deep Archive Access: also known as the Glacier tier, it provides the lowest cost with the tradeoff that data is not available for instant access. Retrieval time is within 12 hours, which may cause timeouts for many applications. For this reason, Deep Archive Access is optional and must be explicitly enabled; it is not part of the default S3 Intelligent Tiering configuration.

What are the advantages of S3 Intelligent tiering?

The advantages of S3 Intelligent Tiering are that savings can be made for data whose access patterns are unpredictable or unknown. There is no operational overhead, and there are no additional retrieval costs. Objects can be assigned the storage class upon upload and then move between tiers based on access patterns.

What are the disadvantages of S3 Intelligent tiering?

The main disadvantage of S3 Intelligent Tiering is that it acts as a black box – you move objects into it and cannot transparently access different tiers or set different versioning policies for the different tiers. For well-known workloads, selecting the appropriate storage tier directly can be more cost-effective than S3 Intelligent Tiering.


Secondary Storage

What is Secondary Storage?

Secondary storage devices are storage devices that operate alongside the computer’s primary storage (RAM and cache memory). Secondary storage can hold any amount of data, from a few megabytes to petabytes, and stores almost all types of programs and data, including the operating system, device drivers, applications, and user data. For example, internal secondary storage devices include the hard disk drive, the tape drive, and the compact disc drive.

Some key facts about secondary storage:

Secondary Storage Data Tiering

Secondary storage typically tiers or archives inactive cold data and backs up primary storage through data replication or other data backup methods. This replication or backup process ensures there is a second copy of the data. In an enterprise environment, secondary storage can take the form of a network-attached storage (NAS) box, a storage-area network (SAN), or tape. In addition, to lessen the demand on primary storage, object storage devices may also be used for secondary storage. The growth of organizational unstructured data has prompted storage managers to move data to lower tiers of storage, increasingly cloud data storage, to reduce the impact on primary storage systems. Furthermore, by moving data from more expensive primary storage to less expensive tiers of storage, known as cloud tiering, storage managers are able to save money while keeping the data easily accessible to satisfy both business and compliance requirements.


When tiering or archiving cold data to secondary storage, it is important that the archiving/tiering solution does not disrupt users by requiring them to rewrite applications to find the data on the secondary storage. Transparent archiving is key to ensuring that data moved to secondary storage still appears to reside on the primary storage and continues to be accessed from the primary storage without any changes for users or applications. Transparent move technology solutions use file-level tiering to accomplish this.

Learn More: Why Komprise is the Easy, Fast, No Lock-In Path to the Cloud for file and object data.

What is Secondary Storage?

Secondary storage, sometimes called auxiliary storage, is non-volatile storage used to hold data and programs for later retrieval. It is also known as a backup storage device, tier 2 storage, external memory, secondary memory or external storage, and it holds data until it is deleted or overwritten.

Secondary Storage Devices

Here are some examples of secondary storage devices:

  • Hard drive
  • Solid-state drive
  • USB thumb drive
  • SD card
  • CD
  • DVD
  • Floppy disk
  • Tape drive

What is the difference between Primary and Secondary Storage?

Primary storage is the main memory where the operating system resides. It is typically volatile, more expensive, smaller and faster, and is used for data that needs to be frequently accessed.

Secondary storage can be hosted on premises, in an external device, or in the cloud. It is more likely to be permanent, cheaper, larger and slower, and is typically used for long-term storage of cold data.


Storage Tiering

What is Storage Tiering?

Storage tiering refers to a technique of moving less frequently used data, also known as cold data, from higher-performance storage such as SSD to cheaper tiers of storage such as cloud or spinning disk. The term “storage tiering” arose from moving data around different tiers or classes of storage within a storage system, but has expanded to mean tiering or archiving data from a storage system to other clouds and storage systems. Storage tiering is now considered a core feature of modern storage systems and has recently become part of the default configuration for next-generation storage such as Amazon FSx for NetApp ONTAP.

Block-level storage tiering solutions include NetApp FabricPool and Dell PowerScale CloudPools.

Storage-agnostic data management and data tiering have emerged as more and more enterprise organizations adopt hybrid, multi-cloud, and edge IT infrastructure strategies. See also cloud tiering and choices for cloud data tiering.


Storage Tiering Cuts Costs Because 70%+ of Data is Cold

As data grows, data storage costs grow. It is easy to think the solution is more efficient storage, or simply to buy more storage, but data management is the real solution. Typically over 70% of data is cold and has not been accessed in months, yet it sits on expensive storage hardware or cloud infrastructure and consumes the same backup resources as hot data. As a result, data storage costs are rising, backup windows are lengthening, disaster recovery (DR) is unreliable, and the sheer bulk of this data makes it difficult to leverage newer options like Flash and cloud.
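
As a rough way to gauge how much of a file share is cold, the sketch below walks a directory tree and totals the bytes not accessed within a chosen window; the root path and threshold are placeholders, and access times can be unreliable on filesystems mounted with noatime.

  # Sketch: estimating how much data under a directory tree is cold, based on the
  # last-access time. The root path and threshold are placeholders.
  import os
  import time

  ROOT = "/mnt/nas/projects"
  COLD_AFTER_DAYS = 180
  cutoff = time.time() - COLD_AFTER_DAYS * 86400

  total_bytes = cold_bytes = 0
  for dirpath, _dirs, filenames in os.walk(ROOT):
      for name in filenames:
          try:
              st = os.stat(os.path.join(dirpath, name))
          except OSError:
              continue  # skip files that disappear or cannot be read
          total_bytes += st.st_size
          if st.st_atime < cutoff:
              cold_bytes += st.st_size

  if total_bytes:
      print(f"{cold_bytes / total_bytes:.0%} of {total_bytes / 1e12:.2f} TB "
            f"has not been accessed in {COLD_AFTER_DAYS} days")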

Data Tiering Was Initially Used within a Storage Array

Data Tiering was initially a technique used by storage systems to reduce the cost of data storage by tiering cold data within the storage array to cheaper but less performant options – for example, moving data that has not been touched in a year or more from an expensive Flash tier to a low-cost SATA disk tier.

Typical storage tiers within a storage array or on-premises storage device include:

  • Flash or SSD: A high-performance storage class but also very expensive. Flash is usually used on smaller data sets that are being actively used and require the highest performance.
  • SATA Disks: High-capacity disks with lower performance that offer better price per GB vs SSD.
  • Secondary Storage, often Object Storage: Usually a good choice for capacity storage – to store large volumes of cool data that is not as frequently accessed, at a much lower cost.

Increasingly, enterprise IT organizations are looking at another option – tiering or archiving data to a public cloud.

  • Public Cloud Storage: Public clouds currently have a mix of object and file storage options. The object storage classes such as Amazon S3 and Azure Blob (Azure Storage) provide tremendous cost efficiency and all the benefits of object storage without the headaches of setup and management.
  • Cloud NAS has also become increasingly popular, but if unstructured data is not well managed, data storage costs will be prohibitive.

Cloud Storage Tiering is Now Popular

Tiering and archiving less frequently used data or cold data to public cloud storage classes is now more popular. This is because customers can leverage the lower-cost storage classes within the cloud to keep cold data and promote it to the higher-cost storage classes when needed. For example, data can be archived or tiered from on-premises NAS to Amazon S3 Infrequent Access or Amazon Glacier for low ongoing costs, and then promoted to Amazon EFS or FSx when you want to operate on it and need performance.
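
To illustrate this demote-and-promote pattern, the boto3 sketch below moves an existing object to the Glacier storage class and later requests a temporary restore; the bucket and key names are placeholders, and this is a generic AWS example, not how any particular tiering product works.

  # Sketch: demoting an existing S3 object to a colder storage class, then requesting
  # a temporary restore when the data is needed again. Bucket and key are placeholders;
  # Glacier restores are asynchronous and can take minutes to hours.
  import boto3

  s3 = boto3.client("s3")
  bucket, key = "my-archive-bucket", "projects/2019/simulation.tar"

  # Demote: copy the object onto itself with a colder storage class.
  s3.copy_object(
      Bucket=bucket,
      Key=key,
      CopySource={"Bucket": bucket, "Key": key},
      StorageClass="GLACIER",
      MetadataDirective="COPY",
  )

  # Promote: request a temporary restored copy before the data is actively used.
  s3.restore_object(
      Bucket=bucket,
      Key=key,
      RestoreRequest={"Days": 7, "GlacierJobParameters": {"Tier": "Standard"}},
  )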

Cloud isn’t just low-cost data storage 

The cloud offers more than low-cost data storage. Advanced security features, such as immutable storage, can help defeat ransomware, and cloud-native services from analytics to machine learning can drive value from your unstructured data.

But in order to take advantage of these capabilities, and to ensure you’re not treating the cloud as just a cheap storage locker, data that is tiered to the cloud needs to be accessible natively in the cloud without requiring third-party software. This requires the right approach to storage tiering, which is file-tiering, not block-tiering.


Block Tiering Creates Unnecessary Costs and Lock-In

Block-level storage tiering was first introduced as a technique within a storage array to make the storage box more efficient by leveraging a mix of technologies such as more expensive SSD disks as well as cheaper SATA disks.

Block storage tiering breaks a file into various blocks – metadata blocks that contain information about the file, and data blocks that are chunks of the original file. Block-tiering or Block-level tiering moves less used cold blocks to lower, less expensive tiers, while hot blocks and metadata are typically retained in the higher, faster, and more expensive storage tiers.

Block tiering is a technique used within the storage operating system or filesystem and is proprietary. Storage vendors offer block tiering as a way to reduce the cost of their storage environment. Many storage vendors are now expanding block tiering to move data to the public cloud or on-premises object storage.

But, since block storage tiering (often called CloudPools – examples are NetApp FabricPool and Dell EMC Isilon CloudPools) is done inside the storage operating system as a proprietary solution, it has several limitations when it comes to efficiency of reuse and efficiency of storage savings. First, with block tiering, the proprietary storage filesystem must be involved in all data access since it retains the metadata and holds the “map” for reassembling the file from its various blocks. This also means that cold blocks moved to a lower tier or the cloud cannot be directly accessed from the new location without involving the proprietary filesystem, because the cloud does not have the metadata map, the other data blocks, or the file context and attributes needed to put the file together. So block tiering is a proprietary approach that often results in unnecessary rehydration of the data and treats the cloud as a cheap storage locker rather than as a powerful way to use data when needed.

With block storage tiering, the only way to access data in the cloud is to run the proprietary storage file system in the cloud which adds to costs. Also, many third-party applications such as backup software that operate at a file level require the cold blocks to be brought back or rehydrated, which defeats the purpose of tiering to a lower cost storage and erodes the potential savings.

For more details, read the white paper: Block vs. File-Level Tiering and Archiving.

