Ideal Amazon S3 Storage Class for Your Data – A Quick Guide

Finding the right AWS S3 storage class for your data storage requirements is essential for cost optimization. Understanding the differences between the storage classes and weighing their features can help you decide which one suits your data. In this blog, you will learn how each Amazon S3 storage class can be leveraged for specific use cases along with the right EBR solution.

You can make the most of these diverse storage classes when performing your backups. Storage is simpler to handle if you can configure it directly from the backup and recovery solution itself, as the Zmanda EBR solution allows you to. Zmanda lets you back up to five storage classes within AWS S3. Let’s explore the strengths of the AWS S3 storage classes, along with the backup support Zmanda offers for each.

Our blog on AWS overview will help you understand its storage classes better. So, if you are new to our AWS series, we recommend you give a quick read to the previous blog in this series.

Types of Storage in AWS S3

A single type of cloud storage will not meet modern-day business storage needs. It is difficult to achieve cost optimization with just one cloud storage type, so AWS S3 offers you choices to match your varying needs.

While some data is accessed frequently, like your credit card PIN, which is used every time you make a payment, others are accessed rarely, like your childhood photos. AWS storage classes are also created to meet such distinct access needs.

Storing specific data to specific storage classes each time you start a backup sounds like a lot of work, does it not? Well, lucky for you, Zmanda allows you to configure the AWS S3 storage class to back up right from the Zmanda Management Console (ZMC) itself.

To choose the right AWS S3 storage class for your implementation, a better understanding of the storage classes of AWS S3 is essential. To understand that, we shall first learn how data processing is classified based on response times.  

Data processing is categorized into three types based on response time. They are:

  • Real-time: This is used when data processing results are needed immediately. For example, if you enforce limits on a free trial of your software, you need to know the moment trial users exhaust their quota, not some time later.
  • Near real-time: This is used when data processing results are needed soon but not immediately. In near real-time data processing, the results reflect the data of the recent past and not the present. An example is operational intelligence, where sales leads are identified from data sets so that you can focus on the leads most likely to buy instead of on every lead.
  • Batch: This is used when data processing results are not needed immediately, and a delay of hours or days is acceptable. An example is payroll, which runs in monthly cycles.

If the AWS S3 storage classes supported by Zmanda are classified into real-time and near-real-time categories, they would appear as follows:

Real-time (millisecond access):

  • S3 Standard
  • S3 Standard-Infrequent Access
  • S3 One Zone-Infrequent Access
  • S3 Reduced Redundancy Storage

Near real-time:

  • S3 Glacier Flexible Retrieval (1 minute to 12 hours)

The above classification is purely in terms of access times. As discussed before, you can choose the storage class you want to back up to when you use AWS S3 as the target in ZMC. This convenience in configuring AWS S3 is what makes Zmanda truly user-centric.
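To make the classification above concrete, here is a minimal sketch that maps each of the five classes to the value the S3 API uses in its StorageClass field and to its access tier. The dictionary layout and the helper name classes_for are illustrative assumptions, not part of ZMC or any AWS SDK.

```python
# Storage classes from the classification above, keyed by the value the
# S3 API uses in its StorageClass field.
STORAGE_CLASSES = {
    "STANDARD": ("S3 Standard", "real-time"),
    "STANDARD_IA": ("S3 Standard-Infrequent Access", "real-time"),
    "ONEZONE_IA": ("S3 One Zone-Infrequent Access", "real-time"),
    "REDUCED_REDUNDANCY": ("S3 Reduced Redundancy Storage", "real-time"),
    "GLACIER": ("S3 Glacier Flexible Retrieval", "near real-time"),
}


def classes_for(access_tier):
    """Return the API names of the classes in the given access tier."""
    return [api for api, (_, tier) in STORAGE_CLASSES.items() if tier == access_tier]


print(classes_for("real-time"))
print(classes_for("near real-time"))
```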

When backing up to the cloud, good internet connectivity is non-negotiable. However, if your connection drops intermittently, our cloud backups stability feature will keep your backups and restores running without any hiccups. You can rest assured that trivial connectivity issues will not hamper your backup and recovery processes. This feature needs no intervention from you and is automatically enabled by Zmanda.

S3 Standard

S3 Standard is the default storage class for all objects. If you do not specify a storage class when uploading to S3, the data is automatically stored in S3 Standard. For high durability (99.999999999%), data in S3 Standard is stored across at least three Availability Zones, so your data stays safe even if an entire Availability Zone goes down. Same-region and cross-region replication are also available. S3 Standard supports real-time access and, apart from RRS, has the highest per-gigabyte storage price of the classes covered here. You are charged per gigabyte stored and for API calls.

So you have spent most of your budget on AWS S3, and you’re running tight. But compromising on the backup and recovery solution is not an option. In such a situation, Zmanda will enable you to make the most of your AWS S3 spending without any compromise. Just because S3 Standard is expensive, your backup solution (read Zmanda) need not be pricey. Zmanda’s licensing policy charges you based on the number of endpoints being backed up. You may choose to back up terabytes or even petabytes of data, but the Zmanda licensing costs remain unaffected by it. This makes your business easy to scale without worrying much about the cost factor.

When you want to back up a sizeable volume of critical data, such as that of dynamic websites, cloud applications, mobile and gaming applications, and content distribution, S3 Standard is ideal. You get the best of AWS S3 storage with the cost benefits of Zmanda.

You can configure the backup set on ZMC to back up your data to the S3 Standard storage class. This eliminates the overhead of coordinating between screens of ZMC and AWS S3. You can do this on the Storage page.

Storage Option dropdown for AWS S3 in ZMC

Next, we shall learn about the S3 Standard Infrequent Access (S3 IA) storage class.

S3 Standard Infrequent Access (S3 IA)

The S3 IA storage class is meant for data that is likely to be accessed only once a month or once a year, but that you need immediately when you do access it. Even though access is infrequent, data stored in S3 IA can be retrieved instantly, i.e., in milliseconds. You are charged a lower price per gigabyte for storage than in S3 Standard. However, you also pay a per-gigabyte retrieval fee.

It helps you cut down costs while storing infrequently accessed data for the long term. Consequently, S3 IA is ideal for long-term storage and disaster recovery use cases.
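The trade-off between a lower storage price and a per-gigabyte retrieval fee can be worked through with a little arithmetic. The sketch below is illustrative only: the three prices are hypothetical placeholders, not AWS list prices, so substitute the current figures for your region before drawing conclusions.

```python
# Hypothetical prices in USD per GB; NOT actual AWS pricing.
STANDARD_STORAGE_PRICE = 0.02   # per GB-month, assumed
IA_STORAGE_PRICE = 0.01         # per GB-month, assumed
IA_RETRIEVAL_PRICE = 0.01       # per GB retrieved, assumed


def monthly_cost(stored_gb, retrieved_gb, storage_price, retrieval_price=0.0):
    """Storage charge plus retrieval charge for one month."""
    return stored_gb * storage_price + retrieved_gb * retrieval_price


def break_even_retrieval_gb(stored_gb, standard_price, ia_price, ia_retrieval_price):
    """GB retrieved per month at which IA costs the same as Standard."""
    return stored_gb * (standard_price - ia_price) / ia_retrieval_price


stored = 1000  # GB kept in the bucket
print(monthly_cost(stored, 0, STANDARD_STORAGE_PRICE))
print(monthly_cost(stored, 100, IA_STORAGE_PRICE, IA_RETRIEVAL_PRICE))
print(break_even_retrieval_gb(stored, STANDARD_STORAGE_PRICE,
                              IA_STORAGE_PRICE, IA_RETRIEVAL_PRICE))
```

With these placeholder prices, S3 IA stays cheaper as long as you retrieve less each month than the break-even volume the last line prints.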

To save you from switching between the AWS S3 UI and the Zmanda UI, the Storage page on ZMC allows you to set up your backups so that they go to the S3 Standard IA storage class.

What are S3 Lifecycle Policies?

S3 lifecycle policies allow you to configure how AWS should handle objects stored in S3, depending on their age. Through S3 lifecycle policies, you can move an object to another storage class, archive it, or even delete it after a certain period of time, say 30 days.
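As a rough illustration, a lifecycle policy is a set of rules in the shape the S3 PutBucketLifecycleConfiguration API accepts. The sketch below only builds and prints that structure. The rule ID, the backups/ prefix, and the day counts are hypothetical, and applying the result (for example through boto3's put_bucket_lifecycle_configuration) is left out.

```python
import json


def lifecycle_rule(rule_id, prefix, ia_after_days, expire_after_days):
    """Build one rule: move objects to Standard-IA, then delete them."""
    return {
        "ID": rule_id,
        "Filter": {"Prefix": prefix},   # only objects under this prefix
        "Status": "Enabled",
        "Transitions": [
            {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
        ],
        "Expiration": {"Days": expire_after_days},
    }


# Hypothetical values: move after 30 days, delete after 365 days.
config = {"Rules": [lifecycle_rule("age-out-backups", "backups/", 30, 365)]}
print(json.dumps(config, indent=2))
```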

The advantage of defining S3 lifecycle policies is that the move/archive/delete actions can be automated. This saves you the hassle of monitoring the objects and dealing with them from time to time. Zmanda allows you to implement similar functionality.

The configuration of backups in Zmanda happens via backup sets. A backup set is configured through seven different screens. Each screen allows you to control a different aspect of the backup set, like what to back up, where to back up, how staging should be used, etc. On the last screen of the backup configuration, titled Backup Media, you can configure many media-specific options. One of these options is the Retention period, which allows you to specify how long the backups should be retained.

However, if the storage set is archived, the data will not be pruned from AWS S3, even if the data retention period has expired. In the screenshot below, Amazon S3 is being used for archival with forever retention.

AWS S3 for archival with forever retention

Please note that there are some storage class transitions that AWS does not support. For example, S3 lifecycle policies cannot transition objects from any storage class to S3 Standard or Reduced Redundancy Storage (RRS). Additionally, S3 lifecycle policy transitions and expirations are chargeable.
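The restriction can be pictured as a one-way "waterfall": objects only move toward colder classes. The sketch below is a simplified model covering just the classes discussed in this blog. The ordering reflects our reading of the AWS lifecycle documentation, RRS as a source class is deliberately not modeled, and the function name is hypothetical.

```python
# Simplified waterfall, warmest to coldest (assumed ordering).
WATERFALL = ["STANDARD", "STANDARD_IA", "ONEZONE_IA", "GLACIER"]

# Lifecycle rules never move objects INTO these classes.
NEVER_A_TARGET = {"STANDARD", "REDUCED_REDUNDANCY"}


def can_transition(source, target):
    """Return True if a lifecycle rule may move source -> target."""
    if target in NEVER_A_TARGET:
        return False
    if source not in WATERFALL or target not in WATERFALL:
        raise ValueError(f"class not modeled in this sketch: {source} -> {target}")
    # Only moves further down the waterfall are allowed.
    return WATERFALL.index(target) > WATERFALL.index(source)


print(can_transition("STANDARD", "GLACIER"))
print(can_transition("GLACIER", "STANDARD"))
```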

S3 One Zone Infrequent Access (S3 One Zone IA)

S3 One Zone IA is meant for less critical data that may be accessed once a month or once a year, but that you need instantly when you do access it. The difference between One Zone IA and S3 IA is that One Zone IA does not replicate data across multiple Availability Zones. Whatever data you store in S3 One Zone IA is stored in a single Availability Zone to cut costs. It is 20% cheaper than the S3 IA storage class. The trade-off is that if the Availability Zone where the data is stored is destroyed, your data is lost. In other words, the 99.999999999% durability of One Zone IA applies within a single Availability Zone only.

S3 One Zone IA is ideal for non-critical data such as additional copies of backup data or easily re-creatable data. Here again, you are charged a lower price per gigabyte than in S3 IA, plus the same per-gigabyte retrieval fee as S3 IA.

If you wish to back up your data to the S3 One Zone IA storage class, you can easily set it up through ZMC (Storage page) itself.

S3 Reduced Redundancy Storage (RRS)

S3 Reduced Redundancy Storage is meant for non-critical data that you are likely to access frequently. Data stored in this class has lower redundancy than S3 Standard: the durability of RRS is 99.99%.
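Durability percentages are easier to compare when converted into an expected number of objects lost per year. The short sketch below does that arithmetic with Decimal to avoid floating-point noise. The object count of 10 million is just an example figure, and the function name is our own.

```python
from decimal import Decimal


def expected_annual_loss(object_count, durability_percent):
    """Expected number of objects lost per year at the given durability."""
    loss_fraction = (Decimal(100) - Decimal(durability_percent)) / Decimal(100)
    return loss_fraction * object_count


objects = 10_000_000  # example bucket size

# S3 Standard: eleven nines of durability.
print(expected_annual_loss(objects, "99.999999999"))

# RRS: 99.99% durability.
print(expected_annual_loss(objects, "99.99"))
```

For 10 million objects, eleven nines works out to an average of one lost object every 10,000 years, whereas 99.99% works out to roughly 1,000 objects a year, which is why RRS is only suitable for data you can regenerate.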

You can use RRS when you need highly available storage for distributing reproducible content like processed data or thumbnails. That said, AWS considers the S3 Standard class more cost-effective than RRS and recommends using S3 Standard instead. RRS costs slightly more than S3 Standard for the same amount of storage.

If you are backing up reproducible content via Zmanda, then you cannot go wrong with RRS. You get the combined benefit of highly available AWS storage together with user-centric features of Zmanda like the multipart upload API. The Storage screen of ZMC gives you the necessary configuration options for backing up to S3 Reduced Redundancy Storage.

S3 Glacier Flexible Retrieval (formerly S3 Glacier)

S3 Glacier Flexible Retrieval is meant for archiving data that need not be accessed instantly. If saving money while sacrificing speed of access is what you need, S3 Glacier Flexible Retrieval is just right for you. Access is slower because data in this storage class is archived and has to be restored before you can use it. As a result, you usually have to wait minutes to hours (near real-time) before using it.

In spite of the low cost, you still get the same high durability as S3 Standard, as the data stored in this class is replicated across at least three Availability Zones.

Backing up to S3 Glacier Flexible Retrieval is perfect for data archival for compliance reasons, disaster recovery involving large sets of data, and offsite data storage needs. You will be charged lower than S3 One Zone IA for storage, and bulk retrievals are free.

If you are considering other storage options for long-term storage, then you might like to know how tape storage stacks up against S3.

To make archiving easier, Zmanda allows you to enable archiving to Glacier right from ZMC itself. On the Backup Where page, you have a checkbox for the Archive to glacier option, with a Number of days field alongside it. If you want to automatically archive your data stored on AWS S3 to AWS S3 Glacier, check the checkbox and specify the number of days after which the archiving should happen. Once you save this configuration, data backed up by the corresponding backup set will be automatically archived to the S3 Glacier Flexible Retrieval storage class after it has spent the specified number of days in AWS S3 storage. Hence, you need not worry about moving data to the S3 Glacier Flexible Retrieval storage class, as Zmanda will do it for you.

Archive to Glacier option

S3 Vs. Tape

With the way tapes have evolved in capacity, the advantages they offer over S3 in cost and speed cannot be ignored.

Cost

When you choose tape for storing your data, you will spend a lot of your money on setting it up. You have to buy the tape library, media, and drives to set it up. Thankfully, the maintenance and cartridge replacement costs (approximately USD 15) are low. Although the life expectancy of tape increases in a moisture-controlled environment, it is not a necessity.

Enterprises that have a lot of data to store (read petabytes) prefer using tape with Zmanda. Some examples are research facilities, supercomputer labs, weather forecast stations, and universities. Storing such huge amounts of data on S3 would cost a lot and is not advisable. At that scale, the S3 bill would be almost equal to the cost of a tape changer library (approximately USD 18,000).

This huge cost is incurred every month, unlike tape, where it is a one-time investment. To sum it up, if you are storing large volumes of data for longer periods, tapes are better than S3.
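The recurring-versus-one-time argument can be checked with simple arithmetic. The sketch below uses the approximate figures quoted above (USD 18,000 for the library, USD 15 per cartridge, and a monthly S3 bill of about the same size as the library). The cartridge count of 60 is a hypothetical input, and drives, power, and staff time are ignored.

```python
def tape_upfront_cost(library_cost, cartridge_count, cartridge_cost):
    """One-time spend on the tape library plus its cartridges."""
    return library_cost + cartridge_count * cartridge_cost


def months_until_s3_exceeds_tape(s3_monthly_cost, tape_total_cost):
    """Smallest whole number of months after which cumulative S3 spend
    is strictly greater than the one-time tape spend."""
    return tape_total_cost // s3_monthly_cost + 1


# Approximate figures from the text; 60 cartridges is an assumption.
tape_total = tape_upfront_cost(18_000, 60, 15)
print(tape_total)
print(months_until_s3_exceeds_tape(18_000, tape_total))
```

Under these assumptions, the cumulative S3 bill overtakes the tape investment within the first few months, and the gap widens every month after that.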

Please note that when you save the backup on the tapes, you will also be responsible for labeling the tapes and keeping them organized. The responsibility of protecting the tapes from theft and physical damage will also fall on you.

On the cloud front, you need not spend much upfront, as cloud service providers like AWS handle all the infrastructure setup and maintenance. Enterprises that already use many AWS services, such as software firms and small to medium businesses, can get discounts if they use AWS S3 for their storage needs. So, tapes and AWS S3 both offer cost advantages in their own ways. Which one to choose depends on your storage requirements.

Speed

Just like most technologies, tapes have evolved quite a lot. The latest LTO-9 supports a maximum uncompressed speed of 400 MB/s and a maximum compressed speed of 1,000 MB/s. Tapes will only become faster in the coming days. If you need higher speed, you have total flexibility to upgrade your tape library.

On the other hand, the speed of AWS S3 is limited by the bandwidth you have. If you have a large amount of data to be stored on S3, it would be smart to invest in increasing the bandwidth.
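To see how much bandwidth matters, you can estimate how long a backup of a given size takes at a given throughput. The sketch below is a best-case estimate that ignores protocol overhead, compression, and contention. The 10 TB data size and the 1 Gbps link are hypothetical inputs, while 400 MB/s is the LTO-9 uncompressed figure mentioned above.

```python
def mbps_to_mb_per_s(megabits_per_second):
    """Convert a network link speed (megabits/s) to megabytes/s."""
    return megabits_per_second / 8


def transfer_hours(data_tb, throughput_mb_per_s):
    """Hours needed to move data_tb terabytes at the given throughput."""
    data_mb = data_tb * 1_000_000  # 1 TB = 1,000,000 MB (decimal units)
    return data_mb / throughput_mb_per_s / 3600


data_tb = 10  # hypothetical backup size

print(round(transfer_hours(data_tb, 400), 1), "hours to LTO-9 at 400 MB/s")
print(round(transfer_hours(data_tb, mbps_to_mb_per_s(1000)), 1),
      "hours to S3 over a 1 Gbps link")
```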

Zmanda allows you to optimize bandwidth usage through client parallel backups. You can set this in the Client Parallel Backups field of the Backup How page. A value of 1 in the Client Parallel Backups field backs up all sources on a client sequentially. You can raise the value if the client has the necessary CPU and network resources.
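The idea behind such a setting can be illustrated with a small thread pool. This is a conceptual sketch only and not how Zmanda implements the feature: back_up is a stand-in that just sleeps, and the parameter name client_parallel_backups simply mirrors the field label.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def back_up(source):
    """Stand-in for backing up one source; sleeps to simulate I/O."""
    time.sleep(0.05)
    return f"{source}: done"


def run_backups(sources, client_parallel_backups=1):
    """Back up sources using at most client_parallel_backups workers.

    With a value of 1, the sources are processed one after another.
    """
    with ThreadPoolExecutor(max_workers=client_parallel_backups) as pool:
        # map() returns results in the same order as the input sources.
        return list(pool.map(back_up, sources))


sources = ["/etc", "/home", "/var/lib"]
print(run_backups(sources, client_parallel_backups=1))  # sequential
print(run_backups(sources, client_parallel_backups=3))  # all at once
```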

Please note that client parallel backups are not feasible for tape storage as it does not support parallel writes.

If you want high-speed retrieval from S3 for restores, you can store your critical data in high-speed classes.

With the speed aspect covered, we shall next analyze the security angle.

Security

AWS S3 offers you powerful access management tools and encryption to secure your data. The S3 Block Public Access features allow you to restrict public access to your objects at the account or bucket level.

AWS S3 allows you total freedom to configure who can access what by using the following features:

  • Access Control Lists to authorize users to access individual objects.
  • Bucket policies to set up permissions for objects within an Amazon S3 bucket.
  • Query String Authentication to give users temporary access (with time constraints) via temporary URLs.
  • AWS Identity and Access Management to create users and configure their access.
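As an illustration of two of these controls, the sketch below builds a bucket policy that denies any request not sent over HTTPS, along with the four Block Public Access settings. It only constructs and prints the JSON. The bucket name is hypothetical, and applying the documents to a real bucket (via the console, CLI, or an SDK) is left out.

```python
import json


def https_only_policy(bucket):
    """Bucket policy that denies all S3 actions over plain HTTP."""
    arn = f"arn:aws:s3:::{bucket}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [arn, f"{arn}/*"],  # the bucket and its objects
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            }
        ],
    }


# The four Block Public Access settings, all switched on.
BLOCK_PUBLIC_ACCESS = {
    "BlockPublicAcls": True,
    "IgnorePublicAcls": True,
    "BlockPublicPolicy": True,
    "RestrictPublicBuckets": True,
}

# "example-backup-bucket" is a made-up name for illustration.
print(json.dumps(https_only_policy("example-backup-bucket"), indent=2))
print(json.dumps(BLOCK_PUBLIC_ACCESS, indent=2))
```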

Coming to tapes, you have the unique ability to air-gap your data. Tapes are not connected to a network unless they are being written to or read from. So, the tapes that hold your data need not sit in the drives all the time.

To corrupt the data stored on a tape, the tape has to be inserted into a drive first. So, cybercriminals who steal or corrupt data via the internet will have a tough time accomplishing their goals. For added protection, you can encrypt the data before storing it on tapes.

With a clear understanding of the AWS S3 vs. the tape storage, choosing a backup target for your needs is relatively easy.

The Way Forward With Zmanda

AWS S3 storage classes are unique and can serve best when selected wisely. Now that you have a clear understanding of how each storage class of AWS S3 is unique, you can make the call for your backup storage. Once the storage class is finalized, Zmanda can take care of configurations and backup handling for you.

Whether it is tapes or AWS S3, or a combination of both as a 3-2-1 storage strategy, Zmanda is a perfect fit for it all. Our user-friendly ZMC console can provide seamless integration between your shortlisted media and data. It is a one-stop destination for all backup and recovery requirements.

Zmanda will continue to bring forth new features that will make your backup and recovery even easier. Customer-driven enhancements are always part of our quarterly releases. Archive to Glacier, immutable backups, cloud backups stabilization, etc., were all developed with user convenience in mind. To discover how versatile Zmanda truly is, take a free trial. If you need help with Zmanda, feel free to reach out to our support team or drop a mail to our sales team. Our team of experts will be happy to help you through your Zmanda journey.