Archive for January, 2010

Fast Backups of MySQL Running on Amazon EC2

Saturday, January 23rd, 2010

If you are running your MySQL databases on the Amazon EC2 compute cloud, Zmanda Recovery Manager (ZRM) for MySQL can perform fast full backups of these databases by using Elastic Block Store (EBS) Snapshots. ZRM takes only a momentary read lock on the MySQL database during the creation of the snapshot, in order to ensure consistency of the backed up database archive. MySQL Backups using Amazon EBS snapshots are differential backups, meaning that only the blocks that have changed since your last full backup (via EBS snapshot) will be saved. For example, if you have a database with 100 GBs of data, but only 5 GBs of data has changed since your last snapshot, only the 5 additional GBs of snapshot data will be stored back to Amazon S3 during the current full backup run.

EC2 to S3 mysql backup diagram

ZRM automatically deletes EBS snapshots (containing full backups of MySQL) according to the configured retention policy. Just like other snapshot based full backups, ZRM intelligently correlates EBS Snapshots with incremental backups using MySQL logs, enabling you to recover your MySQL instances running on EC2 to any point in time.

Backups made using EBS snapshots can be recovered on the original EC2 instance or on a new EC2 instance. This also provides a quick and convenient mechanism to instantiate new MySQL database servers based on the database state from a desired point-in-time.

ZRM can run on the same EC2 instance as the MySQL database. On the other hand, if you have multiple EC2 instances with MySQL databases, you can run ZRM on one centralized EC2 instance dedicated for backup purposes. In this case, backup configuration and management for all MySQL databases is performed via Zmanda Management Console from this centralized backup server.

We have created an Amazon Machine Image (AMI) with ZRM pre-configured. This makes implementation of a MySQL backup solution on the cloud even simpler. We have used the “EC2 Small Instance” - which is powerful enough to backup most MySQL workloads in the cloud. This also makes it a very cost-effective option. This AMI is available to all ZRM customers, as part of the ZRM Enterprise subscription. You will need to create your own Amazon EC2 account, and pay standard per hour price to Amazon to run an instance based on this AMI. Note that you can configure your backup server instance to run only during the backup window. So, if you are backing up your databases once a week, and your backups takes less than an hour, then you can have this instance up only during that hour. EC2 pricing is per instance-hour consumed from the time an instance is launched until it is terminated. Each partial instance-hour consumed will be billed as a full hour. In addition to the EC2 compute capacity, you will pay standard storage charges for Amazon S3 (to store EBS Snapshots).

Join us on January 28th for a webinar on MySQL Backups (hosted by Sun/MySQL). Along with an introduction to Zmanda Recovery Manager, we will also discuss backing up MySQL applications on the cloud, and demonstrate the new ZRM AMI.

Red Hat Enterprise Linux and Amanda Enterprise: IT Manager’s Backup Solution

Thursday, January 14th, 2010

A backup server represents a very important component of any IT infrastructure. red hat logoYou need to pick the right components to implement a scalable, robust and secure backup server. The choice of the operating system has crucial implications. Red Hat Enterprise Linux (RHEL) provides many of the features needed from an ideal OS for a backup server. Some of these include:

Virtualization: RHEL includes a modern hypervisor (Red Hat Enterprise Virtualization Hypervisor) based on the Kernel-Based Virtual Machine (KVM) technology.  Amanda backup server can be run as a virtual machine on this hypervisor. This virtual backup server can be brought up as needed. This provides optimal resource management, e.g. you can bring up the backup server just at the time of backup window or for restores. A virtualized backup server also makes it much more flexible to change the resource levels depending on the business needs, e.g. if more oomph is needed from the backup server prior to a data center move.

High I/O Throughput:
Backup server represents huge I/Os, typically characterized by large sequential writes. RHEL, both as real and virtual system, provides high I/O throughput needed for a backup server workload. RHEL 5 allows for switching I/O schedulers on-the-fly. So, a backup administrator can fine tune I/O activity to match with higher level function (e.g. write-heavy backups vs. read-heavy restores).

Security: Securing a backup server is critical in any overall IT security planning. In a targeted attack, a backup server provides a juicy target because data that is deemed to be important by an organization can be had from one place. Security-Enhanced Linux (SELinux) in RHEL implements a variety of security policies, including U.S. Department of Defense style mandatory access controls, through the use of Linux Security Modules (LSM) in the Linux kernel. Amanda supports RHEL SELinux configuration. It allows users to run backup server in a secure environment.

Scalable Storage:
Storage technologies built into RHEL provide scalability needed from backup storage. The Ext3 filesytem supports up to 16TB file systems. Logical Volume Manager (LVM) allows for backup storage on a pool of devices which can be added to when needed. System administrators can also leverage Global File System (GFS) to provide backup server direct access to data to be backed up, by-passing the production network.

Compatibility: RHEL is found on compatibility matrix of any modern secondary storage device - whether it be a tape drive, tape library or a NAS device. RHEL also supports wide variety of SAN architectures, including iSCSI and Fibre Channel. This, along with Amanda’s use of native drivers to access secondary media, gives IT managers the widest choice in the market for devices to store backup archives.

Manageability: Easy update mechanism, e.g. using yum, from Redhat Network makes it easier for the administrator to keep the backup server updated with latest fixes (including security patches). Amanda depends on some of the system libraries and tools to perform backup and recovery operations. A system administrator can pare down a RHEL environment to only have bare-minimum set of packages needed for Amanda, and then use RHN to keep these packages up-to-date.

Long Retention Lifecycle: Many organizations need to retain their backup archives for several years due to business or compliance reasons. Each version of RHEL comes with seven year support. This combined with open formats used by Amanda Enterprise makes it practical for IT managers to have real long-term retention policies, with a confidence to be able to recover their data several years from now.

starbucks coffee
In summary, if you are in the process of making a choice for your backup server, RHEL should certainly be in the short-list for operating systems, and (yes, we are biased) Amanda in the short-list for backup software.  We will discuss this combination in detail in a webinar on January 21st. Red Hat is warming up this webinar by offering a $10 Starbucks card for every attendee. Join us!