Next Steps with OpenStack Swift Advisor – Profiling and Optimization (with Load Balancer in the Mix)

April 22nd, 2012

In our last blog on building Swift storage clouds, we proposed the framework for the Swift Advisor – a technique that takes two of the three constraints (Capacity, Performance, Cost) as inputs and provides hardware recommendations as output – specifically, the count and configuration of systems for each type of node (storage and proxy) of the Swift storage cloud (Swift Cloud). We also provided a subset of our initial results for the Sampling phase.

In this blog, we continue the discussion of the Swift Advisor, focusing first on the impact of the load balancer on the aggregate throughput of the cloud (which we will refer to as "throughput"), and then providing a subset of outcomes from the profiling and optimization phases in our lab.

Load Balancer

The load balancer distributes the incoming API requests evenly across the proxy servers. As shown below, the load balancer sits in front of the proxy servers to forward the API requests to them and can be connected with any number of proxy servers.

load balancer

If a load balancer is used, it is the only entry point of the Swift Cloud, and all user data passes through it, making it a critical component for the user-visible performance of your Swift Cloud. If it is not properly provisioned, it will become a severe bottleneck that inhibits the scalability of the Swift Cloud.

At a high-level, there are two types of load balancers:

Software Load Balancer: Runs load-balancing software (e.g. Pound, Nginx) or round-robin DNS on a server to distribute requests evenly among the proxy servers. The server running a software load balancer usually requires powerful multi-core CPUs and very high network bandwidth.

Hardware Load Balancer: Leverages a network switch/firewall or dedicated hardware with load-balancing capability to distribute the incoming data traffic across the proxy servers of the Swift Cloud.

Regardless of whether a software or hardware load balancer is used, the throughput of the Swift Cloud cannot scale beyond the bandwidth of the load balancer. Therefore, we advise cloud builders to deploy a powerful load balancer (e.g. with 10 Gigabit Ethernet) so that its "effective" bandwidth exceeds the expected throughput of the Swift Cloud. We recommend picking your load balancer so that with a fully loaded (i.e. 100% busy) Swift Cloud, the load balancer still has around 50% unused capacity for future growth or sudden spikes in bandwidth demand.
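The 50% headroom rule above can be turned into a quick sizing check. This is a minimal sketch with hypothetical numbers, not a prescribed formula:

```python
def required_lb_bandwidth(expected_throughput_mb_s, headroom=0.5):
    # Size the load balancer so that with the Swift Cloud at 100% load,
    # `headroom` fraction of the LB's capacity is still unused.
    return expected_throughput_mb_s / (1.0 - headroom)

# A Swift Cloud expected to sustain 600 MB/s would call for a load
# balancer with ~1200 MB/s of effective bandwidth, i.e. 10 GbE class.
print(required_lb_bandwidth(600))  # -> 1200.0
```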

To get a sense of how to properly provision the load balancer and how it impacts the throughput of the Swift Cloud, we show some results from running a Swift Cloud of c proxy and cN storage servers (c:cN Swift Cloud) behind a load balancer. (N is the "magic" value for the 1:N Swift Cloud found in the Sampling phase.) These results form the "performance curves" for the profiling phase and can be used directly for optimizing your goal.

The experiments

In our last article, we used some running examples to show how to derive the outputs of the Sampling phase. Here, we directly use those outputs (1:N Swift Clouds) as the inputs of the profiling phase, as seen below:

  • 1 Large Instance based proxy node: 5 Small Instance based storage nodes (N=5)
  • 1 XL Instance based proxy node: 5 Small Instance based storage nodes (N=5)
  • 1 CPU XL Instance based proxy node: 5 Small Instance based storage nodes (N=5)
  • 1 Quad Instance based proxy node: 5 Medium Instance based storage nodes (N=5)

Based on the above 1:5 Swift Clouds, we profile the throughput curves of the c:c5 Swift Cloud (c = 2, 4, 6, …) with the following load balancer setups:

  1. Using one "Cluster Compute Eight Extra Large Instance" (Eight) running Pound (a reverse proxy and load balancer) as the software load balancer ("1 Eight"), to which all proxy nodes are connected. (The Eight Instance is one level more powerful than the Quad Instance. Like the Quad Instance, it is equipped with 10 Gigabit Ethernet, but it has 2X the CPU resources (2 x Intel Xeon E5-2670, eight-core "Sandy Bridge" architecture) and 2X the memory.)
  2. Using two identical Eight Instances (each running Pound) as the load balancers ("2 Eight"). 50% of the proxy nodes are connected to the first Eight Instance and the other 50% to the second. The storage nodes are unaware of this split and accept data from all of the proxy nodes.
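For reference, a software load balancer such as Pound is configured by listing each proxy node as a backend. The sketch below is illustrative only; the addresses and ports are placeholders, and the exact directives should be checked against your Pound version's documentation:

```
ListenHTTP
    Address 0.0.0.0
    Port    8080

    Service
        # One BackEnd block per Swift proxy node
        BackEnd
            Address 10.0.1.10
            Port    8080
        End
        BackEnd
            Address 10.0.1.11
            Port    8080
        End
    End
End
```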

Again, we use Amanda Enterprise as our application to back up a 20GB data file to the c:c5 Swift Cloud. We concurrently run two Amanda Enterprise servers on two EC2 Quad instances to send data to the c:c5 Swift Cloud, ensuring that the two servers can fully load the c:c5 Swift Cloud in all cases.

For this experiment, we focus on backup operations, so "throughput" (MB/s) simply refers to the aggregate throughput of backup operations measured between the two Amanda Enterprise servers and the c:c5 Swift Cloud.

Let's first look at the throughput curves (throughput on the Y-axis, values of c on the X-axis) of the c:c5 Swift Cloud with the two types of load balancers, for each of the above-mentioned configurations of proxy and storage nodes.

(1) Proxy nodes run on the Large instance and the storage nodes run on the Small instance. The two curves are for the two types of load balancers (LB):

Proxy nodes run on the Large instance

(2) Proxy nodes run on the XL instance and the storage nodes run on the Small instance.

Proxy nodes run on the XL instance

(3) Proxy nodes run on the CPU XL instance and the storage nodes run on the Small instance.

Proxy nodes run on the CPU XL instance

(4) Proxy nodes run on the Quad instance and the storage nodes run on the Medium instance.

Proxy nodes run on the Quad instance

From the above 4 figures, we can see that the throughput of the c:c5 Swift Cloud using 1 Eight instance as the load balancer cannot scale beyond 140MB/s, while with 2 Eight instances as load balancers, the c:c5 Swift Cloud scales linearly (for the values of "c" we tested).

Next, we combine the above results for the "2 Eight" load balancer into one picture and look at it from another point of view: throughput on the Y-axis, cost ($) on the X-axis. (As you may recall from our last blog, cost is defined as the EC2 usage cost of running the c:c5 Swift Cloud for 30 days.)

Throughput vs. cost with the "2 Eight" load balancer

The above graph tells us several things:

(1) The configuration of using CPU XL instances for proxy nodes and Small instances for storage nodes is not a good choice: compared with the configuration of XL instances for proxy nodes and Small instances for storage nodes, it costs about the same but delivers lower throughput. The reason is that, in our observation, XL instances provide better network bandwidth than CPU XL instances. AWS marks the I/O performance (including network bandwidth) of both the XL and CPU XL instances as "High", but in our pure network bandwidth testing, the XL instance showed a maximum of 120 MB/s for both incoming and outgoing bandwidth, while the CPU XL instance peaked at 100 MB/s for both.

(2) The configuration of using Large instances for proxy nodes and Small instances for storage nodes is the most cost-effective: within each throughput group (marked as a dotted circle in the figure: low, medium and high), it achieves similar throughput at a much lower cost. This configuration is cost-effective because the Large instance provides a maximum of 100 MB/s for both incoming and outgoing network bandwidth, similar to the XL and CPU XL instances, but at roughly half their cost.

(3) While using Large instances for proxy nodes and Small instances for storage nodes is very cost-effective, the configuration of using Quad instances for proxy nodes and Medium instances for storage nodes is also an attractive option, especially if you consider manageability and failure handling. To achieve 175MB/s throughput, you can choose either 8 Large instance based proxy nodes and 40 Small instance based storage nodes (48 nodes total), or 4 Quad instance based proxy nodes and 20 Medium instance based storage nodes (24 nodes total). Hosting and managing more nodes in the data center may incur higher IT-related costs, e.g. power, number of server racks, failure rate and IT administration. Considering those costs, it may be more attractive to set up a Swift Cloud with a smaller number of more powerful nodes.

Based on the data in the above figure and considering the IT-related costs, the optimization phase chooses the configuration that best satisfies your goal. For example, suppose you input performance and capacity constraints and want to minimize cost, and that two configurations, (1) Large instances for proxy nodes with Small instances for storage nodes, and (2) Quad instances for proxy nodes with Medium instances for storage nodes, both satisfy your capacity constraint. The remaining question is which configuration fulfills the throughput constraint at lower cost. The answer depends on your IT management costs: if they are relatively high, you may want to choose the second configuration; otherwise, the first will likely cost less.
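The trade-off above can be sketched numerically. This is a toy comparison; the per-node IT overhead and EC2 costs below are made-up figures for illustration, not measurements from our experiments:

```python
def total_monthly_cost(ec2_cost, node_count, per_node_it_cost):
    # EC2 usage cost plus a per-node IT overhead (power, racks, admin).
    return ec2_cost + node_count * per_node_it_cost

# 8 Large proxies + 40 Small storage nodes (48 nodes) vs.
# 4 Quad proxies + 20 Medium storage nodes (24 nodes),
# both meeting the same 175 MB/s throughput goal:
cheap_it  = (total_monthly_cost(3000, 48, 10),  total_monthly_cost(3600, 24, 10))
costly_it = (total_monthly_cost(3000, 48, 100), total_monthly_cost(3600, 24, 100))
print(cheap_it)   # -> (3480, 3840): the 48-node build is cheaper
print(costly_it)  # -> (7800, 6000): the 24-node build is cheaper
```

With cheap IT overhead the larger Large/Small build wins; as per-node overhead rises, the smaller Quad/Medium build becomes the better deal.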

In future articles, we will discuss how to map the EC2 instances to physical hardware so that cloud builders can build an optimized Swift Cloud running on physical servers.

If you are thinking of putting together a storage cloud, we would love to discuss your challenges and share our observations. Please drop us a note at

OpenStack Swift Advisor: Building Cloud Storage with Optimized Capacity, Cost and Performance

April 18th, 2012

OpenStack Swift is an open source cloud storage platform, which can be used to build massively scalable and highly robust storage clouds. There are two key use cases of Swift:

  • A service provider offering cloud storage with a well-defined RESTful HTTP API – i.e. a Public Storage Cloud. An ecosystem of applications integrated with that API is offered to the service provider's customers. The service provider may also choose to offer only a select service (e.g. Cloud Backup) and not offer direct access to the API.
  • A large enterprise building a cloud storage platform for internal applications – i.e. a Private Storage Cloud. The organization may do this because it is reluctant to send its data to a third-party public cloud provider, or because it wants a cloud storage platform that is closer to the users of its applications.

In both of the above cases, as you plan to build your cloud storage infrastructure, you will face one of these three problems:

  1. Optimize my cost: You know how much usable storage capacity you need from your cloud storage, and you know how much aggregate throughput you need for applications using the cloud storage, but you want to know what is the least amount of budget you need to be able to achieve your capacity and throughput goals.
  2. Optimize my capacity: You know how much aggregate throughput you need for applications using the cloud storage, and you know your budget constraints, but you want to know the maximum capacity you can get for your throughput needs and budget constraints.
  3. Optimize my performance: You know how much usable storage capacity you need from your cloud storage, and you know your budget constraints, but you need to know the configuration to get best aggregate throughput for your capacity and budget constraints.
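The three problems share one shape: fix two constraints, optimize the third. A minimal sketch of that contract (the function name and parameter names are hypothetical, not part of any actual tool):

```python
def advisor_goal(capacity_tb=None, throughput_mb_s=None, budget_usd=None):
    # Supply exactly two of the three constraints; the third is the
    # quantity the Swift Advisor will optimize.
    constraints = dict(capacity_tb=capacity_tb,
                       throughput_mb_s=throughput_mb_s,
                       budget_usd=budget_usd)
    given = [k for k, v in constraints.items() if v is not None]
    if len(given) != 2:
        raise ValueError("provide exactly two of the three constraints")
    (missing,) = [k for k in constraints if k not in given]
    return "optimize " + missing

print(advisor_goal(capacity_tb=500, throughput_mb_s=200))  # -> optimize budget_usd
```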

Solving any of the three problems above is complex because of the myriad choices the cloud storage builder has to make, e.g. size and number of various types of servers, network connectivity, SLAs, etc. We have done extensive work in our labs and with several cloud providers to understand these problems and address them with rigorous analysis. In this series of blogs we will provide some of our findings, as well as descriptions of tools and services that can help you build, deploy and maintain your storage cloud with confidence.

Definitions

Since the terms used can be interpreted differently depending on context, below are the specific definitions used in this series of blogs for the three key parameters:

Capacity: The usable storage capacity, i.e. the maximum amount of application data that can be stored on the cloud storage. Usually, for better availability and durability, data is replicated across multiple systems in the cloud storage, so the raw capacity of the cloud storage should be planned with data redundancy in mind. For example, in OpenStack Swift, each object is replicated three times by default, so the total raw storage will be at least three times larger than the usable storage capacity.
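The replication arithmetic above, as a one-line sanity check:

```python
def raw_capacity_needed(usable_tb, replicas=3):
    # Swift keeps `replicas` copies of each object (3 by default),
    # so raw capacity must be at least replicas x usable capacity.
    return usable_tb * replicas

print(raw_capacity_needed(100))  # 100 TB usable -> 300 TB of raw storage
```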

Performance: It is the maximum aggregate throughput (MB/s or GB/s) that can be achieved by applications from the cloud storage. In this blog, we will also use the term throughput to denote aggregate throughput.

Cost: For this discussion we only consider the initial purchase cost of the hardware for building the cloud storage. We expect the cloud storage to be in use for several years, but we do not amortize the cost over time. We will point out best practices to reduce ongoing maintenance and scaling costs. For this series of blogs we use the terms "node" and "server" interchangeably; "storage node" is the same as "storage server".

Introducing the framework for the Swift Advisor

The Swift Advisor is a technique that takes two of  the three constraints (Capacity, Performance, Cost) as  inputs,  and provides hardware recommendation as output, specifically count and configuration of systems for each type of node (storage and proxy) of  the Swift storage cloud. This recommendation is optimized for the third constraint: e.g. minimize  your budget, maximize your throughput, or maximize your usable storage capacity.

Before discussing the technical details of the Swift Advisor, let's first look at a practical way to use it. To build an optimized Swift cloud storage (Swift Cloud), an important feature of the Swift Advisor is that it considers a very large range of hardware configurations (e.g. a wide variety of CPU, memory, disk and network choices). However, it is unrealistic and very expensive to blindly purchase a large amount of physical hardware upfront and let the Swift Advisor evaluate its individual and combined performance. Therefore, we leverage the virtualized, elastic environment offered by Amazon EC2 and initially build an optimized Swift Cloud on EC2 instances.

While it may seem ironic that we are using a public compute cloud to design an optimized private storage cloud, the reasons for choosing EC2 as the test-bed for the Swift Advisor are multi-fold: (1) EC2 provides many types of instances with different capacities of CPU, memory and I/O. The Swift Advisor can try out many instance types on a pay-per-use basis, instead of physically acquiring the wide variety of hardware needed. (2) EC2 has a well-defined pricing structure. This provides a good comparison point for cloud storage builders: they can look at the pricing information and justify the cost of owning their own cloud storage in the long run. (3) The specification of each EC2 instance type, including CPU, memory, disk and network, is well defined. Once an optimized Swift Cloud is built on EC2 instances under the input constraints, the specifications of those instances can effectively guide the purchase of physical servers for a Swift Cloud running on physical hardware. In summary, you can use the elasticity of a compute cloud along with the Swift Advisor to get specifications for your physical-hardware-based storage cloud, while preserving your desired constraints.

The high-level workflow of the Swift Advisor is shown below:

The high-level workflow of the Swift Advisor

There are four important phases, which we explain as follows:

Sampling Phase: Our eventual goal is to build an optimized Swift Cloud consisting of quantity A of proxy servers and quantity B of storage servers; A and B are unknown initially, and we denote this as an A:B Swift Cloud. In this first phase we focus on the performance and cost characteristics of the 1:N Swift Cloud. We look for the "magic" value of N that gives a 1:N Swift Cloud the lowest cost per throughput ($ per MB/s). Finding the 1:N Swift Cloud with the lowest $ per MB/s avoids two potential pitfalls when building a Swift Cloud: (1) Under-provisioning: the proxy server is under-utilized and could serve more storage servers to improve throughput. (2) Over-provisioning: the proxy server is overwhelmed by too many storage servers.
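The selection rule of the Sampling phase can be sketched as follows. The measured (throughput, cost) points below are hypothetical stand-ins for real benchmark runs:

```python
# For each sampled N of a 1:N Swift Cloud, record (throughput MB/s,
# 30-day cost $) and pick the N with the lowest $ per MB/s.
samples = {
    5:  (60.0, 1200.0),
    10: (100.0, 1900.0),
    15: (110.0, 2600.0),
}

def magic_n(samples):
    return min(samples, key=lambda n: samples[n][1] / samples[n][0])

print(magic_n(samples))  # -> 10 ($19 per MB/s beats $20 and ~$23.6)
```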

Since the combinatorial space of storage and proxy node choices is potentially huge, we use several heuristics to prune candidates during the various phases of the Swift Advisor. For example, we do not consider very low-powered configurations (e.g. Micro Instances) for proxy nodes.

After the sampling phase, for each combination of EC2 instance sizes on proxy and storage servers, we know the “magic” value of N that produces the lowest $ per MB/s of running a 1:N Swift cloud. You can run the sampling phase on any available virtual or physical hardware, but the larger the sample set the better.

Profiling Phase: Given the "magic" values of N from the sampling phase, our goal in this phase is to profile the throughput curves (throughput versus the size of the Swift Cloud) of several Swift Clouds consisting of c proxy servers and cN storage servers (c:cN Swift Cloud) for various values of c.

Note that each throughput curve corresponds to one combination of hardware configurations (EC2 instance sizes in our case) of the proxy and storage servers. In our experiments, for each combination of EC2 instance sizes, profiling starts from the 2:2N Swift Cloud, and we double the number of proxy and storage servers each time (e.g. 4:4N, 8:8N, …). All cN EC2 instances for storage nodes are identical.

Profiling stops when the throughput of the c:cN Swift Cloud exceeds the throughput constraint. After that, we apply a linear or non-linear regression to the profiled throughputs to plot a throughput curve with c on the X-axis and throughput on the Y-axis. The output of the profiling phase is a set of throughput curves of c:cN Swift Clouds, where each curve corresponds to one combination of EC2 instance sizes of the proxy and storage servers.
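The regression step can be sketched with plain least squares. The profiled points below are hypothetical and happen to lie on a line; a real curve may need a non-linear fit:

```python
# Fit a straight throughput curve (MB/s vs. c) to profiled points,
# then extrapolate to larger clusters.
def fit_line(points):
    n = len(points)
    sx = sum(c for c, _ in points)
    sy = sum(t for _, t in points)
    sxx = sum(c * c for c, _ in points)
    sxy = sum(c * t for c, t in points)
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    return slope, (sy - slope * sx) / n

# Hypothetical profiled throughputs for c = 2, 4, 8 proxies:
slope, intercept = fit_line([(2, 50.0), (4, 100.0), (8, 200.0)])
print(slope * 16 + intercept)  # predicted throughput at c = 16 -> 400.0
```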

Optimization Phase: Taking the throughput curves from the profiling phase and the two input constraints, the optimization phase is where we figure out a Swift Cloud optimized for the third parameter. We do this by plotting the constraints on each throughput curve and looking for the optimized value across all curves.

For example, let's say we are trying to optimize capacity given a maximum budget and a minimum throughput requirement: we plot the minimum required throughput on each throughput curve to find the corresponding value of c, then reject the curves where the implied hardware cost exceeds the budget. Out of the remaining curves, we select the one yielding the maximum capacity, computed as cN times the storage capacity of the system used for the storage server.
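That selection procedure can be sketched directly. The curve parameters below (MB/s per c, $ per c, TB per c) are hypothetical linear curves, only to make the mechanics concrete:

```python
def optimize_capacity(curves, min_mb_s, budget):
    # For each curve: smallest c meeting the throughput floor, drop
    # curves that exceed the budget, keep the one with most capacity.
    best = None
    for name, (mb_s_per_c, cost_per_c, tb_per_c) in curves.items():
        c = -(-min_mb_s // mb_s_per_c)      # ceiling division
        if c * cost_per_c > budget:
            continue                        # over budget: reject this curve
        capacity = c * tb_per_c             # cN * per-server storage capacity
        if best is None or capacity > best[1]:
            best = (name, capacity)
    return best

curves = {"Large/Small": (25, 600, 8), "Quad/Medium": (50, 1400, 10)}
print(optimize_capacity(curves, min_mb_s=175, budget=6000))  # -> ('Large/Small', 56)
```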

Validation and Refinement Phase: The validation phase checks whether the optimized Swift Cloud really conforms to the throughput constraint through a test run of the workloads. If the test run fails a constraint, the Swift Advisor moves to the refinement phase, which takes the average throughput measured in the test run and feeds it back to the profiling phase.

The profiling phase adds that information to the profiled data to refine the throughput curves. We then use the refined throughput curves as inputs to redo the optimization phase. These four phases form the core of the Swift Advisor. However, there are some important remaining issues to discuss:

(1) Choice of the load balancer.

(2) Mapping between EC2 instances and physical hardware, for when cloud operators eventually move the optimized Swift Cloud to physical servers while preserving the three constraints on the new hosting hardware.

(3) SLA constraints.

We will address these and other issues in building an optimized storage cloud for your needs in future blogs.

Some Sampling Observations

In this blog, we present some results from running the Sampling phase on a selected configuration of systems. In future blogs, we will post results for the Profiling and Optimization phases.

For our sampling phase, we assume the following potential servers are available to us for proxy node: EC2 Large (Large), EC2 Extra Large (XL), EC2 Extra Large CPU-high (CPU XL) and EC2 Quadruple Extra Large (Quad). While the candidates for storage node are: EC2 Micro (Micro), EC2 Small (Small) and EC2 Medium (Medium).

Therefore, the total number of combinations of proxy and storage nodes is 4 * 3 = 12, and for each combination we need to find the "magic" value of N that produces the lowest $ per MB/s for a 1:N Swift Cloud. We start the sampling for each combination at N=5 (note that a production Swift Cloud requires at least 5 storage nodes) and increase N until the throughput of the 1:N Swift Cloud stops increasing, which happens when the proxy node is fully loaded and adding more storage nodes cannot improve throughput anymore.
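The stopping rule, "grow N until throughput stops increasing", can be sketched as a small loop. Here `measure` stands in for benchmarking a 1:N cloud, and the throughput table is hypothetical:

```python
def saturation_point(measure, start=5, step=5, tolerance=0.02):
    # Increase N until the next measurement shows no meaningful gain,
    # i.e. the proxy node is saturated.
    n, best = start, measure(start)
    while True:
        t = measure(n + step)
        if t <= best * (1 + tolerance):   # throughput has plateaued
            return n
        n, best = n + step, t

# Hypothetical throughput (MB/s) by N; the curve plateaus at N = 20.
table = {5: 40.0, 10: 70.0, 15: 90.0, 20: 100.0, 25: 101.0, 30: 101.0}
print(saturation_point(table.get))  # -> 20
```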

We use Amanda Enterprise as our application to back up a 10GB data file to the 1:N Swift Cloud. Amanda Enterprise runs on an EC2 Quad instance to ensure that one Amanda Enterprise server can fully load the 1:N Swift Cloud in all cases. For this analysis we assume the cloud builder is building cloud storage optimized for backup operations; users of the Swift Advisor should change the test workload to match the expected mix of application workloads when the cloud storage goes into production. We first look at the throughput for different values of N for each combination of EC2 instance sizes on proxy and storage nodes.

(1) Proxy node runs on EC2 Large instance and the three curves are for the three different sizes for the storage node:

Proxy node runs on EC2 Large instance

Observations with EC2 Large Instance based Proxy Node:

  1. Micro Instance based Storage nodes: Throughput stops increasing at # storage node = 30
  2. Small Instance based Storage nodes: Throughput stops increasing at # storage node = 10
  3. Medium Instance based Storage nodes: Throughput stops increasing at # storage node = 5

(2) Proxy node runs on EC2 XL instance:

Proxy node runs on EC2 XL instance

Observations with EC2 XL Instance based Proxy Node:

  1. Micro Instance based Storage nodes: Throughput stops increasing at # storage node = 30
  2. Small Instance based Storage nodes: Throughput stops increasing at # storage node = 10
  3. Medium Instance based Storage nodes: Throughput stops increasing at # storage node = 5

(3) Proxy node runs on EC2 CPU XL instance:

Proxy node runs on EC2 CPU XL instance

Observations with EC2 CPU XL Instance based Proxy Node:

  1. Micro Instance based Storage nodes: Throughput stops increasing at # storage node = 30
  2. Small Instance based Storage nodes: Throughput stops increasing at # storage node = 10
  3. Medium Instance based Storage nodes: Throughput stops increasing at # storage node = 5

(4) Proxy node runs on EC2 Quad instance:

Proxy node runs on EC2 Quad instance

Observations with EC2 Quad Instance based Proxy Node:

  1. Micro Instance based Storage nodes: Throughput stops increasing at # storage node = 60
  2. Small Instance based Storage nodes: Throughput stops increasing at # storage node = 20
  3. Medium Instance based Storage nodes: Throughput stops increasing at # storage node = 10

Looking at the above graphs, we can already draw some conclusions. For example, if the only storage nodes available to you were equivalent to the EC2 Micro Instance and you wanted your storage cloud to scale beyond 30 storage nodes (per proxy node), you should pick at least an EC2 Quad Instance equivalent proxy node. Let's look at figures (1) – (4) from another view: fix the EC2 instance size of the storage node and vary the EC2 instance size of the proxy node.

(5) Storage node runs on EC2 Micro instance and the four curves are for the four different EC2 instance sizes on the proxy node:

Observations with EC2 Micro Instance based Storage Node:

  1. Large Instance based Proxy nodes: Throughput stops increasing at # storage node = 30
  2. XL Instance based Proxy nodes: Throughput stops increasing at # storage node = 30
  3. CPU XL Instance based Proxy nodes: Throughput stops increasing at # storage node = 30
  4. Quad Instance based Proxy nodes: Throughput stops increasing at # storage node = 60

From the above graphs, we can conclude that (a) when the proxy node runs on the Quad instance, it has the capability, especially the network bandwidth, to accommodate more storage nodes and achieve higher throughput (MB/s) than the other instance types used for the proxy node; and (b) different EC2 instance sizes for storage nodes load the same proxy node at different rates: for example, when the proxy node runs on the Quad instance, we need 60 Micro instances as storage nodes to fully load it.

With Small or Medium instances as storage nodes, we need only 20 or 10 storage nodes, respectively, to fully load the proxy node. Based on the above throughput results, we now look at the $ per throughput (MB/s) for different values of N for each combination of EC2 instance sizes on proxy and storage nodes. Here, $ is defined as the EC2 usage cost of running the 1:N Swift Cloud for 30 days. In this blog we are only showing numbers with the proxy node set to the EC2 Quad Instance; we will publish numbers for the other combinations in a more detailed report.
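The $/MB/s metric spelled out as arithmetic. The hourly rates and throughput below are hypothetical, not actual EC2 prices:

```python
def dollars_per_mb_s(proxy_rate, storage_rate, n, throughput_mb_s):
    # 30 days of usage cost for one proxy plus N storage nodes,
    # divided by the measured aggregate throughput.
    monthly_cost = (proxy_rate + n * storage_rate) * 30 * 24
    return monthly_cost / throughput_mb_s

# e.g. $1.30/hr proxy, ten $0.16/hr storage nodes, 100 MB/s measured:
print(round(dollars_per_mb_s(1.30, 0.16, 10, 100.0), 2))  # -> 20.88
```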

(6) Proxy node runs on EC2 Quad instance:

Proxy node runs on EC2 Quad instance

Observations with EC2 Quad Instance based Proxy Node:

  1. Micro Instance based Storage nodes: The lowest $ per MB/s is achieved at # storage node = 60
  2. Small Instance based Storage nodes: The lowest $ per MB/s is achieved at # storage node = 15
  3. Medium Instance based Storage nodes: The lowest $ per MB/s is achieved at # storage node = 5

Overall, the lowest $ per MB/s in the above figure is achieved by using Medium Instance based storage nodes at # storage node = 5. This specific result provides inputs to the profiling phase of N=5, 15 and 60 for the proxy/storage node combinations EC2 Quad/Medium, EC2 Quad/Small and EC2 Quad/Micro, respectively.

So, one can conclude that when using one Quad Instance based proxy node, it may be better to use 5 Medium based storage nodes to achieve the lowest $ per MB/s, rather than using a larger number of Micro Instance based storage nodes. The above graphs are a small subset of the overall performance numbers gathered during the Sampling phase.

The overall objective here is to give you a summary of our recommended approach to building an optimized Swift Cloud. As mentioned above, we will publish detailed results in another report, as well as more conclusions and best practices in future blogs in this series.

If you are thinking of putting together a storage cloud, we would love to discuss your challenges and share our observations. Please drop us a note at

MySQL Backup Updated

April 10th, 2012

As MySQL continues to grow (as a technology and as an ecosystem), the need for robust MySQL backup solutions grows as well. In many circles Zmanda is known as "The MySQL Backup Company". While we provide backup of a wide variety of environments, we gladly take the label of backing up the most popular open source database in the world, especially as we kick off our presence at the 2012 MySQL Conference.

Here are some of the updates to our MySQL backup technologies that we are announcing at the conference:

Announcing Zmanda Recovery Manager 3.4

We have updated the popular Zmanda Recovery Manager (ZRM) for MySQL product for scalability. Our customers continue to deploy ZRM to backup ever larger MySQL environments. Some of the scalability features include: Better support for hundreds of backup sets within one ZRM installation, support for more aggressive backup schedules, better support for site-wide templates, and deeper integration with NetApp’s snapshot mechanisms. We have also added support for the latest versions of XtraBackup and MySQL Enterprise Backup. We have also added experimental support for backing up Drizzle (via XtraBackup). If you are deploying Drizzle in your environment, we are looking for beta customers.

Many of our customers store their MySQL databases on NetApp storage. ZRM can be used in conjunction with NetApp Snapshot and SnapVault products to create database-consistent backups without moving the data out of NetApp storage. ZRM creates snapshots of MySQL database volumes, which it can then move to another NetApp storage system using NetApp SnapVault. SnapVault moves the data efficiently between various NetApp filers. This gives customers a way to protect their backups without impacting the corporate LAN. ZRM uses SnapRestore functionality to quickly restore the databases in case of a failure.

Announcing MySQL Backup Agent for Symantec NetBackup

If you have Symantec NetBackup deployed in your environment, and you would like to consolidate your MySQL backups under the umbrella of your NetBackup based backup infrastructure, you now have a well integrated solution. We have released a MySQL Backup Agent, which is deeply integrated with Symantec NetBackup. This agent allows you to perform live backups of your MySQL databases directly from your MySQL servers to your NetBackup server.

NetBackup MySQL Agent

Backup of your MySQL databases to the Cloud

Public or private cloud storage is a great choice for offsite storage of backup archives. You can also use compute clouds as an inexpensive DR site for your MySQL databases. For MySQL databases running on Windows, our Zmanda Cloud Backup product provides a very easy and inexpensive way to back up to Amazon S3 or Google Cloud Storage.

If you have MySQL databases running on Linux or heterogeneous environments, you have two choices for backing up to the cloud: you can use our Amanda Enterprise product with the Amazon S3 or Google Cloud Storage option to move backup images created by ZRM to the cloud, or you can use the recently released AWS Storage Gateway in conjunction with ZRM.

ZRM Backing Up To AWS Gateway Storage

We have published an integration report (available on Zmanda Network under the MySQL Backup section – free registration required) to show how you can deploy AWS Gateway to asynchronously upload backup files created by ZRM to Amazon S3.

As you can see, we have been busy updating our MySQL backup solutions. All of the above improvements and feature additions have been made based on feedback from MySQL DBAs. If you are visiting the MySQL user conference this week, please do visit us at our booth – we would love to understand and discuss your MySQL backup challenges.

Cloud Backup Your Way! (Releasing ZCB 4.1)

February 27th, 2012

Today, we released ZCB 4.1, a major update to our prior version 4.0.2. In addition to significant polish and general fixes, ZCB 4.1 has several features requested by our customers.

Here is a walkthrough:

Better utilize your Internet bandwidth based on your work schedule

Traditionally, data backup is seen as an activity to be completed during nights or weekends – when users are not actively using their systems. But today, this practice is difficult to follow for two main reasons. First, the available time window for backups has shrunk, as people now work from different offices at different times of the day. Second, with improvements in Internet speeds still lagging behind the growth in data volumes, hoping to upload everything over the weekend is becoming impossible.

So how does one cope with this changed reality? While we can’t solve the problem of lack of bandwidth, we can try dividing it better between production work and backups. And this is exactly what ZCB 4.1 offers through a new feature that lets you specify bandwidth throttling limits down to the granularity of 15-minute intervals of the day. This means you can control ZCB’s bandwidth usage to exactly fit your environment’s unique network utilization pattern.

To help you understand how this feature can be used, we have also included intelligent predefined templates such as “Throttle on weekdays” and “Gradual throttle on weekdays”. The latter template, for example, limits ZCB’s bandwidth usage during peak weekday hours and relaxes the limits as the day draws to a close. This is shown in the figure below:

Time Window

(In the above screenshot, green bars represent full bandwidth usage, red bars represent significantly throttled bandwidth and other bars represent a value in between these two extremes).

You can also work with Zmanda’s support team to customize these templates to exactly fit your needs.
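To picture how 15-minute-granularity throttling works, here is a minimal sketch of a day’s throttle plan as 96 slots. The function names, the 9am–5pm window and the kbps values are all hypothetical illustrations, not ZCB’s actual configuration format (which is set through the UI templates described above):

```python
# Sketch of a "gradual throttle on weekdays" plan: the day is split into
# 96 fifteen-minute slots, each carrying a bandwidth cap. All values are
# illustrative assumptions, not ZCB internals.
def build_gradual_throttle(peak_kbps=256, offpeak_kbps=4096):
    """Heavily throttle business hours, relax gradually toward the evening."""
    slots = []
    for slot in range(96):                # 96 x 15 min = 24 h
        hour = slot // 4
        if 9 <= hour < 17:                # business hours: heavy throttle
            slots.append(peak_kbps)
        elif 17 <= hour < 21:             # evening: gradually relax the cap
            slots.append(peak_kbps + (hour - 16) * (offpeak_kbps - peak_kbps) // 5)
        else:                             # night: full bandwidth
            slots.append(offpeak_kbps)
    return slots

def limit_for(slots, hour, minute):
    """Look up the cap that applies at a given time of day."""
    return slots[hour * 4 + minute // 15]
```

A backup client would consult `limit_for` before each chunk transfer and pace itself accordingly; the “Throttle on weekdays” template would simply be a flatter version of the same table.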

ZCB now supports seven backup locations in four continents

ZCB 4.1 supports the two newest regions of Amazon S3 – US West (Oregon) and South America (São Paulo). If you are near these regions and/or wish to use them, you can celebrate a bit more, since all usage charges for these two regions are waived until March 20th, 2012!

With this update, ZCB now supports seven convenient regions (spread across four continents!) to back up to – making backups more efficient, convenient and practical for our users across the globe. And while we are talking about a global user base, let me add that in addition to English, German, Chinese (Traditional and Simplified) and Japanese, the ZCB UI is now available in Korean too. ZCB will speak more languages soon – stay tuned!

Cloud Locations

Backup Oracle servers

ZCB 4.1 includes support for backing up Oracle 11g databases running on Windows servers. All backup levels – full, differential and incremental – are supported.


Backup more, at the same time

ZCB 4.1 supports parallel backups across backup sets. This means you don’t have to wait for an ongoing backup to finish before another begins, so you can schedule backups independently and easily. This item had been on our radar for quite some time, and we have finally added this support in ZCB 4.1.

Faster and more efficient restores

When all you want is to restore a few specific files from your backups, why should the whole backup archive be downloaded from the cloud? Now ZCB downloads only the specific chunk of data from within the backup archive that it needs to complete the requested restore, so you can recover your data faster – with minimal downtime.
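The idea behind such a partial restore can be sketched with an index that maps each file in an archive to its byte range, so only that range is fetched. The index layout and helper names below are hypothetical, not ZCB’s actual archive format:

```python
# Sketch of a ranged restore: an index built at backup time maps each member
# to (offset, length), so a restore reads only those bytes instead of the
# whole archive. The format is an illustrative assumption, not ZCB internals.
import io

def restore_file(archive, index, name):
    """Read just one member's bytes out of a seekable archive stream."""
    offset, length = index[name]          # where the member lives
    archive.seek(offset)
    return archive.read(length)           # one small ranged read

# Toy archive: three files concatenated, plus the index.
blobs = {"a.txt": b"hello", "b.txt": b"world!", "c.txt": b"zmanda"}
payload, index, pos = b"", {}, 0
for name, data in blobs.items():
    index[name] = (pos, len(data))
    payload += data
    pos += len(data)

archive = io.BytesIO(payload)             # stands in for the cloud object
assert restore_file(archive, index, "b.txt") == b"world!"
```

Against a real object store, the same `seek`/`read` pair would map to an HTTP `Range` request, which is what makes downloading only the needed chunk possible.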

Save costs with a finer grained control on retention policy

ZCB 4.1 allows users to have different retention policies for full and incremental/differential backups. What this means is that you can choose just how long you want a particular kind of backup to remain on the cloud. This new feature will be very useful in ensuring judicious use of your cloud storage, which could, in turn, translate to significant reductions in your backup costs. An instance of such a backup scheme is:


Here, full backups are scheduled every week and retained per the default retention policy (two weeks). But the user doesn’t want to retain incremental backups that long and wishes to delete them after 8 days, since the backup cycle is one week.
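A per-level retention check like the scheme just described boils down to a small lookup. This is a sketch only – the 14-day and 8-day windows come from the example above, and the function is illustrative, not ZCB code:

```python
# Sketch: per-level retention as in the example above -- fulls kept two
# weeks, incrementals only eight days. Windows are illustrative values.
from datetime import date, timedelta

RETENTION = {"full": timedelta(days=14), "incremental": timedelta(days=8)}

def expired(backup_type, taken_on, today):
    """True once a backup run has outlived its level's retention window."""
    return today - taken_on > RETENTION[backup_type]

today = date(2012, 3, 1)
assert not expired("full", date(2012, 2, 20), today)       # 10 days old: keep
assert expired("incremental", date(2012, 2, 20), today)    # 10 > 8: purge
```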

The above feature further illustrates the design of ZCB to let you backup your way, and also have control in deciding how much you want to pay for your backups. To see more such examples, you may want to read my earlier blog post.

Bulk-deploy ZCB on multiple machines with the new Configuration Cloning utility

ZCB 4.1 includes a new utility to help you deploy ZCB on multiple machines more efficiently. If you are looking to protect several of your systems with ZCB, you may be interested in exploring this new feature. For more details, please see our knowledgebase article.

So this was a brief walkthrough of ZCB 4.1. If you are an existing customer and find these features interesting, download it from your account and upgrade now (the release notes can be found here). And if you are yet to purchase ZCB, well, let us know what’s been holding you back!

Our engineering team is working aggressively on many more cloud backup innovations. If you would like to request a feature or have some feedback, we would love to hear from you.

Optimizing the cost of your cloud backup

January 5th, 2012

A well-known challenge with new technologies such as cloud backup is that there are no set standards. Take pricing: what would you expect to pay for storing 10 GB of your data in the cloud today? Given that the answer can be anything from zero to a few hundred dollars, how do you know you are not paying more than you really should for your requirements? The question worth asking, essentially, is: since businesses are different and have different backup needs, why shouldn’t they be allowed to control how much they pay for cloud backup?

We broached this question in our recent Zmanda Cloud Backup (ZCB) webinar titled “How to get the maximum out of ZCB” (recording available here) and looked at ways to optimize ZCB costs for one’s requirements. While exploring different options, we realized something interesting – ZCB’s flexibility not only makes it very versatile, but when combined with its pay-as-you-go pricing model, it also allows great leeway in optimizing backup costs. In this post, I will try to explore the options available in ZCB to do just that.

Before we begin, allow me to clarify – while the bulk of this post focuses on cost optimization options with ZCB, the intent is to provide a systematic way of thinking about cloud backup costs. If you are a ZCB user, you can use these options directly. And if you are not a ZCB user, you can map some of these options to your backup solution (and for the benefit of all of us, please do remember to post your results in a comment below!) and see how better (or worse) it fares.

First, a look at the ZCB pricing model

ZCB’s pricing model has two components:

  • Fixed monthly license fee: $4.95 per month
  • Usage based fee:
    • Storage: $0.15 per GB-month
    • Upload to cloud: $0.15 per GB
    • Data download from cloud: Free

Admittedly, this does look more complicated than a fixed monthly cost, but its complexity really stems from its flexibility, which leaves a lot of room for optimizing costs. Let’s see how.
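As a quick sanity check, the pricing model above can be turned into a one-line calculation. The usage numbers below (100 GB stored, 20 GB uploaded) are hypothetical; the rates are the ones quoted in the list above:

```python
# Worked example of a month's ZCB bill, using the rates quoted above:
# $4.95 license, $0.15 per GB-month stored, $0.15 per GB uploaded,
# downloads free. The usage figures are made-up illustrations.
def monthly_cost(stored_gb, uploaded_gb, license_fee=4.95,
                 storage_rate=0.15, upload_rate=0.15):
    return license_fee + stored_gb * storage_rate + uploaded_gb * upload_rate

# e.g. 100 GB kept on S3 and 20 GB of new/changed data uploaded:
cost = monthly_cost(stored_gb=100, uploaded_gb=20)
assert round(cost, 2) == 22.95            # 4.95 + 15.00 + 3.00
```

Because both usage terms scale linearly, every GB you avoid storing or uploading shows up directly on the bill – which is what the optimization steps below exploit.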

Step 1: “Divide and conquer” the monthly license fee!

Got multiple machines to back up? Congratulations! Unlike most other backup services, which charge a fee per machine, ZCB allows a single license to be used to protect an unlimited number of systems. So if you have, say, 5 or 10 machines to back up, the fixed monthly cost per backed-up system becomes insignificant. (Just be aware that machines sharing a ZCB license can potentially access each other’s backup data – although using encryption can alleviate this potential privacy issue.)

Step 2: Optimize the usage based fee!

The usage based fee with ZCB simply means you pay for data storage and data uploads. Thus, optimizing this fee can involve two steps:

Step 2.1: Optimize your total backup size

Let’s first see how much data you really need to back up and how to shrink the size of the backup archives that store it. ZCB offers the following options here:

  • Carefully choose what data needs to be backed up: While backing up applications such as Exchange with ZCB, you can select specific datastores instead of all datastores. For file system backups, you entirely control what gets backed up (ZCB does NO automatic selection of *.mp3, *.jpg files, etc.) and you can also specify an “exclude list” to skip backing up large user files by listing patterns such as *.mp3 or *.mov. This point may look obvious, but many cloud backup applications don’t easily allow it, since they attempt to maximize your backup data size, for obvious reasons ;).

    Figure 1: Exclude list


  • Use backup levels: Incremental and differential backups contain only the data that changed since a previous backup and hence reduce backup size. Use incremental and differential backups judiciously to reduce data size while still adhering to your backup strategy.

    Figure 2: Differential backups – backup changed data since last full backup


    Figure 3: Incremental backups – backup changed data since any last backup


  • Choose backup frequency: How often you back up directly impacts your total backup data size. So choose the backup frequency that fits your backup requirements but also keeps your total data size manageable. With ZCB you can do only manual backups (whenever you want) or choose from powerful scheduling options, ranging from every 15 minutes to a particular date every year.

    Figure 4: Choose backup frequency


  • Enable compression: Depending on your data type (documents and text files are more compression-friendly), enabling compression may help you shrink your storage requirement by about 10–50%. Here is a figure which summarizes all of these ZCB options:

    Figure 5: Summary of all options to optimize total backup size

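To make the exclude-list idea above concrete, here is a minimal sketch using shell-style patterns. The pattern syntax and the `should_backup` helper are assumptions for illustration, not ZCB’s actual exclude-list format:

```python
# Sketch of an exclude list like the one in Figure 1: fnmatch-style
# patterns decide which files are skipped. The patterns and helper are
# illustrative assumptions, not ZCB's syntax.
from fnmatch import fnmatch

EXCLUDE = ["*.mp3", "*.mov", "*/Temp/*"]

def should_backup(path):
    """Back up a file only if it matches no exclude pattern."""
    return not any(fnmatch(path, pat) for pat in EXCLUDE)

assert should_backup("docs/report.xlsx")
assert not should_backup("music/song.mp3")
assert not should_backup("C:/Users/bob/Temp/x.tmp")
```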

Step 2.2: Optimize how much cloud storage is used to store backup data

Now that you have optimized the total backup data size, let’s see how you can reduce the storage required on the cloud for keeping this backup data. Here are your options with ZCB:

  • Blend cloud storage with local storage: ZCB allows you to store all or some of your backup data on local or network storage. For example, you may choose to store only certain full backups on cloud storage while using local/network disk storage for your primary, frequent backups. Below is an example:

    Figure 6: A sample backup strategy to minimize cloud storage (only monthly backups go to cloud, rest all backups go to local/network storage)


  • Judiciously choose the cloud data retention policy: ZCB allows complete control over the retention period for your backup data, so you can adopt as aggressive a retention policy as your backup policy allows, such as “retain full backups for 2 weeks and incremental backups for 2 days”.
  • Monitor, monitor and monitor: Monitor your cloud usage regularly and purge old backup runs which you don’t require. For monitoring, you can use your Amazon bills, the ZCB Global Dashboard and the jets3t tool. To purge old backup data that is no longer required, click File > Purge Backup Runs Before and select a date; all backup runs before that date will be deleted by ZCB. Do note that deleting data required by subsequent backup runs (such as deleting full backups while retaining incremental/differential backups) may make the dependent backups useless for any future restore.

    Figure 7: Purging old backup data which is no longer required


  • Exploit the ZCB free tier: ZCB offers 5 GB of free cloud storage and uploads for each of the 5 supported Amazon S3 regions, so you can use up to 25 GB of cloud storage across all 5 regions completely free! Scatter your data across the AWS regions to fully exploit this free tier. (With two more AWS regions supported in ZCB 4.1, this will soon become a 35 GB free tier, making this option even more effective!)

    Figure 8: The ZCB free tier

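The caveat in the “Monitor” bullet above – that purging a full backup can orphan the incrementals built on it – can be sketched as a purge that keeps any full still referenced by a retained run. The data model below is a made-up illustration, not ZCB’s internals:

```python
# Sketch of "purge backup runs before a date" that honors the dependency
# warning above: a full backup is kept, even if older than the cutoff,
# while any retained incremental still depends on it. Illustrative only.
from datetime import date

runs = [  # (id, type, date, depends_on_full)
    ("f1", "full", date(2012, 1, 1), None),
    ("i1", "incr", date(2012, 1, 3), "f1"),
    ("f2", "full", date(2012, 1, 8), None),
    ("i2", "incr", date(2012, 1, 10), "f2"),
]

def purge_before(runs, cutoff):
    keep = [r for r in runs if r[2] >= cutoff]
    needed_fulls = {r[3] for r in keep if r[3]}          # fulls still referenced
    keep += [r for r in runs if r[2] < cutoff and r[0] in needed_fulls]
    return sorted(keep, key=lambda r: r[2])

kept = purge_before(runs, date(2012, 1, 9))
assert [r[0] for r in kept] == ["f2", "i2"]   # f2 kept: i2 depends on it
```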

Quite a handful of ways to optimize costs, isn’t it? And perhaps the best part is that since ZCB and its pricing model are super-flexible, the above is not even an exhaustive list!

Are you a ZCB user? If yes, do consider these steps and let us know if and how they worked for you. And if you are not a ZCB user, I’m very curious to know how you are optimizing costs with your current solution.

Have a “cost effective” new year!


Drop the box and start backing up!

November 22nd, 2011

Okay first let me say this: I love Dropbox and like many of you depend on it each day to seamlessly access my important files from office/home/shared computers and from my cell phone. Also ever since Dropbox released the developer APIs, an increasing number of innovative applications (see here and here for a few examples) are coming to the fore that extend Dropbox beyond its “native” features of syncing, sharing and collaboration.

This is great but creates a potential problem. With all this excitement, it is easy to get carried away and think of using Dropbox to solve a problem it was never designed to solve – robust cloud backup. Even at a conceptual level, tools based on classic data backup technology, such as Zmanda Cloud Backup (ZCB), and sync-and-share tools such as Dropbox solve very different business needs. To most backup administrators it would seem outlandish to even suggest that one can be used in place of the other (a Silicon Valley-based system administrator I tossed this idea to frowned upon it and found the comparison so illogical that he spent a few seconds deciding where to begin his explanation!).

Yet over the last few months, around the same time that Dropbox started gaining mass acceptance, we’ve been seeing this confusion pop up among some of our prospective users. Thanks to the (well-deserved) widespread attention Dropbox has gathered in recent times, such users would begin comparing ZCB with Dropbox for solving their data backup problems. So far, to clear up the matter, we have largely tried to remind them of the fundamentals of disaster recovery and how Dropbox is an excellent tool for sharing and synchronizing data but a very primitive tool for performing data backups. I can’t tell how far we’ve succeeded in conveying this, but I know some of them indeed saw our point (they became our customers!).

But this post became unavoidable, since the plot seems to have thickened with the recent introduction of Dropbox for Teams. With this latest offering, Dropbox now consciously targets businesses by offering them huge shared storage (1 TB) along with some administrative tools to manage the service. Not a bad idea, really. The problem, however, is that to sell SMBs this much storage, Dropbox now seems to be telling them to use it for data backups – something it never before claimed to do well.

So let’s scratch the surface a bit here to see what Dropbox is and what it can or can’t backup.

At the outset, let’s see what problems Dropbox was designed to solve and how data backup was not one of them. This is how Wikipedia defines Dropbox:

Dropbox is a Web-based file hosting service operated by Dropbox, Inc. that uses cloud computing to enable users to store and share files and folders with others across the Internet using file synchronization.

This is what it really is. You give Dropbox some files which you want to share and it laps them up, stores them on its cloud storage and shares them among multiple Dropbox clients:

Dropbox at work


And when any of your files changes on any shared machine, the change is instantaneously replicated across all the shared devices. So what’s the secret sauce? The steadfast decision-making that keeps things simple for syncing and sharing user files. See an instance of such decision making on this page.

On the other hand, a true backup solution, such as ZCB, exists to ensure that all your data gets backed up regularly and you can go back to any of the backed-up states of your machine when the sky comes crumbling down. This may sound similar, so let’s see why this goal is not achievable with Dropbox:

  1. Completeness: At a high level, the data on your computer can be classified into the following categories:
    1. User files: Independent files like documents, presentations and spreadsheets, created by users for their official or personal work.
    2. File system/interlinked files: These can be your entire directory structure (such as D:\), a particular special directory such as “My Documents”, or a set of files which are interlinked – for example, a bunch of website files or a spreadsheet with embedded images or macros.
    3. Application data: The data created and used by your business applications, such as SQL Server or Outlook. These can be databases, configuration files, temporary files, etc., generally created in the application’s installation directory. These files are also “open” while the application is running.
    4. Applications: Binaries and configuration files of installed applications, such as Microsoft Office and the Adobe PDF suite.
    5. Operating system and system configuration: The installed operating system, its configuration (“System State” in Windows) and other system information such as the partition table.

    Looking at the above, it is obvious that Dropbox can only be considered for data in the first and second categories. And even in the second category, some special folders (e.g. C:\Program Files) can’t be put inside your Dropbox folder at all. For those that can be, you are likely to hit problems during restores: with many interlinked files, how are you going to find a logically consistent set of files as it existed at a particular historic point in time?

    A true backup solution such as ZCB, on the other hand, backs up almost all of the above categories of data (ZCB backs up the Windows system state, though not the operating system and boot loader/partitioning information), and its backup archives represent logical, consistent states at particular points in time.

  2. Modification/deletion of the original copy of data: A true backup solution never modifies the original copy of data, let alone deletes it. In fact, even changing a file’s metadata (archive bit, modification time, etc.) has been considered unacceptable by many backup administrators, since that may interfere with other installed applications.

    But since the primary goal of Dropbox is to “synchronize” data across multiple machines, it will do whatever is necessary to accomplish that goal. So if a file gets accidentally deleted or corrupted on one system, Dropbox will gleefully and promptly propagate that accident to all the shared machines. This is obviously a serious problem, and hence in its paid versions Dropbox offers an “unlimited undo history” feature that lets you undelete files. Though this surely helps, from a disaster recovery standpoint it is still a risky situation, since it means you have lost all your local copies and now have only one remaining copy of your original data. What’s worse, that copy is only available on the cloud, so if you need it when you have no or poor Internet connectivity, you are out of luck.

    On the other hand, a true cloud backup solution such as ZCB supports smart redundancy options where you can keep backup data on local as well as cloud storage. Since you will have three copies of your data (the original plus two separate copies), even if you accidentally delete the original file you still have two redundant copies to restore from.

  3. Security: The tricky thing about security is that it’s like insurance – you may not care for it in steady state but it can be catastrophic when something goes wrong. And security has been the number one reason why Dropbox is still unwelcome in many enterprises today. Some issues:
    1. True data privacy: Dropbox encrypts your data on the Amazon S3 cloud using an encryption key which is unique to your Dropbox account. Note that this encryption key is known to Dropbox. This means two things: first, your data is not truly private, as Dropbox personnel can potentially see it (of course, we believe this is unlikely); second, you can’t have any data privacy between two of your users sharing the same Dropbox account.

      The only way out would be to use a separate file/volume-level encryption tool (such as TrueCrypt) on top of Dropbox. But in addition to burdening your users with new encryption/decryption workflows, this would most probably also make Dropbox synchronization inefficient, defeating the whole purpose of using Dropbox in the first place. If you are indeed thinking of going down this path, I recommend checking out the commenters’ experiences on this blog for the gory details of such problems.

      In comparison, a true backup solution like ZCB offers asymmetric encryption with user-generated certificates, making it virtually impossible for anyone else to see your encrypted data.

    2. The disadvantage of being a public “data sharing” service: Dropbox was designed to support data exchange among multiple devices and multiple users over the Internet. You can imagine that such a service needs somewhat relaxed rules when it comes to authentication, access rules, open ports, etc. Dropbox has already had its share of such issues – see this page and this page for examples.

    Again, in contrast, a true backup application such as ZCB has much tighter security mechanisms. It can securely encrypt your data with user-generated keys as soon as it is backed up and can send the data over an SSL tunnel to the cloud, where access is protected by multiple layers of authentication. This ensures that your backup data is safe and secure, irrespective of its location – on local disk or in the cloud.

  4. Flexibility in choosing a data retention policy: The retention policy is a very important decision variable in your disaster recovery plan, as it determines the oldest historic state you can restore to and has a direct implication on your storage costs. But since Dropbox has the “unlimited undo history” feature, why should one even worry about this? My doubts about the long-term sustainability of a truly “unlimited” deleted-file history notwithstanding, there are at least two reasons why data retention is still an issue with Dropbox:
    1. There is no automatic management of your storage quota – you need to delete older files manually to free up space for newer data. With multiple users working on your shared data, won’t it be challenging to identify what data is too old and delete it by hand? Unless, of course, you buy a storage quota many times your actual requirement, so you never have to delete anything!
    2. In addition, many organizations need to abide by data storage laws which stipulate the geographical location where data may be stored and even the maximum time customer data can be retained by a business. You don’t have any such control with Dropbox.
  5. Scheduling uploads to make them efficient and unobtrusive: One key issue for many businesses considering cloud backup is the lack of adequate Internet bandwidth. During normal business hours there is only so much bandwidth you can devote to data backups. This is why many administrators like to schedule backup uploads to run during idle times such as weekends.

    Telling Dropbox when to sync is not possible, and even if it were, it would defeat the whole purpose of using such a sync tool. Yet another problem (feature!) of Dropbox is that it immediately syncs every change to your data. So if you make frequent changes to your files during the day, each of them will be synced across all your devices, wasting your bandwidth, even though you may have wanted just one copy of your file at the end of the day. For syncing and sharing, this “churn” is a necessity and one of the core benefits of Dropbox; for backups, it is nothing but “noise” – wasteful and disruptive to your normal business network traffic.

As you can see, the above list is by no means exhaustive; the deeper you go, the more such differences pop up. But is that surprising? Given that Dropbox was conceived, designed and implemented to solve the need for syncing and sharing, not robust cloud backup, isn’t trying to use it for the latter more of a “hack” than a true solution?

And did I mention that we have a webinar coming up on Dec 7th, 2011 in which we will be discussing how to get the maximum out of your ZCB installation and will also be taking some of the above issues for discussion? Please register for this webinar here. Hope to see you then!


The Next Generation of Cloud Backup – ZCB 4 is Here!

August 31st, 2011

Today we announced the immediate availability of Zmanda Cloud Backup (ZCB) 4, our comprehensive backup solution for Windows servers, desktops and laptops. It lets you back up your data and systems to the cloud. Over the past few weeks, ZCB 4 was extensively tested by many end customers and resellers in a limited beta program. We received great feedback, which led to bug fixes and many improvements. Many thanks to everyone who participated!

For the idea of cloud backup, ZCB 4 is a pioneering step forward. Yes, we are very confident on this point, but that confidence rests on the feedback of thousands of Zmanda customers.

Because we have always kept an eye on the solutions available on the market and on what users want, we were able to identify various gaps in several backup products and close them in ZCB 4:

  1. Flexibility through the choice of where backed-up data is stored:

    Users of cloud backup have different needs.

    Some want an extra layer of protection for their data and therefore store backup copies both on a local disk and in the cloud.

    For other users, the cloud is meant to serve as their primary and only storage location, so they need a solution that backs up their data and can store it directly in the cloud. That was precisely ZCB’s first use case – storing a backup copy directly in the cloud. Previously, backup copies were also stored on local disks first. One of the improvements you will find in ZCB 4 is the ability to create a backup copy directly in the cloud without using any local storage.
    This enhancement lets you back up your data reliably even when local disk capacity is low!

    So you can either store your backup copies locally and then upload them to the cloud – the data on disk can then be deleted or retained – or create a backup copy directly in the cloud, depending on which of the two use cases above applies to you.

    Cloud Backup new operation

  2. Improving transfer speeds: Users who either have a large amount of data to back up or have limited Internet bandwidth face a fundamental problem – how do you move data to and from the cloud within the required time limits? We also came across cases where users had the necessary bandwidth available, but the backup software was either unable to use it fully or the backup vendor imposed limits on upload and download speeds.

    ZCB sets no limits on transfer speed and always strives to get the most out of all resources. With ZCB 4 we have made it possible to use several streams for uploads and downloads. This feature allows multiple simultaneous connections to the Amazon S3 cloud, unlocking the bandwidth you have always had available. And true to our promise of giving users more flexibility, we have made this feature fully configurable.

    Cloud Backup multithreading

    By default, we use three concurrent connections for data transfer. If you wish, you can change this value to experiment and find out what works best in your environment. A higher number of streams can be beneficial if you have spare bandwidth and CPU resources available for sending and receiving data.

  3. Manageability and ease of use: We believe that ease of use is at the core of the idea of backing up data to the cloud. Since the important decisions – when, where and how your data and systems are backed up – are at stake, users must have complete freedom to make them. At the same time, configuring and monitoring backup jobs should not be difficult. So in ZCB 4 we have redesigned our user interface and made several improvements to make it more intuitive and easier to use. Please have a look at the new ZCB screenshots.

In addition to the features above, ZCB 4 also offers:

  • Full localization in German
  • Backup/restore of selective databases on SQL Server and Exchange Server
  • Differential backup of SharePoint servers
  • Parallel operations across multiple backup sets
  • Comprehensive reports across multiple backup sets
  • Hundreds of further improvements

ZCB 4 brings to market a comprehensive, flexible and practical solution for backing up data and systems, both to local disks and to the cloud.

We are working intensively on the next ZCB version and will be making some exciting announcements shortly. If you have any further questions or a suggestion for us, please get in touch!

Introducing ZCB 4 – Next Generation Cloud Backup!

August 29th, 2011

Today, we announced the immediate availability of Zmanda Cloud Backup (ZCB) 4, our comprehensive cloud backup offering for Windows servers, desktops and laptops. ZCB 4 had been under a limited beta program for the past few weeks and was extensively tested by many end users and resellers. We received some great feedback which led to many improvements and bug fixes. Thank you to everyone who participated!

ZCB 4 is a huge step forward for the idea of cloud backup, and after working with many ZCB users for a while, we say this with a lot of conviction. We took a hard look at what users wanted to achieve with cloud backup and compared that with the solutions available on the market. We identified various gaps, which are addressed in ZCB 4:

  1. Flexibility of choosing where to store backup data: Users of cloud backup have different needs. Some are embracing it as an extra line of defense and hence want to back up both to on-premises disks and to the cloud. Others are looking at cloud storage as their primary and only backup location and hence need a solution that backs up data only to the cloud. ZCB had this covered from day 1: the data was always first backed up to disk and then to the cloud, and the copy on disk could then be deleted or retained, depending on which of the two use cases you wanted to deploy. But we realized we could improve this further by offering a “backup to cloud” operation that backs up your data directly to the cloud without using any temporary local storage. So if you are short on disk space or don’t want a backup copy on disk, this operation comes in handy.

    Cloud Backup new operation

  2. Improving transfer speeds: Users who have either a lot of data to back up or limited Internet bandwidth face a basic problem – how to transfer data to and from the cloud within the required time window? We also discovered cases where users had the bandwidth available and met their part of the bargain, but the backup solution was either not capable of using the bandwidth fully or the backup provider imposed limits on upload/download speeds. ZCB never imposed any limits whatsoever on upload/download speeds and always tried hard to maximize throughput. In ZCB 4, we made it even better by adding support for multithreaded uploads/downloads. This feature makes ZCB use multiple concurrent connections to the Amazon S3 cloud and hence unlocks the bandwidth you always had available. And true to our promise of providing flexibility to users, we have made this feature entirely configurable:

    Cloud Backup multithreading

    By default, we use 3 concurrent connections for data transfer. If you wish, you can tweak this value to experiment and find out what works best in your environment. A higher thread count may be beneficial if you have spare bandwidth and CPU/memory resources to push or pull data.

  3. Manageability and usability: We believe usability is core to the idea of cloud backup, as it involves critical decisions about when/where/how you back up your systems. Users need full freedom to make these decisions, and yet it shouldn’t be hard to configure and monitor the solution. Though our users and experts have always rated us highly in this area, we realized that the user interface needed to be ready to handle our planned rapid growth, both in terms of product features and customer use cases. So in ZCB 4, we redesigned our user interface, made many workflow improvements and made it much more intuitive and easier to use. To see it in action, you can view the new ZCB screenshots.
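The two transfer improvements described above – backing up directly to the cloud without local staging, and multithreaded uploads with a configurable connection count – can be sketched roughly as follows. This is a minimal illustration, not ZCB's actual implementation; `upload_part` is a hypothetical stand-in for a real S3 PUT call, and the chunk size is tiny purely for readability:

```python
import io
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 4  # illustrative only; a real part size would be megabytes

def upload_part(part: bytes) -> int:
    # hypothetical stand-in for one S3 upload call; returns bytes sent
    return len(part)

def backup_to_cloud(stream, threads=3):
    """Stream a backup source straight to the cloud with `threads`
    concurrent connections (3 by default, mirroring ZCB 4's default),
    without staging a temporary copy on local disk."""
    parts = []
    while True:
        chunk = stream.read(CHUNK_SIZE)
        if not chunk:
            break
        parts.append(chunk)
    # each part is uploaded on its own worker thread
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return sum(pool.map(upload_part, parts))

sent = backup_to_cloud(io.BytesIO(b"example backup data"))
```

Raising `threads` here corresponds to raising the connection count in the ZCB 4 settings dialog shown above: more concurrent parts in flight, at the cost of more CPU/memory and bandwidth.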

In addition to the above, ZCB 4 also offers:

  • Backup/Restore of selective databases in SQL Server and Exchange server
  • Differential backup of SharePoint server
  • Parallel operations across multiple backup sets
  • Extensive Reporting across multiple backup sets
  • Hundreds of other improvements

ZCB 4 is also available in German and Japanese languages. For more details on ZCB 4, please refer to the release notes page.

ZCB 4 brings to the market a comprehensive, flexible and practical cloud backup solution. As we gain scale, we are also working on the pricing (in case you didn’t notice, we recently announced the 25 GB free tier, perhaps a first in the industry) to make it affordable to a bigger set of users.

We are already working very aggressively on our next releases and will soon be making some exciting announcements. If you have a suggestion for us, please do drop me a line at

Zmanda Cloud Backup adds Tokyo as its latest cloud storage location

March 16th, 2011

We are adding support for Asia Pacific (Tokyo) Region in Zmanda Cloud Backup (ZCB). This is the fifth worldwide location supported by ZCB.

This support provides faster uploads for ZCB users in Japan. Throughput will be significantly higher because of fewer hops along the way and the very high bandwidth connections typically available in Japan. Overall processing will be faster because of lower latency (expected to be single-digit milliseconds for most end users in Japan).
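Region selection ultimately comes down to pointing the client at the right storage endpoint. As a hedged sketch (the region codes are AWS's; the exact endpoint hostname format has varied over time, and this mapping is illustrative rather than authoritative), the five regions could be modeled like this:

```python
# Illustrative mapping of the five S3 regions supported at the time;
# endpoint hostnames shown in the historical "s3-<region>" form.
S3_ENDPOINTS = {
    "us-east-1":      "s3.amazonaws.com",
    "us-west-1":      "s3-us-west-1.amazonaws.com",
    "eu-west-1":      "s3-eu-west-1.amazonaws.com",
    "ap-southeast-1": "s3-ap-southeast-1.amazonaws.com",  # Singapore
    "ap-northeast-1": "s3-ap-northeast-1.amazonaws.com",  # Tokyo
}

def endpoint_for(region: str) -> str:
    """Return the storage endpoint a backup client would connect to."""
    return S3_ENDPOINTS[region]
```

A user in Japan choosing `ap-northeast-1` keeps traffic on a nearby endpoint, which is where the latency and hop-count gains described above come from.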

Cloud Backup to Three Continents Now Includes Japan

This support enables users to ensure that their data does not leave Japan, e.g. if required for compliance reasons.

In summary, users in Japan now have an effective and scalable solution to back up their Windows filesystems, Microsoft applications and databases (MySQL, SQL Server, Oracle) to a robust storage cloud.

As an introductory offer to our customers in Japan, we are waiving all transfer and storage charges to the Tokyo location until April 30th, 2011. You only pay the initial setup fee ($4.95) and pro-rated monthly fees ($4.95 per month). After April 30th, our regular charges will apply, on par with all other supported regions.

There is more on the horizon for our Japanese customers. We will soon offer a fully localized Japanese version of ZCB (the current shipping version has already been tested with Japanese file and folder names). Watch this space for an announcement on that within a few weeks.

Zmanda Cloud Backup with Japanese Files/Folders

MySQL Backup Webinar Series: Scalable backup of live databases

October 14th, 2010

mysql logo

Setting up a good backup and recovery strategy is crucial for any serious MySQL implementation. This strategy can vary from site to site based on various factors, including the size of the database, rate of change, security needs, retention and other compliance policies, etc. In general, MySQL DBAs are also expected to keep the impact on the usability and performance of the database as low as possible at backup time – i.e., MySQL and its dependent applications should remain hot during backup.
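To make the "hot backup" idea concrete: for InnoDB tables, a standard technique is `mysqldump --single-transaction`, which takes a consistent snapshot without locking tables, so applications keep running during the dump. A minimal sketch of assembling such a command (this illustrates the general technique, not ZRM's internal mechanism; the database and user names are placeholders):

```python
def hot_backup_cmd(database, user):
    # --single-transaction gives a consistent snapshot of InnoDB tables
    # without read locks, so the database stays hot during the dump.
    return [
        "mysqldump",
        "--single-transaction",
        "--user", user,
        database,
    ]

cmd = hot_backup_cmd("shop", "backup_user")
# e.g. run with:
#   subprocess.run(cmd, stdout=open("shop.sql", "w"), check=True)
```

Note that `--single-transaction` only guarantees consistency for transactional engines like InnoDB; non-transactional tables (e.g. MyISAM) need a different approach, such as the snapshot-based backups discussed in the webinar.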

Join MySQL backup experts from Zmanda for two webinars dedicated to hot backup of MySQL:

MySQL Backup Essentials: In this webinar, we will go over best practices for backing up live MySQL databases. We will also cover Zmanda Recovery Manager (ZRM) for MySQL product in detail, including a walk through the configuration and management processes. We will discuss various features of ZRM including full backups using snapshots, point-in-time recovery, monitoring and reporting.

Register for MySQL Backup Essentials Webinar on November 23rd at 10:00AM PT

MySQL Backup to Cloud: In this webinar, we will focus on backing up MySQL databases running on Windows to the cloud. Cloud storage provides an excellent alternative to backing up to removable media and shipping it to a remote secure site. We will provide a live demonstration of the Zmanda Cloud Backup (ZCB) product backing up MySQL to Amazon S3 storage. ZCB is an easy-to-use cloud backup solution which supports all Windows platforms. We will also discuss recovering a MySQL database in the cloud, creating a radically low-cost disaster recovery solution for MySQL.

Register for MySQL Backup to Cloud Webinar on November 30th at 10:00AM PT