Archive for October, 2012

Zmanda - A Carbonite Company

Wednesday, October 31st, 2012

I am very excited to share that today Zmanda has combined forces with Carbonite - the best known brand in cloud backup. I want to take this opportunity to introduce you to Carbonite and tell you what this announcement means to the extended Zmanda family, including our employees, customers, resellers and partners.

First, we become “Zmanda - A Carbonite Company” instead of “Zmanda, Inc.”, and I will continue to lead the Zmanda business operations. Carbonite will continue to focus on backing up desktops, laptops, file servers, and mobile devices. Zmanda will continue to focus on backup of servers and databases. Carbonite’s sales team will start selling Zmanda Cloud Backup directly and through its channels. Since Carbonite already has a much larger installed base of users and resellers, our growth should accelerate considerably next year, which will allow us to innovate at an even higher level than before. Zmanda’s direct sales team and resellers will continue to offer the Zmanda products they respectively specialize in.

I’ve gotten to know Carbonite over the last few months, and I am very impressed with their organization and am looking forward to joining the management team. One of the things that attracted me to Carbonite was its commitment to customer support. Carbonite has built a very impressive customer support center in Lewiston, Maine, about a two-hour drive north of its Boston headquarters, where it now employs a little over 200 people. We’ll be training a technical team in Maine to help us support Zmanda Cloud Backup, and of course we’ll also be keeping our support teams in Silicon Valley and in Pune, India for escalations and support of Amanda Enterprise and Zmanda Recovery Manager for MySQL. Please note that at this point, all our current methods of connecting with customer support, including Zmanda Network, will continue as is.

Another thing that makes Carbonite a good fit for us is its commitment to ease of use. Installing and operating Carbonite’s backup software is as easy as it gets. We share this goal, and we hope to learn a thing or two from the Carbonite team on this front as we continue to execute on an aggressive roadmap across all of our product lines.

We’ve worked hard to make Zmanda products as robust as possible. Our technologies, including our contributions to the open source Amanda backup project, have been deployed on over a million systems worldwide. Amanda has hundreds of person-years of contributed engineering, and we believe it is one of the most solid and mature backup systems in the world. Much of what we have done for the past five years has been to enhance the open source code and provide top-notch commercial support. Carbonite, too, understands that being in the backup business requires the absolute trust of customers, and I believe that every day the company works hard to earn that trust: it respects customer privacy, is fanatical about security, and has made a real commitment to high-quality support.

My fellow Zmanda employees and I are very enthusiastic and proud to be joining forces with Carbonite. We look forward to lots of innovation in the Zmanda product lines next year and hope that you will continue to provide us with the feedback that has been so helpful in the evolution of our products.

Swift @ OpenStack Summit 2012

Thursday, October 25th, 2012

We just came back from the OpenStack Summit 2012 in San Diego. The Summit was full of energy, and the rapid progress of the OpenStack project, on both technical and business fronts, was palpable.

Our participation was focused around OpenStack Swift, and here are three notable sessions (including our own!) on the topic:

(1) COSBench: A Benchmark Tool for Cloud Object Storage Service: Folks from Intel presented how they designed and implemented a cloud storage benchmark tool, called COSBench (Cloud Object Storage Benchmark), for OpenStack Swift. In our previous blog, we briefly introduced COSBench and our expectation that it will become the de facto Swift benchmarking tool. In this session, the presenter also demonstrated how to use COSBench to analyze the bottleneck of a Swift cluster under a given workload. The most promising point in this session was the indication that COSBench is going to be released to the open-source community. The slides for the session are available here.

(2) Building Applications with OpenStack Swift: In this very interesting talk from SwiftStack, a primer was provided on how to build web-based applications on top of OpenStack Swift. The presenters went down to code level to explain how to extend and customize Swift authentication and how to develop custom Swift middleware, with the goal of seamlessly integrating web applications with the Swift infrastructure. A very useful presentation for developers who are thinking about how to build applications for Swift.

(3) How swift is your Swift?: The goal of this presentation (from Zmanda) was to shed light on the provisioning problem for Swift infrastructure. We looked at almost every hardware and software component in Swift and discussed how to pick the appropriate hardware and software settings to optimize upfront cost and performance. We also talked about the performance degradation when a failure (e.g. a node or HDD failure) happens. Our slides are available here.

All in all, the Summit was a great step forward in the evolution of Swift.

If you are thinking of putting together a storage cloud, we would love to discuss your challenges and share our observations. Please drop us a note at swift@zmanda.com.

How swift is your Swift? Benchmarking OpenStack Swift.

Monday, October 8th, 2012

The OpenStack Swift project has been developing at a tremendous pace. Version 1.6.0 was released in August, followed by 1.7.4 (Folsom) just two months later! These two releases implemented many important features, for example optimizations for SSDs, object versioning, StatsD logging and much more – many of these features have significant implications for performance planning by cloud builders and operators.

As an integral part of deploying a cloud storage platform based on OpenStack Swift, benchmarking a Swift cluster implementation is essential before the cluster is deployed for production use. Preferably the benchmark should simulate the eventual workload that the cluster will be subjected to.

In this blog, we discuss the following Swift benchmarking concepts:
(1) Benchmark dimensions for a Swift cluster: performance, scalability and degraded-mode performance (e.g. when hardware or software failures happen).
(2) Sample workloads for a Swift cluster

Benchmark Tools for Swift

There are currently two Swift benchmark tools available: swift-bench and COSBench.

swift-bench is a command-line benchmark tool that ships with the Swift distribution. Recently, we improved swift-bench to allow for random object sizes and better usability.

COSBench is a fairly new web-based benchmark tool, led by researchers at Intel. Fortunately, we obtained a trial version of COSBench. Based on our initial experience with it, we believe it is a very helpful tool and may become the de facto Swift benchmarking tool in the future.

Benchmark Dimensions

Dimension 1 – Performance

The performance dimension measures how the Swift cluster performs when it is under a certain load. The performance metrics can be specified in many ways. In most cases, cloud operators will be interested in the following four metrics (a short sketch of computing them from raw measurements follows the list):

(1) The average throughput (number of operations per second)
(2) The average bandwidth (MB/s)
(3) The average response time of all requests
(4) The response time for a certain percentile of requests (e.g. the 95th percentile)
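
As a concrete illustration, here is a minimal sketch (in Python, with made-up numbers) of how these four metrics can be derived from the raw per-request latencies and byte counts that a benchmark client collects:

```python
import statistics

# Hypothetical raw data collected by a benchmark client: per-request latencies
# (seconds), bytes transferred per request, and the wall-clock length of the run.
latencies = [0.042, 0.051, 0.038, 0.115, 0.047, 0.060, 0.055, 0.049, 0.200, 0.045]
bytes_per_request = [4 * 1024 * 1024] * len(latencies)   # e.g. 4 MB objects
wall_clock_seconds = 0.9

throughput = len(latencies) / wall_clock_seconds                    # ops/s
bandwidth_mb = sum(bytes_per_request) / wall_clock_seconds / 1e6    # MB/s
avg_response = statistics.mean(latencies)                           # seconds

# 95th-percentile response time: sort the latencies and index at 95% of the count.
ranked = sorted(latencies)
p95 = ranked[max(0, int(0.95 * len(ranked)) - 1)]

print(f"throughput: {throughput:.1f} ops/s, bandwidth: {bandwidth_mb:.1f} MB/s")
print(f"avg response: {avg_response * 1000:.1f} ms, 95th percentile: {p95 * 1000:.1f} ms")
```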

To measure the performance, we first need to populate the Swift cluster with some data (i.e. objects) to simulate an initial state. The size of the initially loaded objects can be controlled by the inputs of the benchmark client. Subsequently, a pre-defined workload is executed against the Swift cluster while the performance is measured.
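
For example, a minimal seeding sketch using the python-swiftclient library might look like the following; the endpoint, credentials and container name are placeholders, and the object count and size range should be adjusted to the initial state you want to simulate:

```python
import os
import random

from swiftclient import client  # pip install python-swiftclient

# Placeholder endpoint and credentials (TempAuth-style v1.0 auth is assumed).
AUTH_URL = "http://proxy.example.com:8080/auth/v1.0"
USER, KEY = "test:tester", "testing"

conn = client.Connection(authurl=AUTH_URL, user=USER, key=KEY)
conn.put_container("bench")  # create the target container if it does not exist

# Seed the cluster with 1,000 objects of random data between 1 MB and 10 MB each.
for i in range(1000):
    size = random.randint(1 * 1024 * 1024, 10 * 1024 * 1024)
    conn.put_object("bench", f"seed-{i:06d}", contents=os.urandom(size))
```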

When measuring the performance, there is one key issue to pay attention to: the number of client threads, because it determines how much load the benchmark clients generate against the Swift cluster. Since we want to measure the performance of the Swift cluster when it is under load or saturated, we need to keep increasing the number of threads until the bandwidth/throughput levels off and the average response time starts to increase sharply.

As the number of threads increases, the benchmark client gets busier. We need to make sure that it has enough resources (CPU, memory, network bandwidth) so that the client itself does not become the performance bottleneck.
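
To make this concrete, below is a simplified, hypothetical load-generation sketch: it issues GETs against the seeded objects with a configurable number of threads and reports throughput and average response time, so the thread count can be increased step by step until throughput levels off. Tools like swift-bench and COSBench do essentially this with far more options; the endpoint and credentials are again placeholders.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor

from swiftclient import client  # pip install python-swiftclient

AUTH_URL = "http://proxy.example.com:8080/auth/v1.0"   # placeholder endpoint
USER, KEY = "test:tester", "testing"                   # placeholder credentials
CONTAINER, NUM_SEEDED, REQUESTS_PER_THREAD = "bench", 1000, 200


def worker(_):
    """Issue GETs against randomly chosen seeded objects; return per-request latencies."""
    conn = client.Connection(authurl=AUTH_URL, user=USER, key=KEY)
    latencies = []
    for _ in range(REQUESTS_PER_THREAD):
        name = f"seed-{random.randrange(NUM_SEEDED):06d}"
        start = time.time()
        conn.get_object(CONTAINER, name)
        latencies.append(time.time() - start)
    return latencies


# Sweep the concurrency level; stop increasing it once throughput levels off
# and the average response time starts to climb sharply.
for threads in (1, 2, 4, 8, 16, 32, 64):
    start = time.time()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        results = [lat for lats in pool.map(worker, range(threads)) for lat in lats]
    elapsed = time.time() - start
    print(f"{threads:3d} threads: {len(results) / elapsed:8.1f} ops/s, "
          f"avg response {1000 * sum(results) / len(results):7.1f} ms")
```

In practice we would also record failed requests and bytes transferred, so that error rate and bandwidth can be reported alongside throughput and response time.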

While the performance of the client software (Cyberduck, cloud backup software, etc.) that connects to Swift is an important factor in the overall usability of the storage cloud, the scope of this blog is the performance of the storage cloud platform itself.

Dimension 2 – Scalability

The scalability benchmark tests whether a Swift cluster can scale out gracefully by adding more servers and other resources. We can conduct this benchmark as follows: proportionally add more servers for each type of node in the Swift cluster (for example, double the number of storage nodes and proxy nodes with the same hardware and software configurations), then run the same workloads and measure the performance. If a Swift cluster scales out nicely, its bandwidth/throughput will increase in proportion to the number of servers added. Otherwise, the cloud operators should analyze which bottleneck prevents it from scaling well.
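
As a back-of-the-envelope check (with made-up numbers), scaling efficiency can be computed as the observed speedup divided by the increase in resources:

```python
# Hypothetical measurements: throughput before and after doubling the cluster size.
baseline_servers, baseline_ops = 8, 2000      # 8 nodes  -> 2,000 ops/s
scaled_servers, scaled_ops = 16, 3600         # 16 nodes -> 3,600 ops/s

speedup = scaled_ops / baseline_ops                   # 1.8x
resource_ratio = scaled_servers / baseline_servers    # 2.0x
efficiency = speedup / resource_ratio                 # 0.9, i.e. 90% scaling efficiency
print(f"speedup: {speedup:.2f}x, scaling efficiency: {efficiency:.0%}")
```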

To simulate a real-world scenario, we need to test the scalability of a Swift cluster while it is running. As suggested by a blog from SwiftStack, cloud operators may consider adding new servers gradually in order to avoid performance degradation caused by data movement between the existing and new servers. During the measurement, we want to observe: (1) whether the Swift cluster operates normally (i.e. no period of service disruption) and (2) the increase in performance as the new servers are added to the Swift cluster.

Dimension 3 – Degraded Mode Performance

Cloud operators will face hardware or software failures at some point. If their objective is to ensure that their clusters perform at a certain level (e.g. abide by a performance SLA) even in the face of failures, they should benchmark their Swift cluster appropriately upfront.

The most straightforward way to measure the availability of a Swift cluster is to intentionally shut down some nodes and measure the number of errors (e.g. failed operations) and the performance degradation while Swift is running in degraded mode.

Several factors increase the complexity of benchmarking a degraded Swift cluster. Failures can happen at every system level, such as I/O devices, the OS, Swift processes or even an entire server, and their impact differs depending on the level at which they occur. So failure scenarios at all system levels need to be considered. For example, to simulate a disk failure, we may intentionally unmount the disk; to simulate a Swift process failure, we can kill some or all Swift processes on a node; to simulate an OS or whole-server failure, the server can be temporarily powered off; or a whole zone could be powered off (to simulate a power failure of an entire rack of servers).
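
For instance, the disk-level and process-level failures described above can be injected from a small script. The sketch below is only a simplified illustration: the device path is a placeholder, and it assumes the standard swift-init helper and an fstab entry for the storage device.

```python
import subprocess


def fail_disk(device_mount="/srv/node/sdb1"):
    """Simulate a disk failure by unmounting one storage device on this node.
    Assumes the standard /srv/node/<device> layout and an /etc/fstab entry."""
    subprocess.run(["umount", device_mount], check=True)


def fail_object_server():
    """Simulate a Swift process failure by stopping the object server on this node."""
    subprocess.run(["swift-init", "object-server", "stop"], check=True)


# Example: degrade one storage node, run the benchmark workload, then restore.
fail_disk()
fail_object_server()
# ... run the benchmark workload here and record errors and response times ...
subprocess.run(["swift-init", "object-server", "start"], check=True)
subprocess.run(["mount", "/srv/node/sdb1"], check=True)
```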

Putting these considerations together, the total problem space of failure scenarios can be very large for a large-scale Swift cluster. So it is more practical to prioritize the failure scenarios, for example by evaluating only the worst-case or most common scenarios first.

In our presentation at the upcoming OpenStack Summit, we will present empirical results that show how a Swift cluster performs when hardware failures occur.

Sample Workloads

The COSBench tool allows users to define a Swift workload based on the following two aspects: (1) the range of object sizes in the workload (e.g. from 1MB to 10MB), and (2) the ratio of PUT, GET and DELETE operations (e.g. 1:8:1).

The object sizes in a workload may follow certain distributions, for example uniform, Zipfian and others. At this point, based on our experience with COSBench, it assumes the object sizes are uniformly distributed within the pre-defined range, and that all objects are equally likely to be accessed by GET operations. Adding more choices of distribution for object sizes and access patterns would be a good direction for COSBench.

Below are some sample Swift workloads, organized by object size and by whether the workload is upload-intensive or download-intensive.

Small objects (size range: 1KB - 100KB)

(1) Upload intensive (GET: 5%, PUT: 90%, DELETE: 5%). Example: an online gaming hosting service, where game sessions are periodically saved as small files that record user profiles and game information in time-series order.

(2) Download intensive (GET: 90%, PUT: 5%, DELETE: 5%). Example: a website hosting service, where a newly published webpage receives lots of read requests.

Large objects (size range: 1MB - 10MB)

(1) Upload intensive (GET: 5%, PUT: 90%, DELETE: 5%). Example: enterprise backup, where small files are compressed into large chunks of data and backed up to cloud storage; recovery and delete operations are needed only occasionally.

(2) Download intensive (GET: 90%, PUT: 5%, DELETE: 5%). Example: an online video sharing service, where newly uploaded video clips generate lots of download traffic as people watch them.

In addition, benchmark users are free to define their own workloads based on the same two inputs: the range of object sizes and the ratio of PUT, GET and DELETE operations.
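
A tiny sketch of such a user-defined workload generator, drawing operations according to a PUT:GET:DELETE ratio and object sizes uniformly from a range (mirroring the two inputs above), might look like this:

```python
import random


def generate_workload(num_ops, size_range=(1 * 2**20, 10 * 2**20), ratio=(1, 8, 1)):
    """Yield (operation, object_size) pairs following a PUT:GET:DELETE ratio,
    with PUT object sizes drawn uniformly from size_range (bytes)."""
    ops = ("PUT", "GET", "DELETE")
    for _ in range(num_ops):
        op = random.choices(ops, weights=ratio, k=1)[0]
        size = random.randint(*size_range) if op == "PUT" else None
        yield op, size


# Example: a download-intensive, large-object workload (PUT 5%, GET 90%, DELETE 5%).
for op, size in generate_workload(10, ratio=(1, 18, 1)):
    print(op, size)
```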

We will discuss the above dimensions and benchmark workloads in more detail in future blogs, as well as in our presentation at the OpenStack Summit in San Diego (at 4:10 PM on October 18th). We hope to see you there.

If you are thinking of putting together a storage cloud, we would love to discuss your challenges and share our observations. Please drop us a note at swift@zmanda.com.