Archive for the ‘Open Source’ Category

Amanda Enterprise 3.3.5 Release with Hyper-V and Vaulting Support

Friday, December 6th, 2013

Our objective with Amanda Enterprise is to continue to widen its umbrella to protect the most common IT infrastructure elements deployed in the modern data center. We also seek to support the various workflows for secondary and tertiary data that IT administrators need. The new release of Amanda Enterprise (3.3.5) delivers two key features toward these objectives.

Hyper-V Support

Hyper-V is a native hypervisor that is becoming increasingly popular, especially in Windows Server-centric IT shops. Amanda Enterprise now supports image-level backup of live VMs running on Hyper-V. We support Hyper-V installed as a role on either Windows Server 2012 or Windows Server 2008.

One license for the Zmanda Windows Client and Zmanda Client for Microsoft Hyper-V enables you to back up an unlimited number of VMs on one hypervisor. You run the Zmanda Windows Client on the hypervisor and configure backups of VMs via the Zmanda Management Console (ZMC, the GUI for Amanda Enterprise). ZMC discovers the VMs running on each hypervisor, and you can choose whether to back up all VMs or a selected set of VMs on each hypervisor.

A full consistent image of the VM is backed up to media of your choice - disk, cloud storage or tape.

Of course, you can continue to install and run separately licensed Zmanda Clients within your virtual machines (guests) to perform finer-grained backups. This enables you to recover from smaller disasters (like accidental deletion of a file) without recovering the whole VM. You can consider having different schedules for your file-level or application-level backups and your VM-level backups; e.g., file-level backups may be performed daily and VM-level backups each weekend.

Vaulting Support

While many of our customers have already implemented disk-to-disk-to-cloud (d2d2c) or disk-to-disk-to-tape (d2d2t) data flows using our products, we have formally introduced Vaulting as a feature supported via Zmanda Management Console. There are two key use cases supported by this feature:

  • Backup to disk and then Vault to lower cost disk or remote disk (via NFS)
  • Backup to disk and then Vault to cloud or tape for offsite storage

Our implementation allows very flexible media duplication (e.g., you can vault from multiple disk volumes to a smaller number of physical tapes). You can also choose to vault only the latest full backups for off-site archiving. All storage devices supported by Amanda Enterprise can be a source or destination for vaulting (e.g., for one backup set you can vault to a tape library, and for another backup set on the same backup server you can vault to cloud storage). The Amanda Enterprise backup catalog keeps track of the various backup copies and allows restoration from either the original backup volumes or the vaulted volumes.

Note that vaulting is most useful when you want to duplicate your backup images. If you simply want a data flow that periodically moves your backups, e.g. from disk to cloud, you can use the staging area in Amanda Enterprise and treat the staging area storage as your first backup location.
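
For reference, the underlying open source Amanda provides the amvault utility for this kind of duplication. Below is a hedged command-line sketch; the backup set name, label template and destination changer are illustrative, option names follow the community Amanda 3.3 amvault documentation, and in Amanda Enterprise the ZMC drives this for you:

# Vault only the latest full backups of backup set "DailySet1"
# to an offsite destination changer (e.g., a tape library or cloud changer).
amvault --latest-fulls \
        --label-template 'Offsite-%%%' \
        --dst-changer offsite_vault \
        DailySet1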

In addition to the above new features, Amanda Enterprise 3.3.5 includes several bug fixes and usability improvements. If you are interested in learning more, you can join one of our upcoming webinars (the next one is scheduled for December 18th): http://zmanda.com/webinars.html

If you are already on the Amanda Enterprise platform, you can discuss upgrading to 3.3.5 with your Zmanda support representative. If you are interested in evaluating Amanda Enterprise for your organization, please contact us at zsales@zmanda.com

ZRM Community Edition 3.0 (Beta) with Parallel Logical Backup Support for MySQL

Tuesday, April 23rd, 2013

We are pleased to announce the release of Zmanda Recovery Manager (ZRM) Community Edition 3.0 (Beta). This release adds support for parallel logical backups as an additional full backup method, made possible by integrating with the mydumper open source project. This backup method is a faster, more scalable way to back up large databases. The mysqldump (single-threaded logical backup) method is still available for backing up stored procedures/routines. ZRM Community Edition allows you to create multiple backup sets with different backup methods and policies, so you can now back up MySQL databases with both mydumper and mysqldump on the same server.
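
For context, here is a hedged sketch of the kind of mydumper invocation ZRM drives for a parallel logical backup; the host, credentials, database name, thread count and output directory are all illustrative, and ZRM normally generates the equivalent call from your backup set configuration:

# Parallel logical backup of one database with 4 threads and compressed output.
mydumper --host=db1.example.com --user=backup --password=secret \
         --database=sales --threads=4 --compress \
         --outputdir=/var/lib/mysql-zrm/dailyrun/sales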

We have also made many additional improvements and bug fixes since our earlier 2.2 release. We currently plan to release a final version of ZRM Community Edition 3.0 later this quarter, and in the meantime, we look forward to your feedback on the Zmanda forums.

Amanda Enterprise 3.3 brings advanced backup management features

Wednesday, March 20th, 2013

Built on extensive research and development, combined with active feedback from a thriving open source community, Amanda Enterprise (AE) 3.3 is here! AE 3.3 has significant architecture and feature updates and is a robust, scalable and feature-rich platform that meets the backup needs of heterogeneous environments, across Linux, Windows, OS X and Solaris-based systems.

As we worked to further develop Amanda Enterprise, it was important to us that the architecture and feature updates would provide better control and management for backup administration.  Our main goal was to deliver a scalable platform which enables you to perform and manage backups your way.

Key enhancements in Amanda Enterprise include:

Advanced Cloud Backup Management: AE 3.3 now supports use of many new and popular cloud storage platforms as backup repositories. We have also added cloud backup features to give users more control over their backups for speed and data priority.

Backup Storage Devices Supported by Amanda Enterprise 3.3

Platforms supported now include Amazon S3, Google Cloud Storage, HP Cloud Storage, Japan’s IIJ GIO Storage Service, and private and public storage clouds built on OpenStack Swift. Notably, AE 3.3 supports all current Amazon S3 locations, including various locations in the US (including GovCloud), EU, Asia, Brazil and Australia.

Cloud Storage Locations Supported by Amanda Enterprise

In addition to the new platforms, you can now control how many parallel backup (upload) or restore (download) streams you want, based on your available bandwidth. You can even throttle upload or download speeds at the backup set level; for example, you can give higher priority to the backup of your more important data.

Optimized SQL Server and Exchange Backups: If you are running multiple SQL Server or Exchange databases on a Windows server, AE 3.3 allows selective backup or recovery of an individual database. This enables you to optimize the use of your backup resources by selecting only the databases you want to back up, or to improve recovery time by enabling recovery of a selected database. Of course, the ability to do an express backup and recovery of all databases on a server is still available.

To optimize further, the Zmanda Management Console (the GUI for Amanda Enterprise) now automatically discovers databases on a specific Windows server, allowing you to simply pick and choose the ones you want to back up.

Improved Virtual Tape and Physical Tape Management: Our developers have done extensive work in this area to enhance usability, including seamless management of available disk space. With extensive concurrency added to the Amanda architecture, you can eliminate the staging disk for backup-to-disk configurations: AE 3.3 will write parallel streams of backups directly to disk without going through a staging disk. You can also optionally configure a staging disk for backup to tapes or clouds to improve fault tolerance and data streaming.

Better Fault Tolerance: When backing up to tapes, AE 3.3 can automatically withstand the failure of a tape drive. Simply configure a backup set to use more than one tape drive in your tape library; if one of those drives is unavailable, AE will automatically switch to an available drive.

NDMP Management Improvements: AE 3.3 allows for selective restore of a file or a directory from a Network Data Management Protocol (NDMP) based backup. Now, you can also recover to an alternative path or an alternative filer directly from the GUI. Support for compression and encryption for NDMP based backups has also been added to the GUI. Plus, in addition to devices from NetApp and Oracle, AE now also supports NDMP enabled devices from EMC.

Scalability, Concurrency and Parallelism: Many more operations can now be executed in parallel. For example, you can run a restore operation, while active backups are in progress. Parallelism also has been added in various operations including backup to disk, cloud and tapes.

Expanded Platform Support: Our goal is to provide a backup solution which supports all of the key platforms deployed in today’s data centers. We have updated AE 3.3 to support the latest versions of Windows Server, Red Hat Enterprise Linux, CentOS, Fedora, Ubuntu, Debian and OS X. With AE, you have the flexibility to choose the platforms best suited for each application in your environment – without having to worry about the backup infrastructure.

Want to Learn More?

There are many new enhancements to leverage! To help you dive in, we hosted a live demonstration of Amanda Enterprise 3.3. The session provides insights on best practices for setting up a backup configuration for a modern data center.

Zmanda Recovery Manager for MySQL - What’s New in Version 3.5

Wednesday, March 20th, 2013

As we continue to see MySQL being implemented in bigger and more challenging environments, we are working to ensure Zmanda Recovery Manager for MySQL (ZRM) matches this growth and provides a comprehensive, scalable backup management solution for MySQL that can easily integrate into any network backup infrastructure.

The latest release of ZRM for MySQL is a significant next step, bringing disk space and network usage optimization and enhanced backup reporting, along with simplified management to help you configure backups quickly and intelligently. Additionally, ZRM for MySQL 3.5 now supports backup of MySQL Enterprise Edition, MySQL Community Edition, SkySQL, MariaDB, and MySQL databases running on the latest versions of Red Hat Enterprise Linux, CentOS, Debian, Ubuntu and Windows – giving you an open choice for your MySQL infrastructure, now and in the future, with confidence that your backup solution will continue to work.

Here is a look at the key updates in ZRM:

Optimization of Disk Space: We’ve implemented streaming for various backup methods so that you don’t need to provision additional disk space on the systems running MySQL servers. This allows you to do hot backups of your MySQL databases without allocating additional space on the system running MySQL; backup data is stored directly on the ZRM server.

Optimization of Network Usage: We have implemented client-side compression for various backup methods, so you can choose to compress backup data before it is sent to the ZRM server. Of course, you still have the option to compress on the backup server, for example if you don’t want to burden the MySQL server with the compression workload.

Enhanced Backup Reporting: Backup is often where IT meets compliance. ZRM allows you to generate backup reports for all of the MySQL databases in your environment. With the latest version, now you can generate unified backup reports across backup sets too.

Simplified Management: One of the key features of ZRM is that it hides the nuances of particular MySQL backup methods behind an easy-to-use GUI, the Zmanda Management Console (ZMC). With the new release, ZMC exposes new options for applicable backup methods, such as parallelism and throttling. You will also find several tool tips to help you configure your backups quickly and intelligently, without having to dig through documentation on specific backup methods.

Broad Platform Coverage: MySQL gets implemented in various shapes and forms on various operating systems. We continue to port and test all variants of MySQL on all major operating system platforms. ZRM 3.5 supports backup of MySQL Enterprise Edition, MySQL Community Edition, SkySQL and MariaDB. Backup of MySQL databases running on latest versions of Red Hat Enterprise Linux, CentOS, Debian, Ubuntu and Windows is supported.

Seamless Integration with Backup Infrastructure: ZRM is architected for MySQL DBAs. So that DBAs can integrate with and comply with the overall backup methodology of their corporate environment, we have made sure that ZRM integrates well with any network backup infrastructure in use. While ZRM is already known to work well with almost all network backup environments, we have completed specific integration and testing of ZRM 3.5 with Amanda Enterprise, Symantec NetBackup, and Tivoli Storage Manager.

If you are putting together a new MySQL based environment, or looking to add a well managed backup solution to your existing MySQL infrastructure, our MySQL backup solutions team is ready to help: zsales@zmanda.com

Quota Project: An effective way to manage the usage of your Swift-based storage cloud

Thursday, January 31st, 2013

During the OpenStack Folsom Design Summit in April 2012, there was an interesting workshop discussion on Swift quotas. The topic has been actively and formally discussed in many forums (Link1, Link2) and is also tracked as one of the blueprints in OpenStack Swift. Here are some of our key takeaways and insights on what this means for your storage cloud.

Swift Quota: Business Values

The business value of implementing Swift Quota is two-fold:

(1) Protect the Cluster: Cloud operators can conveniently set effective limits (e.g., a limit on the number of objects per container) to protect the Swift cluster from malicious behavior, for example creating millions of 0-byte objects to slow down the container database, or creating thousands of empty containers to overload the account database.

(2) Manage Storage Capacity: Cloud storage providers can sell their cloud storage capacity upfront, similar to the Amazon EC2 reserved-instance pricing model. The provider sells a fixed amount of storage capacity (e.g., 1 TB) to a customer by setting a capacity limit for that customer, and does not need to be concerned with how the customer uses that capacity (e.g., 100% all the time, or 50% today and 95% next month). The vendor simply charges the customer for the fixed amount of storage capacity (and possibly other resource usage, such as the number of PUT, GET and DELETE operations), without having to precisely track and calculate how much storage each customer uses on an ongoing basis.

In summary, Swift Quota is interesting to cloud storage operators and providers because it enables effective and robust resource (e.g., capacity) management and improves the overall usability of a Swift-based storage cloud.

Today, we would like to introduce an interesting Swift Quota project that we have been focusing on, which has been used in StackLab – a production public cloud where users can try out OpenStack for free. (Details about StackLab can be found at http://freedomhui.com/stacklab/.)

Swift Quota Introduction

Swift Quota is a production-ready project that is mainly used for controlling the usage of accounts and containers in OpenStack Swift. In the current version of Swift Quota, users can set quotas on the following three items:

(1) Number of containers per account (example: an account cannot have more than 5 containers)

(2) Number of objects per container (example: a container cannot have more than 100 objects)

(3) Storage capacity per container (example: the size of a container cannot be larger than 100 GB)

Swift Quota is implemented as a middleware layer in Swift, so it is simple and straightforward to integrate and merge with the mainstream Swift code. The idea of Swift Quota is not to create new separate counters to keep track of resource usage, but to utilize the existing metadata associated with containers and accounts, so it is very lightweight in a production environment.

Swift Quota Installation

Before we go any further, we’d like to thank AlexYangYu for his contribution to this project. The project is available at Alex’s github repository.

To install Swift Quota, you can either check out the modified Swift code from the github repository above (git clone git://github.com/AlexYangYu/StackLab-swift.git), switch to the branch called “dev-quota” (git checkout dev-quota) and install the modified Swift software on the cluster nodes, or follow the commit history to figure out which changes are new and merge them into your existing Swift code base. A sketch of the first option is shown below.
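
Here is a hedged sketch of those steps; the final install command assumes the setup.py based source installation that Swift used at the time, so adapt it to your packaging:

git clone git://github.com/AlexYangYu/StackLab-swift.git
cd StackLab-swift
git checkout dev-quota
# Install the modified Swift on every proxy and storage node
# (assumes a source install; adapt to your packaging).
sudo python setup.py install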

Configuration File

To enable Swift Quota, /etc/swift/proxy-server.conf should be adjusted as follows (the quota entry in the pipeline and the [filter:quota] section are the new settings):

[pipeline:main]
pipeline = catch_errors cache token auth quota proxy-server

[filter:quota]
use = egg:swift#quota
cache_timeout = 30
# If set precise_mode = true, the quota middleware will disable the cache.
precise_mode = true
set log_name = quota
quota = {
    "container_count": {
        "default": 5,
        "L1": 10,
        "L2": 25
    },
    "object_count": {
        "default": 200000,
        "L1": 500000,
        "L2": 1000000
    },
    "container_usage": {
        "default": 2147483648,
        "L1": 10737418240,
        "L2": 53687091200
    }
}

With the above configuration settings, each of the three resource quotas has three levels of limits: default, L1 and L2. The intent is to provide a flexible, configurable interface for the cloud operator (e.g., reseller_admin) to specify a quota level for each account. For example, the cloud operator can assign the “L1” quota level to one account and the “L2” level to a different account. If the quota level is not explicitly specified, an account follows the “default” quota level. Cloud operators are free to define as many quota levels as they want for their own use cases. Next, we will show how to specify the quota level for an account.

Assigning Quota Level to an Account

We assume only the reseller_admin can modify the quota level for an account, so make sure you have a reseller_admin login in your authentication system. For example,

[filter:tempauth]
use = egg:swift#tempauth
user_system_root = testpass .admin http://your_swift_ip:8080/v1/AUTH_system
user_reseller_reseller = reseller .reseller_admin http://your_swift_ip:8080/v1/AUTH_reseller

Then, we use this curl command to retrieve the X-Auth-Token of the reseller_admin

curl -k -v -H 'X-Storage-User: reseller:reseller' -H 'X-Storage-Pass: reseller' http://your_swift_ip:8080/auth/v1.0

Next, we use this curl command to edit the quota level of an account, called “system”. For example,

curl -v -X POST http://your_swift_ip:8080/v1/AUTH_system -H 'X-Auth-Token: your reseller_admin token' -H 'X-Account-Meta-Quota: L1'

Note that, in the above curl command, 'X-Account-Meta-Quota: L1' assigns the L1 quota level to the account called “system”.

Similarly, the following curl command will update the quota level to L2

curl -v -X POST http://your_swift_ip:8080/v1/AUTH_system -H 'X-Auth-Token: your reseller_admin token' -H 'X-Account-Meta-Quota: L2'

If everything works correctly, you will receive a “204 No Content” response from the server after you issue the above curl commands.
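
You can verify the assignment by reading the account metadata back; a hedged example using the same endpoint and token as above:

# HEAD the account; the response headers should include
# X-Account-Meta-Quota: L1 (or L2)
curl -I http://your_swift_ip:8080/v1/AUTH_system -H 'X-Auth-Token: your reseller_admin token'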

Trade-off between Cluster Performance and Quota Accuracy

It is possible to trigger a quota check upon each PUT request to guarantee that no quota violation is allowed. However, when hardware resources are in short supply and the workload becomes very intensive, a check on every PUT request may affect Swift cluster performance. So, in the current design of Swift Quota, two parameters, precise_mode and cache_timeout under [filter:quota] in /etc/swift/proxy-server.conf, let you balance cluster performance against quota accuracy.

When precise_mode is set to true, cache_timeout has no effect and the Swift cluster checks the quota on each PUT request by reading the current container and account usage from the server. When precise_mode is set to false, the Swift cluster only reads the container and account usage cached in memory; cache_timeout then decides how often the cached information is refreshed from the server.
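
For example, a deployment that favors cluster performance over strict quota accuracy could relax the earlier settings along these lines (values are illustrative; the quota = { ... } levels stay unchanged):

[filter:quota]
use = egg:swift#quota
# Serve quota checks from cached usage, refreshed every 60 seconds.
precise_mode = false
cache_timeout = 60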

Closing Comments

We are happy to see that Swift Quota has been in production in the StackLab environment for almost 6 months, and we believe Swift Quota is a neat and clear design that will be adopted by more Swift users.

If you are thinking of putting together a storage cloud, or thinking of introducing Quota to your Swift cluster, we would love to discuss your challenges and share our observations. Please drop us a note at swift@zmanda.com.

Backward Compatible Keystone-based OpenStack Swift

Thursday, January 10th, 2013

In a previous blog, we proposed a method to enable Cyberduck to work with Keystone-based Swift: upgrading the java-cloudfiles API to 2.0 in Cyberduck. We received a lot of feedback on it, and we appreciate it. Today, we move one step forward and propose a more reliable and straightforward way to make older Swift clients, such as Cyberduck, work with Keystone-based Swift.

The high-level idea of this new method is to add v1.0 authentication middleware in Keystone, while keeping the client, in this case Cyberduck, unchanged. Thanks to AlexYangYu for providing the v1.0-enabled Keystone code base; it’s available at:

https://github.com/AlexYangYu/StackLab-Ketystone/tree/dev-protocol-convertor

If you want to keep using your own version of Keystone, rather than removing it and using the Keystone from the above location, you need to follow the steps below.

First, add the following files to your existing Keystone code base:

https://github.com/AlexYangYu/StackLab-Ketystone/commit/9e126d6716912e8822de3884c32f5b9509ef0994

Then, after incorporating the middleware to support v1.0 authentication in Keystone, you need to recompile and install the modified Keystone code base.
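
A hedged sketch of that step, assuming a source (setup.py) install of Keystone and using the branch from the repository linked above; adjust the paths and your restart mechanism to your distribution:

git clone https://github.com/AlexYangYu/StackLab-Ketystone.git
cd StackLab-Ketystone
git checkout dev-protocol-convertor
# Install the modified Keystone (assumes a source install).
sudo python setup.py install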

Next, change the Keystone configuration file (/etc/keystone/keystone.conf) as follows (the /v1.0 mapping, the [pipeline:public_api_v1] section and the [filter:protocol_converter] section are the additions to the default keystone.conf):

[composite:main]
use = egg:Paste#urlmap
/v2.0 = public_api
/v1.0 = public_api_v1
/ = public_version_api

[pipeline:public_api_v1]
pipeline = protocol_converter token_auth admin_token_auth xml_body json_body debug ec2_extension public_service

[filter:protocol_converter]
paste.filter_factory = keystone.contrib.protocol_converter:ProtocolConverter.factory

Finally, you need to restart the keystone service.

On the client side, you follow the standard configuration procedure traditionally used with v1.0 authentication. For Cyberduck, you can follow the steps here to set the Authenticate Context Path (ch.sudo.cyberduck cf.authentication.context /auth/v1.0).
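
On OS X, Cyberduck hidden preferences of this kind are typically set with the defaults utility; a hedged one-liner for the key mentioned above (on Windows the equivalent setting goes into the user.config file):

# Point Cyberduck's Swift connections at the v1.0 auth endpoint.
defaults write ch.sudo.cyberduck cf.authentication.context /auth/v1.0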

We have verified this method on both PC and Mac platforms with the latest version of Cyberduck and other v1.0 authentication based Swift clients.

If you are thinking of putting together a storage cloud, we would love to discuss your challenges and share our observations. Please drop us a note at swift@zmanda.com.

Zmanda - A Carbonite Company

Wednesday, October 31st, 2012

I am very excited to share that today Zmanda has combined forces with Carbonite - the best known brand in cloud backup. I want to take this opportunity to introduce you to Carbonite and tell you what this announcement means to the extended Zmanda family, including our employees, customers, resellers and partners.

First, we become “Zmanda - A Carbonite Company” instead of “Zmanda, Inc.” and I will continue to lead the Zmanda business operations. Carbonite will continue to focus on backing up desktops, laptops, file servers, and mobile devices. Zmanda will continue to focus on backup of servers and databases. Carbonite’s sales team will start selling Zmanda Cloud Backup directly and through its channels. Since Carbonite already has a much larger installed base of users and resellers, our growth should accelerate considerably next year which will allow us to innovate at an even higher level than before. Zmanda’s direct sales team and resellers will continue to offer the Zmanda products they respectively specialize in.

I’ve gotten to know Carbonite over the last few months and I am very impressed with their organization and am looking forward to joining the management team. One of the things that attracted me to Carbonite was its commitment to customer support. Carbonite has built a very impressive customer support center in Lewiston, Maine, about a two hour drive north of their Boston headquarters, where it now employs a little over 200 people. We’ll be training a technical team in Maine to help us support Zmanda Cloud Backup, and of course we’ll also be keeping our support teams in Silicon Valley and in Pune, India for escalations and support of Amanda Enterprise and Zmanda Recovery Manager for MySQL. Please note that at this point, all our current methods of connecting with customer support including Zmanda Network, will continue as is.

Another thing that makes Carbonite a good fit for us is its commitment to ease of use. Installing and operating Carbonite’s backup software is as easy as it gets. We share this goal, and we hope to learn a thing or two from the Carbonite team on this front as we continue to execute an aggressive roadmap across all our product lines.

We’ve worked hard to make Zmanda products as robust as possible. Our technologies, including our contributions to the open source Amanda backup project, have been deployed on over a million systems worldwide. Amanda has hundreds of man years of contributed engineering. We believe it is one of the most solid and mature backup systems in the world. Much of what we have done for the past five years has been to enhance the open source code and provide top notch commercial support. Carbonite, too, understands that being in the backup business requires the absolute trust of customers and I believe that every day the company works hard to earn that trust: it respects customer privacy, is fanatical about security, and has made a real commitment to high quality support.

I and the other Zmanda employees are very enthusiastic and proud to be joining forces with Carbonite. We look forward to lots of innovation in the Zmanda product lines next year and hope that you will continue to provide us with the feedback that has been so helpful in the evolution of our products.

Swift @ OpenStack Summit 2012

Thursday, October 25th, 2012

We just came back from the OpenStack Summit 2012 in San Diego. The Summit was full of energy, and the rapid progress of the OpenStack project, on both technical and business fronts, was palpable.

Our participation was focused around OpenStack Swift, and here are three notable sessions (including our own!) on the topic:

(1) COSBench: A Benchmark Tool for Cloud Object Storage Service: Folks from Intel presented how they designed and implemented a cloud storage benchmark tool, called COSBench (Cloud Object Storage Benchmark), for OpenStack Swift. In our previous blog, we briefly introduced COSBench and our expectation that it will become the de facto Swift benchmarking tool. In this session, the presenters also demonstrated how to use COSBench to analyze the bottleneck of a Swift cluster under a given workload. The most promising point in this session was the indication that COSBench is going to be released to the open-source community. The slides for the session are available here.

(2) Building Applications with OpenStack Swift: In this very interesting talk from SwiftStack, a primer was provided on how to build web-based applications on top of OpenStack Swift. The presenters went down to code level to explain how to extend and customize Swift authentication and how to develop custom Swift middleware, with the goal of seamlessly integrating web applications with the Swift infrastructure. A very useful presentation for developers who are thinking about how to build applications for Swift.

(3) How Swift is your Swift?: The goal of this presentation (from Zmanda) was to shed light on the provisioning problem for Swift infrastructure. We looked at almost every hardware and software component in Swift and discussed how to pick the appropriate hardware and software settings to optimize upfront cost and performance. We also talked about the performance degradation when a failure (e.g., node or HDD failure) happens. Our slides are available here.

All in all the Summit was a great step forward in the evolution of Swift.

If you are thinking of putting together a storage cloud, we would love to discuss your challenges and share our observations. Please drop us a note at  swift@zmanda.com

Cyberduck with support for Keystone-based OpenStack Swift

Tuesday, August 28th, 2012

Cyberduck is a popular open source storage browser for several cloud storage platforms. For OpenStack Swift, Cyberduck is a neat and efficient client that enables users to upload/download objects to/from their Swift storage clouds. However, the latest version (4.2.1) of Cyberduck does not support the Keystone-based authentication method. Keystone is an identity service used by OpenStack for authentication and authorization, and we expect it to be the standard identity service for future Swift clouds.

There have been intensive discussions on how to make Cyberduck work with Keystone-based Swift, for example [1], and this issue has been listed as the highest priority for the next release of Cyberduck.

So, we decided to dig into making Cyberduck work with Keystone-based Swift. First, we thank the Cyberduck team for making the compilation tools available to enable this task. Second, special thanks to David Kocher for guiding us through the process.

The key is to make the java-cloudfiles API support Keystone first, because Cyberduck uses the java-cloudfiles API to communicate with Swift. We thank AlexYangYu for providing the initial version of the modified java-cloudfiles API that supports Keystone. We made several improvements on top of it, and our fork is available here:

https://github.com/zmanda/java-cloudfiles.git

The high-level steps are to replace the older cloudfiles-1.9.1.jar in the lib directory of Cyberduck with the java-cloudfiles.jar that supports Keystone authentication. In addition, we also need to copy org-json.jar from the lib directory of java-cloudfiles to the lib directory of Cyberduck.
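
A hedged sketch of that jar swap; the Cyberduck source path is illustrative, and the location of the built java-cloudfiles.jar depends on how you build the fork:

git clone https://github.com/zmanda/java-cloudfiles.git
# Build java-cloudfiles.jar from the fork (output location depends on the build), then:
rm /path/to/cyberduck/lib/cloudfiles-1.9.1.jar
cp /path/to/built/java-cloudfiles.jar /path/to/cyberduck/lib/
cp java-cloudfiles/lib/org-json.jar /path/to/cyberduck/lib/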

To make sure Cyberduck uses the modified java-cloudfiles API, Cyberduck needs to be re-compiled after making the above changes. Generally, we need to follow the steps here to set the Authenticate Context Path; in this case, we need to add the following setting to the AppData\Cyberduck.exe_Url_*\[Version]\user.config file:

<setting name="cf.authentication.context" value="/v2.0/tokens" />

After that, we can run the re-compiled Cyberduck and associate it with a Swift cloud. For example,

In the Username field, we need to use the following format: username:tenant_name. The API Access Key is the password for the username. If the authentication information is correct, we will see that Cyberduck has successfully connected to the Keystone-based Swift cloud storage.
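
Behind the scenes, the Keystone-enabled java-cloudfiles API turns these credentials into a standard Keystone v2.0 tokens request; an equivalent curl call for reference (endpoint, username, password and tenant are illustrative):

curl -s -X POST http://your_keystone_ip:5000/v2.0/tokens \
     -H 'Content-Type: application/json' \
     -d '{"auth": {"passwordCredentials": {"username": "demo", "password": "secret"}, "tenantName": "demo"}}'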

The following images show that you can use Cyberduck to save any kind of files, e.g. pictures and documents, on your Swift cloud storage. You can even rename any files and open them for editing.

You can download our version of Cyberduck for Windows with support for Keystone by running git clone https://github.com/zmanda/cyberduck or from here. Once the file is unzipped, you can execute cyberduck.exe to test against your Keystone-based Swift.

If you want to know more detail about how we made this work, or you would like to compile or test for other platforms, e.g. OS X, please drop us a note at swift@zmanda.com

Next Steps with OpenStack Swift Advisor - Profiling and Optimization (with Load Balancer in the Mix)

Sunday, April 22nd, 2012

In our last blog on building Swift storage clouds, we proposed the framework for the Swift Advisor: a technique that takes two of the three constraints (Capacity, Performance, Cost) as inputs and provides hardware recommendations as output, specifically the count and configuration of systems for each type of node (storage and proxy) of the Swift storage cloud (Swift Cloud). We also provided a subset of our initial results for the Sampling phase.

In this blog, we continue the discussion of the Swift Advisor, first focusing on the impact of the load balancer on the aggregate throughput of the cloud (which we will refer to as "throughput"), and then providing a subset of outcomes for the Profiling and Optimization phases in our lab.

Load Balancer

The load balancer distributes the incoming API requests evenly across the proxy servers. As shown below, the load balancer sits in front of the proxy servers to forward the API requests to them and can be connected with any number of proxy servers.

Load balancer in front of the proxy servers of the Swift Cloud

If a load balancer is used, it is the only entry point of the Swift Cloud and all user data goes through it, so it is a very important component for the user-visible performance of your Swift Cloud. If it is not properly provisioned, it will become a severe bottleneck that inhibits the scalability of the Swift Cloud.

At a high-level, there are two types of load balancers:

Software Load Balancer: Runs load-balancing software (e.g., Pound, Nginx) or round-robin DNS on a server to evenly distribute requests among the proxy servers. The server running the software load balancer usually requires powerful multi-core CPUs and very high network bandwidth.

Hardware Load Balancer: Leverages a network switch/firewall or dedicated hardware with load-balancing capability to distribute the incoming data traffic across the proxy servers of the Swift Cloud.

Regardless of whether a software or hardware load balancer is used, the throughput of the Swift cloud cannot scale beyond the bandwidth of the load balancer. Therefore, we advise cloud builders to deploy a powerful load balancer (e.g., with 10 Gigabit Ethernet) so that its "effective" bandwidth exceeds the expected throughput of the Swift cloud. We recommend picking your load balancer so that, with a fully loaded (i.e. 100% busy) Swift Cloud, the load balancer still has around 50% unused capacity for future growth or sudden needs for higher bandwidth.
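
For reference, a minimal Pound configuration (one of the software load balancer options mentioned above) that spreads incoming Swift API requests across two proxy nodes might look like the sketch below; addresses and ports are illustrative:

ListenHTTP
    Address 0.0.0.0
    Port    8080
End

Service
    # proxy node 1
    BackEnd
        Address 10.0.1.11
        Port    8080
    End
    # proxy node 2
    BackEnd
        Address 10.0.1.12
        Port    8080
    End
End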

To get a sense of how to properly provision the load balancer and how it impacts the throughput of the Swift Cloud, we show some results of running a Swift Cloud with c proxy nodes and cN storage nodes (a c:cN Swift Cloud) behind a load balancer. (N is the "magic" value for the 1:N Swift Cloud found in the Sampling phase.) These results are the "performance curves" for the Profiling phase and can be directly used for optimizing your goal.

The experiments

In our last article, we used some running examples to show how to get the output results from the Sampling phase. Here, we directly use the outputs (1:N Swift cloud) of the Sampling phase as the inputs of the Profiling phase, as seen below:

  • 1 Large Instance based proxy node: 5 Small Instance based storage nodes (N=5)
  • 1 XL Instance based proxy node: 5 Small Instance based storage nodes (N=5)
  • 1 CPU XL Instance based proxy node: 5 Small Instance based storage nodes (N=5)
  • 1 Quad Instance based proxy node: 5 Medium Instance based storage nodes (N=5)

Based on the above 1:5 Swift clouds, we profile the throughput curves of the c:c5 Swift cloud (c = 2, 4, 6, …) with the following load balancer setups:

  1. Using one "Cluster Compute Eight Extra Large Instance" (Eight) running Pound (a reverse-proxy load balancer) as the software load balancer ("1 Eight"), with all proxy nodes connected to it. (The Eight instance is one level more powerful than the Quad instance. Like the Quad instance, it has 10 Gigabit Ethernet, but it has twice the CPU resources, 2 x Intel Xeon E5-2670 eight-core "Sandy Bridge" processors, and twice the memory.)
  2. Using two identical Eight instances (each running Pound) as the load balancers ("2 Eight"). Half of the proxy nodes are connected to the first Eight instance and the other half to the second. The storage nodes are unaware of this split and accept data from all of the proxy nodes.

Again, we use Amanda Enterprise as our application to back up a 20GB data file to the c:c5 Swift Cloud. We concurrently run two Amanda Enterprise servers on two EC2 Quad instances to send data to the c:c5 Swift cloud, ensuring that the two Amanda Enterprise servers can fully load the c:c5 Swift cloud in all cases.

For this experiment, we focus on backup operations, so the aggregate throughput of backup operations is simply regarded as "throughput" (MB/s), measured between the two Amanda Enterprise servers and the c:c5 Swift cloud.

Let’s first look at the throughput curves (throughput on Y-axis, values of c on X-axis) of c:c5 Swift cloud with the two types of load balancers for each of above mentioned configurations of proxy and storage nodes.

(1) Proxy nodes run on the Large instance and the storage nodes run on the Small instance. The two curves are for the two types of load balancers (LB):

Proxy nodes run on the Large instance

(2) Proxy nodes run on the XL instance and the storage nodes run on the Small instance.

Proxy nodes run on the XL instance

(3) Proxy nodes run on the CPU XL instance and the storage nodes run on the Small instance.

Proxy nodes run on the CPU XL instance

(4) Proxy nodes run on the Quad instance and the storage nodes run on the Medium instance.

Proxy nodes run on the Quad instance

From the above 4 figures, we can see that the throughput of the c:c5 Swift cloud using 1 Eight instance as the load balancer cannot scale beyond 140 MB/s, while with 2 Eight instances as load balancers, the c:c5 Swift Cloud scales linearly (for the values of "c" we tested).

Next, we combine the above results for the "2 Eight" load balancer into one chart and look at it from another point of view: throughput on the Y-axis, cost ($) on the X-axis. (As you may recall from our last blog, the cost is defined as the EC2 usage cost of running the c:c5 Swift cloud for 30 days.)


The above graph tells us several things:

(1) The configuration of using CPU XL instances for proxy nodes and Small instances for storage nodes is not a good choice: compared with the configuration of XL instances for proxy nodes and Small instances for storage nodes, it costs about the same but delivers lower throughput. The reason is our observation that XL instances provide better bandwidth than CPU XL instances. AWS marks the I/O performance (including network bandwidth) of both the XL instance and the CPU XL instance as "High", but in our pure network bandwidth testing, the XL instance showed a maximum of 120 MB/s for both incoming and outgoing traffic, while the CPU XL instance topped out at 100 MB/s in both directions.

(2) The configuration of using Large instances for proxy nodes and Small instances for storage nodes is the most cost-effective: within each throughput group marked as a dotted circle in the figure (low, medium and high), it achieves similar throughput at a much lower cost. This configuration is cost-effective because a Large instance provides a maximum of roughly 100 MB/s for both incoming and outgoing network bandwidth, similar to the XL and CPU XL instances, but at half their cost.

(3) While using Large instances for proxy nodes and Small instances for storage nodes is very cost-effective, the configuration of Quad instances for proxy nodes and Medium instances for storage nodes is also an attractive option, especially when you consider manageability and failures. To achieve 175 MB/s throughput, you can choose either 8 Large-instance-based proxy nodes and 40 Small-instance-based storage nodes (48 nodes in total), or 4 Quad-instance-based proxy nodes and 20 Medium-instance-based storage nodes (24 nodes in total). Hosting and managing more nodes in the data center may incur higher IT-related costs, e.g., power, number of server racks, failure rate and IT administration. Considering those costs, it may be more attractive to set up a Swift Cloud with a smaller number of more powerful nodes.

Based on the data in the above figure, and taking IT-related costs into account, the goal of the Optimization phase is to choose the configuration that best meets your goal. For example, suppose you input the performance and capacity constraints and want to minimize cost, and two configurations, (1) Large instances for proxy nodes with Small instances for storage nodes, and (2) Quad instances for proxy nodes with Medium instances for storage nodes, both satisfy your capacity constraint. The only remaining question is which configuration fulfills the throughput constraint at a lower cost. The answer depends on your IT management costs: if they are relatively high, you may want to choose the second configuration; otherwise, the first configuration will likely cost less.

In future articles, we will talk about how to map the EC2 instances to physical hardware so that cloud builders can build an optimized Swift cloud running on physical servers.

If you are thinking of putting together a storage cloud, we would love to discuss your challenges and share our observations. Please drop us a note at  swift@zmanda.com