Archive for May, 2012

Building a Swift Storage Cloud? Avoid Wimpy Proxy Servers and Five Other Pitfalls

Wednesday, May 23rd, 2012

We introduced OpenStack Swift Advisor in a previous blog: a set of methods and tools to help cloud storage builders select appropriate hardware based on the goals for their Swift Cloud. Here we describe six pitfalls to avoid when choosing components for your Swift Cloud:

(1) Do not use wimpy servers for proxy nodes

The key function of a proxy node is to process a very large number of API requests, receive data from the user applications and send it out to the corresponding storage nodes. The proxy node makes sure that the minimum number of required replicas gets written to the storage nodes. Reply traffic (e.g. restore traffic, in case the Swift Cloud is used for cloud backup) also flows through the proxy nodes. Moreover, the authentication services (e.g. Keystone, swauth) may also be integrated into the proxy nodes. Considering these performance-critical functions performed by proxy nodes, we strongly advise cloud storage builders to use powerful servers as proxy nodes. For example, a typical proxy node can be provisioned with two or more multi-core Xeon-class CPUs, large memory and 10 Gigabit Ethernet.

There is some debate on whether a small number of powerful servers or a large number of wimpy servers should be used as proxy nodes. The initial cost outlay of a large number of wimpy proxy nodes may be lower than that of a smaller number of powerful nodes while providing acceptable performance. But for data center operators, a large number of wimpy servers will inevitably incur higher IT-related costs (personnel, server maintenance, space rental, cooling, energy and so on). Additionally, more servers need more network switches, which erodes some of the cost benefit and increases the failure rate. And as your cloud storage service gets popular, scaling will be challenging with wimpy proxy nodes.

(2) Don’t let your load-balancer be overloaded

The load-balancer is the first component of a Swift cluster that directly faces the user applications. Its primary job is to take all API requests from the user applications and distribute them evenly across the underlying proxy nodes. In some cases, it also has to perform SSL termination to authenticate users, which is a very CPU- and network-intensive job. An overloaded load-balancer defeats its own purpose by becoming the bottleneck of your Swift cluster’s performance.

As we discussed in a previous blog (Next Steps with OpenStack Swift Advisor), the linear performance scalability of a Swift cluster can be seriously inhibited by a load-balancer that cannot keep up with the load. To reap the benefits of your investment in proxy and storage nodes, make sure the load-balancer is not underpowered, especially for peak load conditions on your storage cloud.

(3) Do not under-utilize your proxy nodes

The proxy node is usually one of the most expensive components in a Swift cluster. Therefore, cloud builders will want to fully utilize the resources of their proxy nodes. A good question our customers ask is: how many storage nodes should I attach to a proxy node? Or, what is the best ratio between proxy and storage nodes? If your cloud is built with too few storage nodes per proxy node, you may be under-utilizing your proxy nodes, as shown in Figure 1(a). (While we have simplified the illustrations, the performance changes indicated in the following figures are based on actual observations in our labs.) In this example, the Swift cluster initially consists of 3 nodes: 1 proxy node and 2 storage nodes (we use capital P and S in the figures to denote proxy and storage nodes, respectively). The write throughput of this 3-node Swift cluster is X MB/s. However, if we add two more storage nodes, as shown in Figure 1(b), the throughput of the resulting 5-node Swift cluster becomes 2X MB/s. So the throughput, along with the capacity, of the Swift cluster can be doubled (2X) simply by adding two storage nodes. In terms of cost per throughput and cost per GB, the 5-node Swift cluster in this example will likely be more efficient.
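
To make the cost argument concrete, here is a minimal sketch of the two metrics for the clusters in Figure 1. All prices, capacities and the baseline throughput X are hypothetical placeholders, not measured values:

    # Illustrative cost per throughput and cost per GB for the clusters in Figure 1.
    PROXY_COST = 6000.0            # $ per proxy node (assumed)
    STORAGE_COST = 2500.0          # $ per storage node (assumed)
    STORAGE_CAPACITY_GB = 8000.0   # usable GB per storage node (assumed)
    X = 100.0                      # baseline write throughput in MB/s (assumed)

    def cluster_metrics(num_proxy, num_storage, throughput_mbs):
        """Return (cost per MB/s, cost per GB) for a simple Swift cluster."""
        cost = num_proxy * PROXY_COST + num_storage * STORAGE_COST
        capacity_gb = num_storage * STORAGE_CAPACITY_GB
        return cost / throughput_mbs, cost / capacity_gb

    # Figure 1(a): 1 proxy + 2 storage nodes at X MB/s
    # Figure 1(b): 1 proxy + 4 storage nodes at 2X MB/s
    for label, storage, tput in [("Fig 1(a)", 2, X), ("Fig 1(b)", 4, 2 * X)]:
        per_mbs, per_gb = cluster_metrics(1, storage, tput)
        print(f"{label}: ${per_mbs:.0f} per MB/s, ${per_gb:.2f} per GB")

With these assumed numbers, both metrics come out lower for the 5-node cluster, which is the point of this pitfall: adding storage nodes until the proxy is well utilized improves cost efficiency.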

(4) Do not over-utilize the proxy nodes

On the other hand, you cannot keep attaching storage nodes without increasing your proxy nodes at some point. In Figure 2(a), 1 proxy node is well-utilized by 4 storage nodes, with 2X MB/s throughput. If more storage nodes are attached to the proxy node, as shown in Figure 2(b), throughput will not increase because the proxy node is already busy serving the 4 storage nodes. Therefore, attaching more storage nodes to a well-utilized (nearly 100% busy) proxy node only makes the Swift cluster less efficient in terms of cost per throughput. Note, however, that you may decide to over-subscribe proxy nodes if you are willing to forgo the potential performance gains of adding more proxy nodes and simply want to add capacity for now. But to increase capacity, first make sure you are adding enough disks to each storage node, as described in the next pitfall.

(5) Avoid disk-bound storage nodes

Another common question we get is: how many disks should I put into my storage node? This is a crucial question, with implications for cost/performance and cost/capacity. In general, you want to avoid storage nodes whose performance is bottlenecked by too few disk spindles, as illustrated by the following figures.

Figure 3(a) shows a Swift cluster consisting of 1 proxy node and 2 storage nodes, with each storage node attached to 1 disk. Let’s assume the throughput of this Swift cluster is Y MB/s. If we add one more disk to each storage node in Figure 3(a), we will have two disks per storage node, as shown in Figure 3(b). Based on our observations, the throughput of the new Swift cluster may increase to as much as 1.5Y MB/s. The reason throughput improves simply by attaching more disks is that in Figure 3(a) the single disk in each storage node can easily be overwhelmed (i.e. 100% busy) when transferring data to/from the storage node, while other resources (e.g. CPU, memory) in the storage node are not fully utilized; the storage node is “disk-bound”. Once more disks are added to each storage node and all disks can work in parallel during data transfers, the bottleneck of the storage node shifts from the disks to other resources, and the throughput of the Swift cluster improves. In terms of cost per throughput, Figure 3(b) is more efficient than Figure 3(a), since the cost of adding a few disks is significantly less than the cost of a whole server.

An immediate follow-up question is: can the throughput keep increasing as more disks are attached to each storage node? Of course, the answer is no. Figure 4 shows the relationship between the number of disks attached to each storage node and the throughput of the Swift cluster. As the number of disks increases from 1, the throughput does improve, but after some point (we call it the “turning point”) the throughput stops increasing and stays almost flat.

Even though the throughput of the Swift cluster cannot keep improving as more disks are attached, some cloud storage builders may still want to put a large number of disks in each storage node, since doing so does not hurt performance. Another metric, cost per MB/s per GB of available capacity, tends to be minimized by adding more disks.
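
Here is a small sketch of how that metric behaves as disks are added, using a made-up throughput curve (with a turning point) and made-up prices; substitute your own measurements:

    # Illustrative "turning point" (Figure 4) and cost per MB/s per GB metric.
    SERVER_COST = 2500.0        # $ per storage node without disks (assumed)
    DISK_COST = 150.0           # $ per disk (assumed)
    DISK_CAPACITY_GB = 2000.0   # GB per disk (assumed)

    # Assumed cluster throughput (MB/s) vs. disks per storage node:
    # it rises until other node resources become the bottleneck, then flattens.
    throughput_by_disks = {1: 100, 2: 150, 3: 180, 4: 195, 5: 200, 6: 200, 8: 200}

    for disks, tput in throughput_by_disks.items():
        cost = SERVER_COST + disks * DISK_COST
        capacity_gb = disks * DISK_CAPACITY_GB
        metric = cost / tput / capacity_gb   # cost per MB/s per GB of capacity
        print(f"{disks} disks: {tput} MB/s, ${cost / tput:.1f}/MB/s, "
              f"{metric:.5f} $/(MB/s)/GB")

Even after the throughput flattens, the metric keeps falling in this sketch, because capacity grows linearly with each cheap disk while total cost grows only slightly.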

(6) Do not rely on two replicas of data

One more question our customers frequently ask is: can we use 2 replicas of data in the Swift cluster in order to save on the cost of storage space? Our recommendation is: no. Here is why:

Performance: it may seem that a Swift cluster which maintains 2 replicas of data will write faster than a cluster which maintains 3 replicas (and therefore has one more write stream to the storage nodes). In actuality, however, when the proxy node attempts to write N replicas, it only requires (N/2)+1 successful responses out of N to declare a successful write. That is to say, only (N/2)+1 of the N concurrent writes are synchronous; the rest can complete asynchronously, and Swift relies on its replication process to ensure that the remaining copies are eventually created.

Based on the above, in our tests comparing a “3-replica Swift cluster” and a “2-replica Swift cluster”, both generate 2 concurrent synchronous writes to the storage nodes.
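
A tiny sketch of that quorum rule as described above (using integer division, since replica counts are whole numbers):

    def synchronous_writes(num_replicas: int) -> int:
        """Replica writes that must succeed before the proxy acknowledges a PUT:
        (N // 2) + 1, as described in the text."""
        return num_replicas // 2 + 1

    for n in (2, 3):
        print(f"{n} replicas -> {synchronous_writes(n)} synchronous writes")

    # Output:
    #   2 replicas -> 2 synchronous writes
    #   3 replicas -> 2 synchronous writes
    # i.e. dropping from 3 to 2 replicas does not shorten the critical path of a write.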

Risk of data loss: We recommend using commodity off-the-shelf storage for Swift storage nodes, without even using RAID. So the replicas maintained by Swift are your defense against data loss. Let’s say a Swift cluster has 5 zones (the minimum recommended number of zones) and 3 replicas of data. With this setup, up to two zones can fail at the same time without any data loss. However, if we reduce the number of replicas from 3 to 2, the risk of data loss goes up substantially, because the data can now survive only a single zone failure.
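
To illustrate, here is a short enumeration, assuming each object’s replicas are placed in distinct zones and considering simultaneous two-zone failures in a 5-zone cluster:

    from itertools import combinations

    ZONES = {1, 2, 3, 4, 5}

    def losing_failures(replica_zones, failed_count):
        """Zone-failure combinations that destroy every replica of an object."""
        return [set(f) for f in combinations(ZONES, failed_count)
                if set(replica_zones) <= set(f)]

    # Object with 2 replicas in zones {1, 2} vs. 3 replicas in zones {1, 2, 3}:
    print(losing_failures({1, 2}, 2))     # [{1, 2}] -> one two-zone failure loses the data
    print(losing_failures({1, 2, 3}, 2))  # []       -> 3 replicas survive any two-zone failure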

Avoiding the above pitfalls will help you implement a high-performance, robust Swift Cloud that will scale to serve your cloud storage needs for years to come.

If you are thinking of putting together a storage cloud, we would love to discuss your challenges and share our observations. Please drop us a note at swift@zmanda.com.

Zmanda “googles” cloud backup!

Friday, May 11th, 2012

Today, we are thrilled to announce a new version of Zmanda Cloud Backup (ZCB) that backs up to Google Cloud Storage. It feels great to support perhaps the first mainstream cloud storage service we were introduced to (via the breakthrough Gmail and other Google services), and considering the huge promise shown by Google’s cloud services, we are sure that this version will be very useful to many of our customers.

However, a new cloud storage partner explains only part of the excitement. :) What makes this version more significant to us is its new packaging. As you may be aware, until now ZCB came only in a pay-as-you-go format, and while this option has been great for customers who value the flexibility of that model, we realized that other customers (such as government agencies) need a fixed amount to put down in their proposals and budget provisions. To put it differently, these customers would rather trade off some flexibility for certainty.

So with these customers in mind, we chose to offer this ZCB version in the following prepaid, usage-quota-based plans:

  • $75/year for 50 GB
  • $100/year for 100 GB
  • $1,000/year for 1,000 GB
  • $10,000/year for 10,000 GB

Note that the above GB values are the maximum amount of data users can store in the cloud at any point in time. The prices above are inclusive of all cloud storage costs and remain unaffected even if you wish to protect multiple (unlimited!) systems.

So what are the benefits of this new pricing option? Here are some:

  • Budget friendly: Whether you are an IT manager submitting your annual IT budget for approval or a service provider vying for a client’s business, the all-inclusive yearly plans are a great option, one you can confidently put down in writing.
  • Cost effective: If you know your requirements well, this option turns out to be dramatically more cost-effective. Here is a rough comparison of our pricing with some other well-known providers:

    Note:
    Zmanda Cloud Backup: the annual plan pricing for the Google Cloud Storage version was used.
    MozyPro: based on http://mozy.com/pro/pricing/; the “Server Pass” option was chosen since ZCB protects server applications at no extra cost.
    JungleDisk: based on https://www.jungledisk.com/business/server/pricing/; the Rackspace storage option was used since it was the only “all-inclusive” price option.

  • More payment options: In addition to credit cards, this version supports a variety of payment options (such as bank transfer, checks, etc.). So whether you are a government agency or an international firm, the mode of payment will never be an issue.
  • Simplified billing and account management: Since this aspect is handled entirely by Zmanda, managing your ZCB subscription is much easier and more user friendly. No more hassles of updating your credit card information, and no need to manage multiple accounts. When you need help, just write to a single email address (zcb@zmanda.com) or open a support case with us, and we will assist you with everything you need.
  • Partner friendly: The direct result of all the above benefits is that reselling this ZCB version is much simpler and more rewarding. If you are interested in learning more, do visit our new reseller page for details.

So with all the great benefits above, do we still expect some customers to choose our current pay-as-you-go ZCB version for Amazon S3? Of course! As we said, if your needs are currently small or unpredictable, the flexibility to scale up and down without committing to a long-term plan is a sensible option. And the 70 GB free tier and volume discount tiers offered with that ZCB version can keep your monthly costs very low.

Oh, and I almost forgot: along with this version, we have also announced the availability of the ZCB Global Dashboard, a web interface to track the usage and backup activity of multiple ZCB systems in a single place. If you have multiple ZCB systems in your environment, or you are a reseller, it will be especially useful to you.

As we work on enhancing our ZCB solution further, please keep sending us your feedback at zcb@zmanda.com. Much more is cooking with cloud backup at Zmanda. We will be back with more exciting news soon!

-Nik

Great Combination for Cloud Storage: Ubuntu 12.04 + OpenStack Swift Essex Release

Monday, May 7th, 2012

We are very excited to see the release of Ubuntu 12.04 LTS and OpenStack Essex, especially the Essex version of OpenStack Swift and the brand-new Dashboard. We have not yet seen any performance review of OpenStack Swift Essex running on Ubuntu 12.04. The official Dashboard demo introduced the System Panel and Manage Compute components, without any details on the Object Store. So we did an apples-to-apples cloud backup performance comparison between OpenStack Swift Essex on Ubuntu 12.04 LTS and OpenStack Swift 1.4.6 on Ubuntu 11.10, and also demonstrated the Object Store functionality in the OpenStack Dashboard.

In the following, we first report our results on some select hardware configurations of proxy and storage nodes on EC2. Our previous blog (Next Steps with the OpenStack Advisor) provides details about these hardware configurations; we use the following four configurations as example implementations of a “small-scale” Swift cloud:

  • 1 Large Instance based proxy node with 5 Small Instance based storage nodes
  • 1 XL Instance based proxy node with 5 Small Instance based storage nodes
  • 1 CPU XL Instance based proxy node with 5 Small Instance based storage nodes
  • 1 Quad Instance based proxy node with 5 Medium Instance based storage nodes

The Large, XL, CPU XL and Quad instances cover a wide range of CPU and memory configurations. For network I/O, the Large, XL and CPU XL instances are provisioned with Gigabit Ethernet (100-120 MB/s), while the Quad instance offers 10 Gigabit Ethernet (~1.2 GB/s) connectivity.

Again, we use Amanda Enterprise as our application to back up and recover a 10 GB data file to/from the Swift cloud, in order to test its write and read throughput respectively. We ensure that one Amanda Enterprise server can fully load the Swift cloud in all cases.

The two systems involved in the comparison are: (1) Ubuntu 11.10 + OpenStack Swift 1.4.6; (2) Ubuntu 12.04 LTS + OpenStack Swift Essex (configuration parameters of the OS, OpenStack and Amanda Enterprise are identical). In the following, we use 11.10+1.4.6 and 12.04+Essex as labels for the two systems.

(1) Proxy node runs on the Large instance and 5 storage nodes run on the Small instances. (Note that the throughput values on y-axis are not plotted from zero)

(2) Proxy node runs on the XL instance and 5 storage nodes run on the Small instances.

(3) Proxy node runs on the CPU XL instance and 5 storage nodes run on the Small instances.

(4) Proxy node runs on the Quad instance and 5 storage nodes run on the Medium instances.

From the above comparisons, we found that 12.04+Essex performs better than 11.10+1.4.6 in terms of backup throughput, with the performance gap ranging from 2% to 20% and averaging 9.7%. For recovery throughput, the average speedup over 11.10+1.4.6 is not as significant as for backup.

We did not dig into whether Ubuntu 12.04 LTS or OpenStack Essex is responsible for this modest throughput improvement, but we can see that the overall combination performs consistently better. Based on our initial testing, both for its performance improvements and its feature improvements, we encourage anyone running OpenStack Swift on Ubuntu to upgrade to the latest released versions and take advantage of the new updates. Five years of support for 12.04 LTS is also a great assurance for maximizing the ROI of your cloud storage implementation.

Next, we demonstrate the functionality of Object Store within the OpenStack Dashboard.

After we log into the Dashboard, click the “Project” tab on the left, and then click “Containers” under “Object Store”, we see the screen below:

We can create a container by clicking the “Create Container” button, which brings up the following screen:

After creating a container, we can click the container name and browse the objects associated with that container. Initially, a newly-created container is empty.

We can upload an object to the container by clicking the “Upload Object” button:

Meanwhile, we can delete an object from the container by choosing “Delete Object” from its drop-down list in the “Actions” column.

Also, we can delete a container by choosing “Delete Container” from its drop-down list in the “Actions” column.

Here we have demonstrated the core Object Store functionality in the OpenStack Dashboard. As the above screenshots show, the Dashboard provides a very neat and friendly user interface for managing containers and objects, which saves a lot of time otherwise spent looking up command-line syntax for basic operations.
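
For scripted workflows, the same operations can also be driven from Python. Below is a minimal sketch using the python-swiftclient library; the auth URL, credentials and container/object names are placeholders, and the exact auth settings depend on your deployment:

    from swiftclient import client as swift

    # Placeholder Keystone endpoint and credentials; adjust for your cloud.
    conn = swift.Connection(
        authurl="http://keystone.example.com:5000/v2.0/",
        user="demo",
        key="secret",
        tenant_name="demo",
        auth_version="2.0",
    )

    conn.put_container("my-container")                                      # "Create Container"
    conn.put_object("my-container", "hello.txt", contents=b"hello swift")   # "Upload Object"

    headers, objects = conn.get_container("my-container")                   # browse the container
    print([obj["name"] for obj in objects])

    conn.delete_object("my-container", "hello.txt")                         # "Delete Object"
    conn.delete_container("my-container")                                   # "Delete Container"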

Congratulations, Ubuntu and OpenStack teams!  The Ubuntu 12.04 + OpenStack Swift Essex Release combination is a great contribution to Open Source and Cloud Storage communities!

Building an OpenStack Swift Cloud: Mapping EC2 to Physical hardware

Friday, May 4th, 2012

As we mentioned in an earlier blog, it may seem ironic that we are using a public compute cloud to design an optimized private storage cloud. But the ready availability of diverse types of EC2-based VMs makes AWS a great platform for running the Sampling and Profiling phases of the OpenStack Swift Advisor.

After an optimized Swift Cloud is profiled and designed on virtualized hardware (for example, EC2 instances in our lab), cloud builders will eventually want to build it on physical hardware. The question is: how do you preserve the cost-effectiveness and guaranteed throughput of the Swift Cloud on the new physical hardware, with new data center parameters?

A straightforward answer is to keep the same hardware and software resources in the new hosts. But the challenge is that EC2 (and this challenge remains if other cloud compute platforms, e.g. OpenStack Compute, were used for profiling) provisions the CPU resource for each type of instance in terms of “EC2 Compute Units”; for example, a Large instance has 4 EC2 Compute Units and a Quad instance has 33.5 EC2 Compute Units. How do you translate 33.5 EC2 Compute Units into GHz when you purchase physical CPUs for your servers? Another ambiguous resource definition in EC2 is network bandwidth. EC2 has 4 levels of network bandwidth: Low, Moderate, High and Very High; for example, EC2 allocates Low bandwidth to the Micro instance and Moderate bandwidth to the Small instance. But what does “Low bandwidth” mean in terms of MB/s? The EC2 specs provide no answers.

Here we propose a method to translate these ambiguous resource definitions (e.g. EC2 Compute Units) into standard specifications (e.g. GHz) that can be referenced when choosing physical hardware. We focus on 3 types of hardware resources that need translation: CPU, disk and network bandwidth (memory, as discussed below, maps directly).

CPU: We first choose a CPU benchmark (e.g. PassMark) and run it on a given type of EC2 instance to get a benchmark score. Then we look up the published scores for that benchmark to find which physical CPUs achieve a similar score. To be safe, we can choose a physical CPU with a slightly higher score, to ensure it performs no worse than the virtualized CPU in the EC2 instance.
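
A sketch of this selection step is shown below; the instance score is whatever you measure by running the benchmark yourself, and the candidate CPU scores are illustrative placeholders rather than published PassMark numbers:

    HEADROOM = 1.10  # provision ~10% extra, as recommended later in this post

    candidate_cpus = {          # hypothetical {CPU model: benchmark score}
        "CPU model A": 7000,
        "CPU model B": 8200,
        "CPU model C": 9100,
    }

    def matching_cpus(instance_score, cpus, headroom=HEADROOM):
        """Physical CPUs whose score is at least `headroom` times the score
        measured on the virtualized instance."""
        target = instance_score * headroom
        return sorted((model for model, score in cpus.items() if score >= target),
                      key=lambda model: cpus[model])

    # 7295 is the c1.xlarge score from the example at the end of this post.
    print(matching_cpus(7295, candidate_cpus))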

Disk: We roughly assume that the I/O patterns on storage nodes are close to sequential, so we can use the “dd” Linux command to benchmark the sequential read and write bandwidth of a given type of EC2 instance. Based on the resulting bandwidth figures in MB/s, cloud builders can buy physical storage drives with matching I/O bandwidth.

Network: To measure the maximum network bandwidth of a given EC2 instance within the Swift Cloud, we set up another EC2 instance with very high network bandwidth, e.g. an EC2 Quad instance. First, we install Apache and create a test file (whose size depends on the memory size, as discussed below) on both EC2 instances. Then, to benchmark the maximum incoming network bandwidth of the instance under test, we issue a wget command on that instance to download the test file hosted on the Quad instance. wget reports the average network bandwidth after the download finishes, and we use that as the maximum incoming bandwidth. To test the maximum outgoing network bandwidth, we run the test in the reverse direction: the Quad instance downloads the test file from the EC2 instance we want to benchmark. The reason we choose wget (instead of, say, scp) is that wget involves less CPU overhead. Note that, to remove interference from disk I/O, we ensure the test file fits in the memory of the EC2 instance so that no read I/O is needed. Also, we always execute wget with “-O /dev/null” to bypass write I/O. Once we have the maximum incoming and outgoing network bandwidths, we can choose the right Ethernet components to provision the storage and proxy nodes.
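
Here is a minimal sketch that covers both the dd-based disk measurement and the wget-based network measurement above, simply by timing each command; the mount point, test file URL and sizes are placeholders:

    import subprocess
    import time

    def measure_mb_per_s(cmd, nbytes):
        """Run `cmd` and report throughput as nbytes / elapsed seconds, in MB/s."""
        start = time.time()
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        return nbytes / (time.time() - start) / (1024 * 1024)

    ONE_GB = 1024 * 1024 * 1024

    # Sequential write: conv=fdatasync makes dd wait for data to reach the disk.
    write_mbs = measure_mb_per_s(
        ["dd", "if=/dev/zero", "of=/mnt/data/testfile", "bs=1M", "count=1024",
         "conv=fdatasync"], ONE_GB)

    # Incoming network bandwidth: download a 1 GB test file hosted on the
    # high-bandwidth helper instance; -O /dev/null bypasses local write I/O.
    net_mbs = measure_mb_per_s(
        ["wget", "-O", "/dev/null", "http://quad-instance.example.com/testfile"],
        ONE_GB)

    print(f"sequential write: {write_mbs:.1f} MB/s, incoming network: {net_mbs:.1f} MB/s")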

Memory: As for the virtualized memory of an EC2 instance, if 10 GB of memory is allocated to the instance, it is straightforward to provision 10 GB of memory in the physical server. So we feel that no translation is needed for virtualized memory.

Other cloud management platforms may offer several types of instances (e.g. large, medium, small) based on their own terminology. We can use similar methods to benchmark each type of instance they offer and find the matching physical hardware.

To fully preserve the throughput of the Swift Cloud when mapping from EC2 instances, we advise cloud builders to provision the physical hardware with at least 10% better specs than those deduced from the above translation.

Here, we show an example of how to map an EC2 c1.xlarge instance to physical hardware:

CPU: We ran the PassMark CPU benchmark on c1.xlarge and obtained a score of 7295. Provisioning 10% more resources when translating from virtualized to physical hardware, some choices for the physical CPU include the Intel Xeon E3-1245 @ 3.30 GHz, Intel Xeon L5640 @ 2.27 GHz and Intel Xeon E5649 @ 2.53 GHz.

Memory: Since the c1.xlarge instance is allocated 7 GB of memory, we could choose 8 GB of memory (4 GB x 2 or 2 GB x 4) for the physical machine.

Disk: Using the “dd” command, we found that the c1.xlarge instance delivers 100-120 MB/s for sequential reads and 70-80 MB/s for sequential writes, which matches a typical 7,200 RPM drive. Therefore, most HDDs on the market are safe to use as data disks in the physical machine.

Network: The c1.xlarge instance has around 100 MB/s of network bandwidth for both incoming and outgoing traffic, which corresponds to a 1 Gigabit Ethernet interface. So a typical 1 Gigabit Ethernet NIC should be enough for networking in the physical machine.

If you are thinking of putting together a storage cloud service, we would love to discuss your challenges and share our observations. Please drop us a note at swift@zmanda.com.