Archive for June, 2012

Sumo nodes for OpenStack Swift: Mixing storage and proxy services for better performance and lower TCO

Monday, June 25th, 2012

The ultimate goal of cloud operators is to provide high-performance and robust computing resources for the end users while minimizing the build and operational costs. Specifically OpenStack Swift operators want to build storage clouds with higher throughput bandwidth but with lower initial hardware purchase cost and lower ongoing IT-related costs (e.g. IT admin, power, cooling etc.).

In this blog, we present a novel way to achieve the above goal by using nodes in a Swift cloud which mix storage and proxy services. This is contrary to the existing common practice in Swift deployments to exclusively run proxy and storage services on separate nodes, where the proxy and authentication services are suggested to run on hardware with high-end CPU and networking resources, and the objects stored on storage nodes without the need for similar compute and networking resources as the proxy nodes. However, our proposal is to provide a new alternative for building cost-effective Swift clouds with higher bandwidth, but without compromising the reliability. This method can be easily adopted by cloud operators who have already built a Swift instance on some existing hardware or who are considering to buy new hardware to build their Swift clouds.

Our idea is based on following principle: when uploading an object to Swift with M replicas, (M/2)+1 out of M writes need to be successful before the uploader/client is acknowledged that the upload is successful. For example, in a default Swift environment which enforces 3 data replicas, when a client sends an object to Swift, once 2 writes complete successfully, Swift will declare a success message to the client – the remaining 1 write can be a delayed write and Swift will rely on the replication process to ensure that the third replica will be successfully created.

Based on the above observation, our idea is to speed up the 2 successful writes so that Swift can declare a success as soon as possible in order to increase the overall bandwidth. The third write is allowed to be finished at a slower pace and its goal is to guarantee the data availability when 2 zones fail. To speed up the 2 successful writes, we propose to run storage, proxy and authentication services on high-end nodes.

We call these mixed nodes Sumo Nodes: Nodes in a Swift cloud which provide proxy and authentication services as well as provide storage services.

The reason why we want to mix the storage and proxy services on these high-end nodes is based on the following observations: Proxy nodes are usually provisioned with high-end hardware, including 10Gbps network, powerful multi-core CPUs and large amount of memory. In many scenarios these nodes are over-provisioned (as we discussed in our last “pitfall” blog). Thus, we want to take full advantage of their resources by consolidating both proxy and storage services on one sumo node, with the goal to complete the 2 successful writes faster.

Sumo nodes will typically be interconnected via 10Gbps network. Writes to sumo nodes will be done faster because they are either local writes (proxy server sends data to storage service within the same node) or the write is done over a 10Gbps network instead of transferring to a storage node connected via a 1Gbps network. So, if two out of three writes for an upload are routed to the sumo nodes, the Swift cloud will return success much faster than the case when an acknowledgement from a traditional storage node is needed.

In the following discussion, we consider a production Swift cloud with five zones, three object replicas and two nodes with proxy services. (Five zones provide redundancy and fault isolation to ensure that three-replica requirement for objects is maintained in the cloud even when an entire zone fails and the repair for that zone takes significant time.)

Key considerations when designing a Swift cloud based on sumo nodes

How should I setup my zones with sumo nodes and storage nodes?

Each sumo node needs to belong to a separate zone (i.e. one zone should not have two sumo nodes). In our example Swift cloud the two sumo nodes represent two zones and rest of the three zones are distributed on the three storage nodes.

Can you guarantee that all “2 successful writes” will be done on the sumo nodes?

No. Upon each upload, every zone will have a chance to take one copy of the data. Based on the setup of two zones on the two sumo nodes and three zones on the storage nodes, it is even possible that all 3 writes will go to the three zones on the storage nodes.

However, we can increase the probability of having “2 successful writes” done on the sumo nodes. One straightforward but potentially costly way is to buy and use more sumo nodes. But, another way is to optimize the “weight” parameter associated with the storage devices on sumo nodes. (The weight parameter will decide how much storage space is used on a storage device, e.g. a device with 200 weight will take 2X amount of data than a device with 100 weight.) By giving higher weight value to the sumo nodes as compared to the storage nodes, more data will be written into the sumo nodes, thereby increasing the probability of “2 successful writes” being done on the sumo nodes.

What is the impact of using higher weight value on the sumo nodes?

As discussed above, when using a higher weight value on the sumo nodes, the probability of “2 successful writes” done at those nodes will be higher, so the bandwidth of the Swift should be better. However, using higher weight value will also speed up the device/disk space usage on the sumo nodes causing these nodes to reach their space limit quickly. Once the space limit on sumo node is reached, all remaining uploads will go the storage nodes. So, the performance of the sumo-node based Swift cloud would be downgraded to a typical Swift cloud (or even worse if the storage nodes being used are wimpy). However, the bottom line is that the storage nodes should have enough storage space to guarantee the data can be completely replicated on them and be able to accommodate failure of a sumo node. For example, assuming each sumo node has 44 available HDD bays and each HDD bay is populated with 2 TB HDD. Once the available 88 TB of space is used up on the sumo nodes, the users of the Swift cloud will stop enjoying the performance improvements because of sumo nodes (the “sumo effect”) even though there might be additional usable storage capacity on the cloud.

The disk usage of sumo nodes can be easily controlled by the weight value. Overall, the right value of the weight parameter will be determined by the individual cloud operators depending on (1) how much total data they want to save in the sumo-node based Swift cloud and (2) how much is the raw space limit on sumo nodes.

In a sumo-based Swift cloud, the function of storage nodes is to help guarantee the data availability. We expect the storage nodes to be wimpy nodes in this configuration. However, the storage nodes should have enough total disk space to handle the capacity requirements of the cloud. At a minimum, storage nodes should have the combined capacity of at least one sumo node.

While performance will increase when using sumo nodes, but what is the impact on the total cost of ownership of the Swift cloud?

Since the sumo nodes provide functionality and capacity of multiple storage nodes, cloud builders will save on the cost of purchasing and maintaining some storage nodes.

Using sumo nodes will not compromise any data availability because we still keep the same number of zone and data replicas at all times as compared to the traditional setup of Swift. However, we should pay attention to the robustness and health status of the sumo nodes. Failure of a sumo node is much more impactful than failure of a traditional proxy node or a storage node. If a sumo node fails, the cloud will lose one of the proxy processes as well as will need to replicate all of the objects stored on the failed sumo node to remaining nodes in the cloud. Therefore, we recommend investing more on the robustness features on the sumo nodes, e.g. dual fans, redundant power supply etc.

To choose the right hardware for sumo nodes and storage nodes, based on your cost, capacity and performance considerations, we recommend using the techniques and process of the Swift Advisor.

What use cases are best suited for sumo-based Swift configurations?

Sumo-based Swift configurations can be best applied in the following use cases: (1) high performance is a key requirement from the Swift cloud, but a moderate amount of usable capacity is required (e.g. less than 100TB) or (2) the total data stored in the Swift cloud will not change dramatically over time, so a cloud builder can plan ahead and pick the right sized sumo nodes for their cloud.

Sumo Nodes and Wimpy Storage Nodes in action

To validate the effectiveness of the sumo based Swift cloud, we ran extensive benchmarks in our labs. We looked at the impact of using Sumo nodes on the observed bandwidth and net usable storage space of the Swift cloud (we assume that cloud operators will not use the additional capacity, if any available, in their Swift cloud once the “sumo effect” wears off).

As before, we use Amanda Enterprise as our application to backup and recover a 15GB data file to/from the Swift cloud to test its write and read bandwidth respectively. We ensure that one Amanda Enterprise server can fully load the Swift cloud in all cases.

Our Swift cloud is built on EC2 compute instances. We use Cluster Compute Quadruple Extra Large Instance (Quad Instance) for the sumo nodes and use Small or Medium Instance for the storage nodes.

To build a minimal production-ready sumo-node based Swift cloud, we use 1 load balancer (pound-based load balancer, running on a Quad Instance), 2 sumo nodes and 3 storage nodes. We set 2 zones on the 2 sumo nodes and the remaining 3 zones are on the 3 storage nodes. To do an apples-to-apples comparison, we also setup a Swift cloud with traditional setup (storage services running exclusively on designated storage nodes) using the same EC2 instances on the load balancer and proxy nodes (Quad instance), and storage services running on five storage nodes (Small or Medium Instance). All storage nodes in traditional setup have the same and default values of the weight parameter.

We use Ubuntu 12.04 as base OS and install the Essex version of Keystone and Swift across all nodes.

We first look at the results of upload workload with different weight values on sumo nodes. In sumo-based Swift, we set the weight values on the storage nodes to be always 100. We use “sumo (w:X)” in Figure 1 and 2 to denote that the value of weight parameter on sumo node is X. E.g. sumo (w:300) represents a Swift cloud with sumo nodes with weight value set at 300.

Figure 1 shows the trade-offs between the observed upload bandwidth and projected storage space of a sumo-node based Swift cloud before the space limit of sumo nodes is reached – this is the point when the “sumo effect” of performance improvement tapers off. The projected storage space is calculated by assuming 44 HDD bays on each node and each HDD bay is populated with a 2TB HDD. The readers can have their own calculations on the storage space depending on their specific situation.

As we can see from the above Figure 1, with either Small or Medium Instances being used for storage nodes, the sumo-based Swift cloud with weight 300 always provides the highest upload bandwidth and lowest usable storage space. On the other hand, the traditional setup and “sumo (w:100)” both provide the largest usable storage space – but even in this case the sumo-based Swift cloud has higher upload bandwidth as it puts some objects on the faster sumo nodes.

In Figure 1, we also observe that: as the weight increases on the sumo nodes, the bandwidth gap between using Small Instances for the storage nodes and using Medium Instances for the storage nodes shrinks. For example, the bandwidth gap at “sumo (w:200)” is about 24 MB/s, but the it is reduced to 13 MB/s at “sumo (w:300)”. This implies that in a sumo-based Swift cloud, you may consider using wimpy storage nodes, as it may be more cost-effective as long as your performance considerations are met. The reason is that the effective bandwidth of a sumo-based Swift cloud is mostly determined by the sumo nodes when the weight of the sumo nodes is set to large.

For completeness, we also show the results of download workload in Figure 2. Similar to the Figure 1, the sumo-based Swift cloud delivers the highest download bandwidth and provides the lowest storage space when using weight 300 for sumo nodes. On the other hand, when using weight 100, it has the same usable capacity as the traditional setup, but still has higher download bandwidth as compared to the observed download bandwidth with traditional setup.

From the above experiments, we observe interesting trade-offs between bandwidth achieved and usable storage space until the point the Swift cloud is enjoying the “sumo effect”. Overall, the sumo-based Swift cloud is able to (1) Significantly boost the overall bandwidth with relatively straightforward modifications on the traditional setup and (2) Reduce the TCO by using smaller number of overall nodes, and requiring wimpy storage nodes, while not sacrificing the data availability characteristics of the traditional setup.

Above analysis and numbers should give Swift builders a new option to deploy and optimize their Swift clouds and seed other ideas on how to allocate various Swift services across nodes and choose hardware for those nodes.

If you are thinking of putting together a storage cloud, we would love to discuss your challenges and share our observations. Please drop us a note at