Storage policies: Coming to an OpenStack Swift cluster near you

OpenStack Object Storage (code-named Swift) ships improvements and new capabilities on a fairly frequent release schedule, but the integrated releases naturally carry the most weight for any OpenStack project. The Havana release was a big one for OpenStack Swift, adding support for global clusters, yet the upcoming Icehouse release will be the biggest so far for the project.

Since the project was open-sourced over three years ago, the community of contributors has grown significantly. Every new release is bigger than any prior release given the vibrancy of developer participation. Recent contributions have come from companies including HP, IBM, Intel, Red Hat, Rackspace, and SwiftStack. Icehouse is slated to bring a major set of new features along with many improvements to replication and metadata. The standout new capability, though, is storage policies: a new way of configuring OpenStack Object Storage clusters so that deployers can match their available storage precisely to their use cases.

Storage policies

Why are "storage policies" so significant? The abstraction of storage policies will allow deployers to optimize for many more use cases than today’s single policy allows, and provide for more flexibility in the hardware used under OpenStack Swift. With the current Havana release, a deployer can support replicated content across a wide geographic area. Concur shared a great example of this at the last OpenStack Summit in Hong Kong. With a single policy, data is spread across the entire global deployment.

Coming in the Icehouse release, storage policies allow a deployer to have more than one policy and manage important configuration choices, a flexibility that opens up many more use cases:

First, given the global set of hardware available in a single OpenStack Swift cluster, there will be a new choice of which subset of that hardware stores specific data. This can be done by geography (e.g. United States-East vs. European Union vs. Asia Pacific vs. global) or by hardware properties (e.g. SATA vs. SSDs).
Second, given the subset of hardware being used to store the data, there will be a new choice of how to encode the data across that set of hardware. For example, you might have 2-replica, 3-replica, or future erasure code policies (looking forward to the Juno release). Combining this with the hardware possibilities, you get policies such as United States-East reduced redundancy, global triple replicas, and European Union erasure coded.
Third, given the subset of hardware and how to store the data across that hardware, there will be new control over how Swift talks to a particular storage volume. This may be optimized local file systems, or Gluster volumes, or even non-POSIX volumes like Seagate's new Kinetic drives.
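
To make the three choices concrete, here is a minimal sketch that models a policy as a small data structure. It is an illustration only, not Swift's implementation or configuration format: the class names, field names, device names, and policy names are all hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Device:
    """One storage device in the cluster (all fields are illustrative)."""
    name: str
    region: str   # e.g. "us-east", "eu", "apac"
    media: str    # e.g. "sata", "ssd"


@dataclass(frozen=True)
class StoragePolicy:
    """Hypothetical model of the three choices a policy bundles together."""
    name: str
    regions: FrozenSet[str]   # choice 1: where (empty set = every region)
    media: FrozenSet[str]     # choice 1: on what (empty set = any media)
    encoding: str             # choice 2: "replication" or "erasure_code"
    copies: int               # choice 2: replica count (unused for erasure codes)
    backend: str              # choice 3: e.g. "local_fs", "gluster", "kinetic"

    def eligible(self, devices: List[Device]) -> List[Device]:
        """Return the subset of devices this policy may place data on."""
        return [
            d for d in devices
            if (not self.regions or d.region in self.regions)
            and (not self.media or d.media in self.media)
        ]


# A made-up four-device cluster spanning three regions.
DEVICES = [
    Device("d1", "us-east", "sata"),
    Device("d2", "us-east", "ssd"),
    Device("d3", "eu", "sata"),
    Device("d4", "apac", "sata"),
]

# The three example policies named in the text, with assumed parameters.
POLICIES = [
    StoragePolicy("us-east-reduced", frozenset({"us-east"}), frozenset(),
                  "replication", 2, "local_fs"),
    StoragePolicy("global-triple", frozenset(), frozenset(),
                  "replication", 3, "local_fs"),
    StoragePolicy("eu-erasure-coded", frozenset({"eu"}), frozenset(),
                  "erasure_code", 0, "local_fs"),
]

for policy in POLICIES:
    names = [d.name for d in policy.eligible(DEVICES)]
    print(f"{policy.name}: {policy.encoding} on {names}")
```

Each policy answers the same three questions independently, which is why several of them can share one cluster.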

OpenStack Swift today lets a deployer choose the replication factor for the cluster (e.g. 3 replicas, 4 replicas) and modify it as needed, but this is a cluster-wide setting. Storage policies are fundamentally new in that they allow replicated and non-replicated storage to live alongside one another. With the Icehouse release, capabilities like storage policies will widen adoption of OpenStack Swift among both service providers and enterprises.
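
One way this could surface to users is as a per-container choice. The sketch below only describes such a request and does not send it. It assumes the policy is named in an X-Storage-Policy header when the container is created; the endpoint, token, container names, and policy names are hypothetical.

```python
def container_put_request(endpoint: str, container: str,
                          policy: str, token: str) -> dict:
    """Describe (but do not send) a request that creates a container
    under a named storage policy. The header carrying the policy name
    is an assumption of this sketch."""
    return {
        "method": "PUT",
        "url": f"{endpoint.rstrip('/')}/{container}",
        "headers": {
            "X-Auth-Token": token,
            "X-Storage-Policy": policy,
        },
    }


# Two containers in the same account, each pinned to a different policy.
ENDPOINT = "https://swift.example.com/v1/AUTH_demo"   # hypothetical endpoint
backups = container_put_request(ENDPOINT, "backups", "eu-erasure-coded", "TOKEN")
media = container_put_request(ENDPOINT, "media", "global-triple", "TOKEN")

for request in (backups, media):
    print(request["method"], request["url"],
          request["headers"]["X-Storage-Policy"])
```

Under this assumption the replication choice stops being a property of the whole cluster and becomes a property of each container.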

As mentioned before, storage policies are an enabling technology for future support of erasure codes in OpenStack Swift. The development community has been laying the groundwork for storage policies since the second half of 2013. Work continues during the current major release cycle, with storage policies targeted for the OpenStack Icehouse integrated release in the second quarter of 2014. Some erasure code development will be included as well, and erasure codes will be released once completed and tested, targeting mid-year 2014, before the integrated Juno release in the fourth quarter of 2014.

Erasure codes

Why are storage policies so important for erasure codes in Swift? Erasure codes, which can be thought of as RAID for object storage, become a policy choice. For some data, such as backups, erasure codes can be a better choice than replicas for protection. In a Swift cluster, the structure of storage policies gives deployers the ability to choose how they want to encode data for a set of hardware and/or for a region. The refactoring in the codebase to allow erasure code support also results in the ability to define arbitrary storage policies.
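
The appeal for data like backups is mostly arithmetic. The sketch below compares the raw capacity that replication and an erasure code need for the same logical data. The 10 data + 4 parity scheme is an illustrative assumption, not a parameter from the Swift project, and the failure-tolerance figure assumes a code (such as Reed-Solomon) that can rebuild an object from any 10 of its 14 fragments.

```python
from fractions import Fraction


def replication_overhead(replicas: int) -> Fraction:
    """Raw bytes stored per logical byte when keeping full copies."""
    return Fraction(replicas)


def erasure_code_overhead(data_fragments: int, parity_fragments: int) -> Fraction:
    """Raw bytes stored per logical byte when an object is split into
    data fragments and extended with parity fragments."""
    return Fraction(data_fragments + parity_fragments, data_fragments)


def raw_capacity(logical_tb: int, overhead: Fraction) -> Fraction:
    """Raw capacity (TB) needed to hold logical_tb of user data."""
    return logical_tb * overhead


LOGICAL_TB = 100                                  # assumed size of a backup set
triple = replication_overhead(3)                  # survives loss of 2 copies
ec_10_4 = erasure_code_overhead(10, 4)            # survives loss of 4 fragments

print("3 replicas:", float(raw_capacity(LOGICAL_TB, triple)), "TB raw")
print("10+4 erasure code:", float(raw_capacity(LOGICAL_TB, ec_10_4)), "TB raw")
```

The trade-off is that an erasure-coded read or repair has to gather and decode several fragments, which is one reason replicas can remain the better fit for frequently accessed data while erasure codes suit colder data such as backups.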

Learn more

In the coming months, in addition to new features, we will also be highlighting use cases to show how different companies are using OpenStack Swift to create their own private cloud storage, or build cloud storage products for their customers.

We also host frequent workshops around the United States that are free to attend, where we go over technical training and give you hands-on experience with the latest capabilities. If you’re interested in getting involved with the project, check out the OpenStack Swift wiki.