OpenStack has been in production at CERN for more than a year. One of the people who have been key to implementing the OpenStack infrastructure is Tim Bell. He is responsible for the CERN IT Operating Systems and Infrastructure group, which provides a set of services to CERN users ranging from email, web, and operating systems to the Infrastructure-as-a-Service cloud based on OpenStack.
We had a chance to interview Bell in advance of the OpenStack Summit Paris 2014, where he will deliver two talks. The first session is about cloud federation, and the second is about multi-cell OpenStack.
Bell took us behind the scenes at CERN to share how they have scaled OpenStack to over 3,000 servers to meet their users' research needs. They have been able to grow their cloud without increasing support staff. He also gave us a few tips on how to survive the OpenStack Summit and make the most of the time with the community.
Bell is also an elected member of the OpenStack board of directors and part of the OpenStack user committee, which supports the feedback loop from end users and deployers to the development community, including running surveys on OpenStack usage and organizing operator meetups.
Let's go behind the scenes at CERN and see how they scale OpenStack and what other research facilities could learn from their experience.
How is CERN using OpenStack? What kinds of workloads are you running on your OpenStack nodes?
At CERN, the European Organization for Nuclear Research, physicists and engineers are probing the fundamental structure of the universe. In order to do this, we use some of the world's largest and most complex scientific instruments, such as the Large Hadron Collider, a 27 km ring 100 m underground on the border between France and Switzerland.
The experiments around the ring produce up to 27 PB per year, which is recorded in CERN's two data centres in Geneva and Budapest. Physicists sift through this experimental data and compare it to simulations in order to understand the nature of matter. This results in discoveries such as the Higgs boson in 2012 from the ATLAS and CMS experiments at CERN.
OpenStack provides the infrastructure cloud that supplies much of the compute resources for this processing. These virtual machines can be used for running compute-intensive batch workloads and production services, as well as self-service development and test environments.
What would you say to other research facilities hesitant about using OpenStack to win them over?
CERN has been running OpenStack in production since July 2013, supporting over 1,000 active users. Providing a self-service cloud has allowed resources to be made available to the physicists in the time it takes to get a coffee rather than waiting weeks for physical hardware allocation. The combination of a simple web interface and powerful command line and APIs supports use cases ranging from a user wanting a new virtual machine for testing the latest versions of Linux to an experiment scheduling thousands of virtual machines for its production analysis workloads.
The initial production deployment was based on the OpenStack Grizzly release, and we have since upgraded to Havana and Icehouse at six-month intervals. With OpenStack’s distributed components, we stage the upgrade procedure: one component is upgraded, we allow a few weeks to confirm there are no issues, and then we progress to the next one.
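The staged approach described above can be sketched as a simple schedule. This is only an illustration, not CERN's actual tooling: the component order, the start date, and the three-week confirmation window are all assumptions chosen for the example.

```python
from datetime import date, timedelta


def staged_upgrade_plan(components, start, soak_days=21):
    """Plan one component at a time, leaving a confirmation window after each.

    Returns a list of (component, upgrade_date, confirm_by) tuples. The next
    component is only scheduled once the previous window has elapsed.
    """
    plan = []
    current = start
    for name in components:
        confirm_by = current + timedelta(days=soak_days)
        plan.append((name, current, confirm_by))
        current = confirm_by  # next upgrade starts after the window closes
    return plan


# Hypothetical component order and start date, for illustration only.
components = ["keystone", "glance", "cinder", "nova"]
plan = staged_upgrade_plan(components, date(2014, 6, 2))

for name, upgrade_on, confirm_by in plan:
    print(f"{name}: upgrade on {upgrade_on}, confirm no issues by {confirm_by}")
```

In practice the window would be extended whenever an issue is found, which is the point of upgrading one component at a time.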
At CERN, we deploy OpenStack based on an open source distribution from the RDO community. However, there are many alternative approaches, depending on the available skills and resources and on the need for the latest features or commercial support. Directories such as the OpenStack Marketplace provide an easy way for organisations to select the most appropriate strategy.
The OpenStack community is very vibrant, with lots of opportunities to attend local user groups and online meetups and to interact through mailing lists and self-help sites. For a new site starting out, these can provide valuable advice and an opportunity to share experiences. The OpenStack Summits, held twice a year, provide a wide range of speakers at many different levels, from OpenStack 101 through to deep-dive user stories.
Can you give anyone new to OpenStack a sense of how the infrastructure can scale? What has impressed you about OpenStack's technical ability to scale in the data center?
The CERN IT cloud is currently around 70,000 cores on more than 3,000 servers across the two data centres. We use the OpenStack cells feature to combine building blocks of between 200 and 1,000 servers to provide these resources as a single cloud for the end user. At the summits, we meet with other organisations who are running at scale to share experiences and review any potential bottlenecks with the development teams.
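As a rough back-of-the-envelope check on those figures, the short calculation below works out the average core count per server and how many cells the stated block sizes would imply. The totals are the rounded numbers quoted above, so the results are approximate, and the variable names are ours.

```python
import math

# Rounded figures quoted in the interview.
total_cores = 70_000
total_servers = 3_000
cell_min_servers, cell_max_servers = 200, 1_000

# Average number of cores per server.
cores_per_server = total_cores / total_servers

# Fewest cells: every cell at the largest block size.
fewest_cells = math.ceil(total_servers / cell_max_servers)

# Most cells: every cell at the smallest block size.
most_cells = total_servers // cell_min_servers

print(f"about {cores_per_server:.1f} cores per server")
print(f"between {fewest_cells} and {most_cells} cells")
```

With these inputs the cloud averages a little over 23 cores per server and would be made up of somewhere between 3 and 15 cells.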
Of particular interest to CERN are approaches to scaling the cloud without a corresponding increase in support staff numbers. The StackForge ecosystem packages help here: communities can form around deployment tools such as Puppet or around standard monitoring packages, so that CERN can contribute its experience and benefit from that of others.
Besides OpenStack, are there any other open source cloud projects that stand out to you?
CERN released the World Wide Web as open source in the 1990s, and with the Worldwide LHC Computing Grid of hundreds of collaborating sites, we have used open source software on a large scale for decades. In addition to open source code availability, we look for strong sustainable communities, open design, and an opportunity for CERN to contribute. Around OpenStack, we have found several projects which are part of our production cloud solution. The Puppet configurations for OpenStack ensure that all of our hypervisors, whether KVM or Hyper-V, are configured in a consistent way, and these configurations can be dynamically updated as we evolve the cloud. Infrastructure monitoring and management software such as Elasticsearch, Kibana, Jenkins, Rundeck, and Foreman is all integrated into the new tool chain.
What changes have you seen in the OpenStack community over the past several months?
CERN first started looking at OpenStack in 2011, when we attended the Essex OpenStack summit in Boston along with around 600 other people. The last summit in Atlanta drew over 4,000 attendees, and this growth is mirrored by a corresponding expansion of OpenStack and its ecosystem. The most significant growth at recent summits has been the increasing number of deployers and end users getting involved. The other aspect is the wide range of production use across industries, from entertainment and telecoms to banking and research, which helps create a flexible solution to cover these varied requirements.
Any tips or tricks to surviving OpenStack Summit?
The online schedules make it easy to plan which talks to attend. With many parallel tracks, it is not always possible to attend every session you want, but the videos are posted online afterwards so you can catch any talks you may have missed.
If you are looking to contribute code, there is upstream training for newcomers, co-located with the summit, to help you get on board as smoothly as possible. Birds-of-a-feather sessions and lightning talks help you discover the details of the many projects around OpenStack that can help with deploying and running the cloud in production.
However, it is important to save time to meet the community. While much of the communication in OpenStack is electronic because those involved are distributed worldwide, summits give a great chance to browse the various ecosystem solutions and discuss them in more detail face-to-face.