Julien Danjou is a free software hacker almost all of the time. At his day job, he hacks on OpenStack for eNovance. And, in his free time, he hacks on free software projects like Debian, Hy, and awesome. Julien has also written The Hacker's Guide to Python and given talks on OpenStack and the Ceilometer project, among other things.
Prior to his talk at OpenStack Summit 2014 in Paris, we interviewed him about his current work and got some great insight into what is going on in Ceilometer, the open source telemetry project for OpenStack.
Is there anything that you'd particularly like to see implemented in the Kilo release?
A lot! I couldn't say for OpenStack in general (there are so many projects), but I have pretty great plans and ideas for our telemetry program. What I'm most thrilled about is the work under way in Ceilometer, started in Juno, towards high availability and load balancing. The team has started an amazing effort to tackle scalability problems that no other open source monitoring software has addressed in such a way.
We are now on our way to having a fully distributed, fault-tolerant, and scalable metering and monitoring system, tailored for OpenStack but also, I hope, for more general use cases.
I hope that for Kilo we'll be able to finish that and to start tackling the event/notifications management system that we'd like to build into Ceilometer.
Why is it important to have an open source tool like OpenStack for deploying virtual infrastructure?
I think it has been said over and over again, but as infrastructure is becoming a commodity, having no vendor lock-in is key for users. This is only possible with open source platforms, where the software architecture and code are known and common among operators.
Having the ability to migrate your virtual infrastructure easily from one platform to another is very important. This kind of open standard is what we need to keep the Internet an open place.
The platform being open source even allows users to contribute back and build new features they would like to see in their cloud provider. That's not something that would be possible with a closed source cloud platform.
What changes have you seen in the community over the past several months? Where is OpenStack headed?
I've been hacking on OpenStack for more than 3 years now, and I have seen the community evolve a lot. I think it has become more mature, tackling a lot of its early issues. It has become more professional in the way it handles many bad situations. We're lucky to be an open source project with a lot of qualified, paid engineers behind it. That means we have a lot of resources to grow the project in the right direction.
What strikes me the most is how our development process started as something as amateur as most open source projects and has ended up stronger than most projects you'll ever see, open source or not. Each time I describe the OpenStack development workflow (design summit, blueprints, unit and functional testing, continuous integration, IRC meetings, release management, and more) people are amazed. If you add that all of this happens with engineers from different companies all around the world, it's even harder to believe that it is actually working and producing so much value.
Seeing that trajectory, I have great expectations about the future of OpenStack. Every one of us who has been in this industry for a long time realizes that we are constantly shifting paradigms. I think OpenStack is key in the current change of abstraction layer that we are experiencing. GNU/Linux has been key to the deployment of infrastructure for the last decades. OpenStack will be fundamental for programming and instrumenting what we already have, as it provides more and more services with all the properties you need (resiliency, scalability, and more) to deploy small to large infrastructure.
Your session at the Summit covers how changes in OpenStack and Ceilometer caused you to rethink the storage of metrics. Can you elaborate on this? How did it influence the creation of Gnocchi?
When we started to work on Ceilometer, a little more than 2 years ago, we had a good understanding of the use case we wanted to tackle (i.e. billing resources, usage), and a good overview of what we needed. What was fuzzier was how users would actually want to consume our data and what kind of new possibilities it would open.
That led some of the main Ceilometer data structures to be poorly designed with respect to what we needed to provide to our new users (alarming, capacity planning, and more). Unfortunately, this damaged Ceilometer's scalability, and we lacked the resources to tackle this issue for a long time. We had many blueprints to work on and few people to address them. After being Ceilometer's Project Technical Leader for a year (2 release cycles), I decided to step down and focus on fixing the issues myself.
One of the key points here was the team realizing that some of the early design was a mistake and that we needed to rethink our approach to the problem from scratch. I started a new project dedicated to replacing the metric storage part of Ceilometer, because it was the one causing scalability issues. I built a prototype over a month, code-named it Gnocchi, and presented it during the OpenStack design summit in Atlanta in May 2014. The Ceilometer core team agreed that Gnocchi was the right answer to the problem, and Eoghan Glynn (our new Project Technical Leader) was confident enough to put the project on the Ceilometer roadmap.
How do projects like Ceilometer and Gnocchi fit into the larger OpenStack ecosystem?
With Ceilometer metering everything happening in a cloud deployment, it's able to provide a large amount of insight into cloud operations. The first way to notice it is to go to the OpenStack dashboard (Horizon) and look at the nice graphs that are presented. Most data here come right from Ceilometer.
It's obviously possible to leverage all these data sets to do more advanced analysis (trending, capacity planning, and more), but unfortunately we still lack tools in that area. However, the data are already consumed by a lot of different projects and systems to charge their users for their resource usage.
Add to that the development last year of the alarming subsystem, which gave Ceilometer the ability to trigger actions when these meters cross a threshold. This is the base system that Heat leverages to provide its auto-scaling functionality.
Gnocchi is not yet completely integrated into the OpenStack ecosystem, as it's brand new and we're still working on that. I do expect it, though, to become a key component of Ceilometer itself and to provide a new set of possibilities to leverage. Users needing to store a large amount of metrics and index a large number of resources will be pleased to find a scalable and useful solution offered as a service to their applications.
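As a rough illustration of the threshold-crossing idea mentioned here, the following Python sketch evaluates a window of samples against a threshold and fires an action once when the state changes to "alarm". This is not Ceilometer's implementation or API: the names ThresholdAlarm, evaluate, and action, and the choice of averaging the window, are all assumptions made for the example.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Sequence


@dataclass
class ThresholdAlarm:
    """Hypothetical threshold alarm; not Ceilometer's actual data model."""

    threshold: float
    action: Callable[[float], None]  # called once when the alarm starts firing
    state: str = "ok"

    def evaluate(self, samples: Sequence[float]) -> str:
        """Compare the window average to the threshold and update the state."""
        if not samples:
            # No data in this window: keep the previous state.
            return self.state
        average = mean(samples)
        new_state = "alarm" if average > self.threshold else "ok"
        # Trigger the action only on the transition into "alarm",
        # not on every evaluation while the alarm stays active.
        if new_state == "alarm" and self.state != "alarm":
            self.action(average)
        self.state = new_state
        return new_state


if __name__ == "__main__":
    # Example: a scale-up hook (here just a print) for CPU utilisation above 80%.
    alarm = ThresholdAlarm(threshold=80.0, action=lambda avg: print("scale up", avg))
    alarm.evaluate([50.0, 60.0])
    alarm.evaluate([90.0, 95.0])
```

An orchestration layer could plug a scale-up call in as the action, which is roughly the role the interview attributes to Heat's auto-scaling.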
How does working on an open source project like OpenStack improve the relationship between companies who might otherwise be seen as competitors?
No open source project can be successful at this scale without the cooperation of a lot of software developers. And all of these developers usually come from different companies and have to get along. That's where the first interactions between two competing companies start.
It's not always that easy and obvious. Companies have different goals and different needs, so getting everybody to get along is not an easy task, but as time passes relationships are built, and this links people together around the open source project, whichever company they work for.
While they will stay competitors and may pursue different goals on a larger scale, having that common goal of driving an open source project to success and playing along can build and greatly improve their relationship.