Opening up your project's infrastructure
The benefits of building an open infrastructure
The OpenStack Infrastructure team manages all the services that developers in the OpenStack project interface with on a day-to-day basis, including the code review and continuous integration system, Wiki, IRC bots, and mailing lists.
We are also an open source project in our own right. All of the code and configurations used in our infrastructure are available in a series of public code repositories, and all of our documentation is publicly available. This is in contrast to many other open source projects that either rely upon proprietary resources provided by a code hosting service, such as SourceForge or GitHub, or have a company with an IT staff that manages an infrastructure, like the Ubuntu project.
Having an infrastructure that's open source and maintained by the community has afforded the OpenStack project many benefits, including:
- The ability to let companies and individuals involved with OpenStack influence the development of the infrastructure by providing direct, constructive resources and feedback to the team.
- Allowing developers to provide resources to help move a feature along, rather than standing idly by while they wait for it to be implemented.
- Encouraging better practices on our team, since we're developing not just for ourselves, but for an audience that ultimately includes downstream consumers of our infrastructure.
- The ability to accept contributions to improve our infrastructure and support more options from our downstreams.
We are a team of committed open source advocates, and believe making our infrastructure as open as possible is the right thing to do.
This doesn't just hold true for open source projects. Commercial businesses can also benefit from making their infrastructures more open to others in their organization. Imagine empowering your development department to prioritize infrastructure requests as they need them by proposing code for them rather than waiting on the operations team to prioritize and act upon a ticket. Or, imagine how much easier it would be for a development team to replicate production environments if they have real configurations to work from. Or, imagine how much smoother onboarding new operations folks will be with best practices being adhered to by the current team, along with suitable documentation.
In a journey that has taken our team several years, we've used Puppet to create an open source infrastructure that we can be proud of. This has essentially boiled down to three big steps:
- Prepare policies and segregate code
- Prepare your configuration management system
- Document and share
Prepare policies and segregate code
The OpenStack infrastructure team has a policy of only using open source products in our infrastructure. This may not be possible for every organization, but it has allowed us to freely share all of our components and have downstream projects fully replicate our infrastructure without licensing costs or any "black box" components of the infrastructure that may need to be further scrutinized.
Even where a fully open source stack is not possible, it must be made clear which segments of the infrastructure can be freely shared and replicated and which cannot. That way, you can be confident that you're not sharing proprietary configuration files, and other departments understand the costs associated with replicating the infrastructure.
Now that you've identified what is proprietary and what is not, make the appropriate decisions as to segregation of code. If it's open source, or you're confident that your license allows for sharing of configuration files and details of the proprietary deployment within your organization, make that freely available to anyone in your organization.
Finally, make this easy on yourself next time around (or for potential downstreams) and put a license on all code and configuration files that are being created by your team. Yes, even configuration files.
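One way to keep that licensing habit honest is to check for it automatically. The following is a minimal sketch, not something taken from our tooling: the marker string, the file suffixes, and the function names are all assumptions chosen for illustration, so substitute whatever notice and file types your organization actually uses.

```python
from pathlib import Path

# Hypothetical marker; replace it with the notice your organization uses.
LICENSE_MARKER = "Licensed under the Apache License"


def has_license_header(text, marker=LICENSE_MARKER, max_lines=20):
    """Return True if the marker appears within the first max_lines lines."""
    head = text.splitlines()[:max_lines]
    return any(marker in line for line in head)


def find_unlicensed(root, suffixes=(".pp", ".yaml", ".erb", ".py")):
    """List files under root with a matching suffix that lack the marker.

    The suffix list is an assumption; configuration files count too.
    """
    missing = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            if not has_license_header(path.read_text(errors="ignore")):
                missing.append(str(path))
    return missing
```

Calling find_unlicensed on a checkout of your configuration repository returns the paths that still need a notice, which makes it easy to run as part of whatever review or testing process you already have.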
Prepare your configuration management system
This is the technical meat of your transition to a more open infrastructure. When working inside a dedicated operations team, it's tempting to take shortcuts, bend best practices, and dump all configuration into a monolithic configuration repository. Even as an open source project, the OpenStack infrastructure team fell victim to some of this along the way, but we've worked our way out of it as we sought to be more appealing to downstreams.
Leverage existing open source modules
There was no need for us to write a new Apache module for Puppet, for example. We could pull in the public modules directly and propose changes upstream if we needed additional features.
The temptation is strong to simply download open source modules and modify them directly, essentially creating your own fork that is used internally. But this is a major maintenance burden, as you can no longer easily upgrade to the latest open source version, and it leads to other poor practices like defining custom variables within the module. Indeed, our team did some of this early on, which eventually led to a series of projects to untangle our configurations. To begin, we wrote a specification that outlined the steps we used to separate out and normalize our modules, including some clever git commands we used to preserve history for all the files that ended up in their separate modules.
Moving forward, write your modules as if they are to be consumed by others, and be sure to keep the pristine module separate from your local modifications. Local modifications for servers you actually deploy can exist in a specific module that's tuned to your organization. We call ours openstack_project.
Split out system and project configuration
Early on, we had a monolithic configuration. We soon learned that if a downstream wishes to adopt our infrastructure for their own project, they will need our system configurations, but will want to define their own project configurations.
We needed a plan for this, so our team first wrote a specification that outlined what we needed to split out, and then went to work making the changes so this could happen. Everything related to the services needed to run our infrastructure (our code review system, testing servers, etc.) was configured in one repository, while a separate repository held things like the list of OpenStack projects, our custom set of IRC channels, the actual jobs run on our Jenkins server, and more.
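The idea behind this split can be sketched in a few lines. This is an illustration only, not our actual configuration: the function names, dictionary keys, and project and channel names below are all hypothetical. The functions stand in for the system configuration, which any downstream can reuse unchanged, and the dictionaries stand in for the project configuration, which each deployment supplies for itself.

```python
# "System" layer: generic logic that knows how to configure services.
# Nothing in these functions is specific to one project.
def render_bot_channels(project_config):
    """Build the sorted list of channels an IRC bot should join."""
    channels = project_config.get("irc_channels", [])
    # Accept names with or without the leading "#".
    return sorted("#" + name.lstrip("#") for name in channels)


def render_project_list(project_config):
    """Build the sorted list of repositories the review system hosts."""
    return sorted(p["name"] for p in project_config.get("projects", []))


# "Project" layer: plain data, kept in its own repository.
# All names here are made up for the example.
UPSTREAM_PROJECT_CONFIG = {
    "projects": [{"name": "upstream/service-b"}, {"name": "upstream/service-a"}],
    "irc_channels": ["upstream-infra", "#upstream-dev"],
}

DOWNSTREAM_PROJECT_CONFIG = {
    "projects": [{"name": "example/widgets"}],
    "irc_channels": ["example-dev"],
}
```

A downstream that adopts the system layer only has to write its own version of the data: render_project_list(DOWNSTREAM_PROJECT_CONFIG) works without touching either function.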
Split out sensitive data
Now, as free as we are as a community, we still need to protect the integrity of the OpenStack development platform, which means we do have some secrets. Private SSL certificates and various types of authentication credentials need to be stored in a safe place where only a select group of administrators have access to them. In our team, we use Puppet's Hiera tool to store these values. We keep this data in private revision control that is only accessible by our root admins, and our documentation makes clear that we use it and what our variables are, so anyone consuming our infrastructure can replicate portions as needed with their own data.
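To make the pattern concrete, here is a minimal sketch of a layered lookup in which private values are kept apart from public defaults while the variable names themselves stay documented. It illustrates the concept only and is not Hiera's API; every key, value, and function name below is an assumption made up for the example.

```python
def lookup(key, layers):
    """Return the value for key from the first layer that defines it."""
    for layer in layers:
        if key in layer:
            return layer[key]
    raise KeyError("no value defined for %r" % key)


def missing_keys(documented_keys, layers):
    """List documented keys that no layer supplies a value for."""
    return [k for k in documented_keys
            if not any(k in layer for layer in layers)]


# Public defaults live alongside the shared configuration.
PUBLIC_DEFAULTS = {"review_site_hostname": "review.example.org"}

# Private values would be loaded from the restricted repository;
# a placeholder stands in for the real secret here.
PRIVATE_VALUES = {"review_site_ssl_key": "<loaded from the private repository>"}

# The variable names are documented publicly, even though some values are not.
DOCUMENTED_KEYS = [
    "review_site_hostname",
    "review_site_ssl_key",
    "mail_relay_password",
]
```

Because the key names are public, someone replicating the infrastructure can run missing_keys against their own data to learn which secrets they still need to provide, without ever seeing the originals.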
Provide a window into your infrastructure
Some companies give their developers limited access to production servers, but we instead use a tool called PuppetBoard to give a glimpse into what is happening with our servers. With this tool, anyone with access can browse a web UI that shows them some details about the servers, whether a specific change has landed, and whether it was successful. This gives contributors to our project a window into how things are progressing and allows them to act independently with regard to changes that are approved. Did a change land yet? Were there any errors? Check PuppetBoard. You can browse our public instance here.
Document and share
Now that you have an infrastructure you would be proud to show off to your parents, document and share it with your organization.
Make sure you document:
- Where to find the code and configuration
- How to submit changes (code review? pull request? bug report? ticket?)
- How to test changes in a replicated environment (Look here for an example)
- Any bootstrapping and glue information that may not be obvious by looking at code and configurations. This can include anything your operations team has to do manually.
Finally, open the floodgates! Share your infrastructure with your organization and see how a more empowered development team and a more informed organization work when they have a clear window into how your infrastructure operates.
Beyond the expertise of OpenStack
In addition to OpenStack, several other open source projects have made all or portions of their infrastructure open source. You can learn from their work as well by browsing their open configurations: