Traditional continuous integration (CI) systems are designed as a pipeline of jobs. You have a peer review, then the build job, then the unit tests job, then the integration tests job, then the performance tests job, and so on.
Each job is triggered by the successful completion of the previous job, and the first job is triggered by a change to the source files under version control. Of course, it can be more complex if you are targeting multiple binary platforms or if you need to build a set of components to test your full application.
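To make the chaining concrete, here is a minimal sketch of such a pipeline. The job names and the run_pipeline helper are hypothetical and only illustrate the "run the next job only if the previous one succeeded" rule; a real CI server does much more.

```python
from typing import Callable, List, Tuple

# A job is a (name, callable) pair; the callable returns True on success.
Job = Tuple[str, Callable[[], bool]]


def run_pipeline(jobs: List[Job]) -> List[Tuple[str, str]]:
    """Run jobs in order; a job runs only if the previous one succeeded."""
    results = []
    blocked = False
    for name, job in jobs:
        if blocked:
            # An earlier job failed, so this one is never triggered.
            results.append((name, "skipped"))
            continue
        ok = job()
        results.append((name, "success" if ok else "failure"))
        blocked = not ok
    return results


# Hypothetical pipeline in which the integration tests fail.
pipeline = [
    ("build", lambda: True),
    ("unit tests", lambda: True),
    ("integration tests", lambda: False),
    ("performance tests", lambda: True),
]
for name, status in run_pipeline(pipeline):
    print(f"{name}: {status}")
```

In this sketch the performance tests are never run, because the job before them failed.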
But what happens when a job fails? According to Jez Humble and David Farley in the reference book Continuous Delivery, you first have to follow this rule: "Don't Check in on a Broken Build." In other words, don't make it worse by pushing other changes on top of the breakage. If you do, it becomes much harder to tell which change caused the failure. Humble and Farley suggest one of these two strategies to handle breakages:
- "Never go home on a broken build," meaning everyone on the team needs to stop working on their current tasks and switch to fixing the problem.
- "Always be prepared to revert to the previous revision." Reverting can also be a strategy to avoid blocking the whole team.
Of course, you can also mix the two: timebox the attempt to fix the build, and revert if the fix is not ready within the allowed time.
Another way to mitigate this issue is to use an integration branch: changes land there first and are merged into master only once the integration branch is green (all the tests pass). With this tactic you still have the same problem on the integration branch, but your master branch is always in a usable state.
This can work in a small team, where you can stop all your teammates from committing, but even in small teams this process often results in a CI that stays red for long periods of time. You need to enforce strict discipline to succeed with this kind of CI, or you can switch to the new way of doing CI.
Current-generation CI is centered around the CI server: the CI server detects changes and triggers jobs.
In next-generation CI, the system is centered around the code review system, so that it can act before a modification is merged into the version control system.
Whether or not to implement a peer code review process is up to you. I really recommend doing it to improve the quality of your code, but that's orthogonal to the CI system. The important thing to understand is that builds and tests are triggered by a new submission to the code review system, and the code is merged into the master branch only once all the tests are green. This way, your master branch is always green and your developers can work in parallel on their modifications. This new CI system lets you scale your automation effort and stay fluid, because a failing change no longer blocks the rest of the team.
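The gating idea can be sketched in a few lines. This is only an illustration under simple assumptions: the Branch class, the gate function, and the run_tests callback are hypothetical names, not part of any real code review or CI tool.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Branch:
    name: str
    commits: List[str] = field(default_factory=list)


def gate(change: str, master: Branch,
         run_tests: Callable[[List[str]], bool]) -> bool:
    """Test `change` on top of master and merge it only if the tests pass."""
    # What master would look like if the change were merged.
    candidate = master.commits + [change]
    if run_tests(candidate):
        master.commits.append(change)
        return True
    # The tests failed: master is left untouched and stays green.
    return False


# Hypothetical test runner: any commit named "broken" makes the tests fail.
def run_tests(commits: List[str]) -> bool:
    return "broken" not in commits


master = Branch("master", ["initial"])
print(gate("feature-1", master, run_tests))
print(gate("broken", master, run_tests))
print(master.commits)
```

The point of the sketch is that a failing submission never reaches master, so nobody has to stop working or revert anything.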
This new approach to CI has been implemented at scale in the OpenStack project to manage the CI of all the different sub-projects. To give you an idea of the scale, every day OpenStack handles 1,000 proposed patch sets, 7,500 posted comments and votes on Gerrit, 16,000 test environments spawned, and 250 changes merged (source).
To implement this next-generation CI system, the OpenStack project uses the following components:
- Gerrit, the code review and git repository manager.
- Zuul, the gating system for git source repositories.
- Jenkins, the continuous integration server.
- Nodepool, intelligent Jenkins slave provisioning on OpenStack clouds.
These tools allow changes to the same project to be tested in parallel by using a speculative merging strategy. If multiple reviews for the same project arrive at the same time, Zuul is able to test them in parallel by stacking them speculatively. For example, if the reviews are named A, B, and C, Zuul will test A, A+B, and A+B+C in parallel. If they all succeed, the result is the same as if A had been tested and merged, then B tested on top of the branch (with A) and merged, and then C tested on top of A+B and merged. This speeds up the process a lot when you have multiple contributors to the same project.
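Here is a minimal sketch of the stacking logic, not Zuul's actual implementation. The function names are hypothetical, and the sketch simply stops at the first failing stack; how the changes behind a failure are re-tested is left out.

```python
from typing import Dict, List, Tuple


def speculative_states(queue: List[str]) -> List[Tuple[str, ...]]:
    """For a queue [A, B, C], return the stacks (A,), (A, B), (A, B, C)."""
    return [tuple(queue[: i + 1]) for i in range(len(queue))]


def mergeable_prefix(queue: List[str],
                     results: Dict[Tuple[str, ...], bool]) -> List[str]:
    """Return the changes that can be merged, in order, given the test
    result of each speculative stack."""
    merged = []
    for state in speculative_states(queue):
        if not results[state]:
            # This stack failed, so nothing tested on top of it can be trusted.
            break
        merged.append(state[-1])
    return merged


queue = ["A", "B", "C"]
print(speculative_states(queue))

# All three stacks are tested in parallel; here they all succeed.
all_green = {state: True for state in speculative_states(queue)}
print(mergeable_prefix(queue, all_green))
```

When every stack is green, all three changes merge after a single round of testing instead of three sequential ones.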
Zuul is also able to manage cross-project dependencies, so that reviews that depend on each other can be merged across repositories. That's key in a git world where your components live in different git repositories.
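As an illustration, Zuul's documentation describes declaring such a dependency with a Depends-On footer in the commit message. The sketch below assumes that footer and shows one way dependencies could be ordered before testing; the change identifiers and the gating_order helper are hypothetical, and this is not how Zuul itself is implemented.

```python
import re
from graphlib import TopologicalSorter  # Python 3.9+
from typing import Dict, List

# Matches a "Depends-On: <identifier>" footer line in a commit message.
DEPENDS_ON = re.compile(r"^Depends-On:[ \t]*(\S+)[ \t]*$", re.MULTILINE)


def parse_dependencies(commit_message: str) -> List[str]:
    """Return the identifiers listed in Depends-On footers."""
    return DEPENDS_ON.findall(commit_message)


def gating_order(changes: Dict[str, str]) -> List[str]:
    """Order changes so that each one comes after the changes it depends on.

    `changes` maps a change identifier to its commit message.
    """
    graph = {cid: parse_dependencies(msg) for cid, msg in changes.items()}
    return list(TopologicalSorter(graph).static_order())


# Hypothetical changes living in two different repositories.
changes = {
    "nova-123": "Use the new image API\n\nDepends-On: glance-456\n",
    "glance-456": "Add the new image API\n",
}
print(gating_order(changes))
```

In this example the change in the second repository is ordered first, because the other change cannot be tested meaningfully without it.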
Try it yourself
Whether your team is big or small, you can benefit from this workflow by configuring the components described above for your project. There are Puppet modules that make it easier to configure these services.
Another way is to use our own integration of these services, called Software Factory. You will get the following features:
- A nice integration under a single web menu.
- Single sign-on across all the services, with external authentication against LDAP, GitHub, or Launchpad (cauth).
- A bug tracking system (Redmine).
- Collaboration tools:
  - Paste for sharing output or code extracts.
  - Etherpad for collaborative editing.
- Easy, managed upgrades from previous versions.
Because Software Factory is self-hosted (we use Software Factory to develop Software Factory), you can see it in action at softwarefactory-project.io.
If you want to test drive Software Factory, just follow the documentation at softwarefactory-project.io/docs/deploy.html.
Keep us posted on what you have achieved using this new way of doing CI!