Most operations shops are well down the road to highly automated configuration and provisioning systems. Sometimes this shift is part of a DevOps transformation, and other times it's simply the best way to manage change in the environment.
These systems are very good at creating system artifacts that fit our needs, but issues still arise the first time a development team deploys an application to the node. Problems aren't caught until a human gets involved, and troubleshooting is a long, manual process involving checklists and ad-hoc fixes. How do we smooth that rocky road where operations and development artifacts meet?
Enter CI/CD
Continuous integration and delivery (CI/CD) has been a watchword in IT shops for years, but primarily as a development process. CI automates testing of changes made to the code, preventing changes introduced to the codebase from breaking the application. CD is about making sure that any final artifacts are suitable for use in a production environment. Any application build that makes it through integration tests can be identified for easy deployment.
From an operations point of view, our "application" is a server or container that is fit for its purpose. Our "code" is the individual automation snippets that perform actions. Our "build" is the blueprint for stringing those snippets together to get a working system. All changes to an automation script or file should be tested to ensure they don't cause problems.
Think of the tests of your automation like bench or smoke tests. Most automation doesn't consist of one big blob of code for each system. Rather, we've adopted other patterns, like "don't repeat yourself" (DRY) to build reusable chunks of automation that we can recombine to get the configuration we want. These sorts of tests let us find integration issues between roles, uncover problems before they show up in production, and generally prove that the system is fit for its purpose. A CI/CD pipeline is a tool designed to run and manage these sorts of tests and sign-offs.
Foundations for solid pipelines
We need to agree on a few principles to take advantage of a pipeline in operations:
- All infrastructure code is in version control
- Version control is important in a non-pipeline environment but critical to the operations of a pipeline. Ops needs to be aware of what changes "broke the build" and provide clear guidance on what is deployable. This means you can be sure that a container image built and stored in a registry by the pipeline, or a virtual machine provisioned and configured by the automation, will be reproducible and functional.
- All infrastructure code changes get tested individually
- We make small changes to our codebase, and those changes are vetted for basic correctness. That includes syntax checking, functionality, dependencies, etc. (a rough sketch of such checks follows this list). This level of testing is like unit testing for an application.
- All infrastructure code gets tested as a combined system
- Infrastructure components are made up of discrete, smaller chunks and need to be tested as a whole. These tests are for characteristics and behaviors of what we decide is a "working system." Our automation may be correct and working, but still be incomplete or have conflicting steps in different roles (e.g., we started MySQL but didn't open the firewall, or we locked down the port in a security role).
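As a rough illustration of the kind of per-change checks that first level might run, here is a sketch; the role path and tool choices are assumptions for illustration, and later in the article we'll let Molecule drive these checks for us:

```
# Hypothetical unit-level checks against a single role change
yamllint roles/webserver                               # YAML syntax and style
ansible-lint roles/webserver                           # common Ansible mistakes
ansible-playbook -i inventory site.yml --syntax-check  # does the playbook even parse?
```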
Concrete Python
This is all abstract, so I'm going to walk through a simple example. The roles and the tests are not production quality, but hopefully, they're functional enough for you to use them as a starting point in your investigations. I'm also going to work with the tools that I'm most familiar with. Your environment will vary, but the concepts should translate between any of the tools in your toolbox. If you'd like to see the example code, you can check out the GitHub repository.
Here's what's in my toolbox:
- Ansible: A popular automation engine written in Python that I've been using for several years, which I'll use to build a single role for testing
- Molecule: A newer, role-based testing harness for Ansible that brings some test-driven design concepts to role development
- Testinfra: A Pytest-based framework for inspecting system states, which I'll use to test the behavior of the role
- Jenkins Blue Ocean: A pipeline plug-in for Jenkins that provides a new UI for pipelines and supports Jenkinsfile pipeline definitions
Here are some other details about the setup on a Fedora 28 system:
- Since Ansible, Molecule, and Testinfra are all distributed via PyPI, I've installed them all globally with pip.
- There's a container for Jenkins with the new UI plugin, so I run that on the same Fedora 28 host.
- Molecule supports testing inside a container, and Jenkins can use that container as a builder in a pipeline. To get the Jenkins docker plugin in the Jenkins container talking to Docker on the host, I ran that container as `privileged`, mounted the docker socket file, and changed the SELinux context on the host. You will need to determine what's best for your environment, as this won't be the best option for anything beyond this proof of concept; a rough sketch of what I ran follows this list.
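For illustration only, the setup looked something like the following; the image, port mappings, and SELinux adjustment are assumptions for this throwaway lab, not a recommended configuration:

```
# Throwaway lab only: privileged Jenkins container with the host Docker socket mounted
docker run -d --name jenkins --privileged \
  -p 8080:8080 -p 50000:50000 \
  -v jenkins_home:/var/jenkins_home \
  -v /var/run/docker.sock:/var/run/docker.sock \
  jenkinsci/blueocean

# One way to relabel the socket on a Fedora host so the container can reach it
chcon -t container_file_t /var/run/docker.sock
```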
Later I'll show you the CentOS 7 base image I built for Molecule that includes all the same dependencies as the Fedora 28 host where we developed the role.
Create the role directory
Let's build a role to install an Apache web server. In the top-level project folder, we'll have our inventory, a site playbook, and a `roles` directory. In the `roles` directory, we'll use Molecule to initialize the role directory structure.

```
molecule init role -r webserver
--> Initializing new role webserver...
Initialized role in /root/iac-ci/blog/ansible/roles/webserver successfully.
```
In the newly created `webserver` directory, you'll see something that looks like the result of an `ansible-galaxy init` command, with the addition of a `molecule` directory. I haven't changed any of the defaults on the command line, which means Molecule will use Docker as a target to run playbooks and Testinfra as the verifier to run tests. You can look at `molecule/default/molecule.yml` for those details or to change options.
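For reference, the generated `molecule/default/molecule.yml` looks roughly like this with the Molecule 2.x release I used; the exact contents will vary with your version:

```
---
dependency:
  name: galaxy
driver:
  name: docker
lint:
  name: yamllint
platforms:
  - name: instance
    image: centos:7
provisioner:
  name: ansible
  lint:
    name: ansible-lint
verifier:
  name: testinfra
  lint:
    name: flake8
```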
Write our role
Normally we'd fire up our editor on `tasks/main.yml` and start writing Ansible tasks. But since we're thinking ahead about tests, let's start there instead (otherwise known as test-driven design). Since we need a running webserver, we've got two requirements:
- is the service running?
- is there a page to serve?
So we can open the default Python script that Molecule created for Testinfra, `molecule/default/tests/test_default.py`, and add the following after the default test.
```
def test_httpd_running(host):
    httpd = host.service("httpd")
    assert httpd.is_running


def test_index_contents(host):
    index = host.file("/var/www/html/index.html")
    assert index.exists
```
We're using two built-in modules, Service and File, to check the state of the system after the Ansible role executes. We'll use these same tests for our smoke testing, but in a live environment, you're going to want more sophisticated checks against expected behaviors.
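To illustrate what a slightly deeper check could look like, here's a sketch; the listening socket and the marker string asserted here are assumptions about this role and its template, not tests from the example repository:

```
def test_httpd_listening(host):
    # The web server should actually be answering on port 80
    assert host.socket("tcp://0.0.0.0:80").is_listening


def test_index_has_expected_content(host):
    # Assumes the template drops a recognizable marker into the page
    index = host.file("/var/www/html/index.html")
    assert "Managed by Ansible" in index.content_string
```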
Now we can add our desired tasks and templates to the role to satisfy the requirements. We'll install the package and create the templated index.html. Add the following to `tasks/main.yml`; you can see the rest in the repository.
```
- name: Install Apache
  package:
    name: "{{ item }}"
    state: present
  with_items:
    - httpd

- name: Create index
  template:
    src: templates/index.html.j2
    dest: /var/www/html/index.html
  notify:
    - restart httpd
```
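The notify above refers to a handler defined in `handlers/main.yml`; a minimal sketch of it might look like the following, though the real version lives in the repository:

```
- name: restart httpd
  service:
    name: httpd
    state: restarted
```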
Running a test
The final step before running any tests with Molecule or Testinfra is to create the base image we need. Not only do we need the dependencies for the framework, we also want to use a container that has an `init` system. This lets us test for the eventual target of a virtual machine without needing a second available VM.
```
FROM centos/systemd

RUN yum -y install epel-release && \
    yum -y install gcc python-pip python-devel openssl-devel docker openssh-clients && \
    pip install docker molecule testinfra ansible && \
    yum clean all
```
Give the image a name you will remember, since we'll use it in our Jenkins pipeline.

```
docker build . -t molecule-base
```
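If you also want Molecule itself to launch its test instance from this image rather than from the generated default, the platforms section of `molecule.yml` could be pointed at it. The following is an assumption about how you might wire that up, not a step taken above; the systemd-related settings mirror the usual advice for running `centos/systemd`-based containers:

```
platforms:
  - name: instance
    image: molecule-base
    privileged: true
    command: /usr/sbin/init
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:ro
```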
You can run the tests on the host now, before you build the pipeline. From the `roles/webserver` directory, run `molecule test`, which will execute its default matrix, including the Testinfra tests. You can control the matrix, but when we build our pipeline, we'll opt to run the steps individually instead of using the `test` command.
With our role written and our tests in place, we can create the pipeline to control our build.
Building the pipeline
The Jenkins installation guide shows you how to get the container and how to unlock Jenkins after it starts. Alternatively, you can use this pipeline tutorial, which will also walk you through connecting your Jenkins instance to GitHub. The pipeline will check your code out every time it runs, so it's critical to have your Ansible, Molecule, and Testinfra files under source control.
Head to the web UI for Jenkins Blue Ocean at `localhost:8080/blue` and click on New Pipeline. If you're using a fork of my GitHub repository, Jenkins will detect the existing Jenkinsfile and start running the pipeline right away. You may want to choose a new repository without a Jenkinsfile.
On the new pipeline, you should see a Pipeline Settings column on the right side. Select Docker in the drop-down box and add the name of your base image to the box labeled Image. This will be the base image used for all Docker containers created by this pipeline.
Under Environment, click the blue + symbol and add `ROLEDIR` under Name and `ansible/roles/webserver` under Value. We'll use this several times in the pipeline. Setting an environment variable at the top level means it can be accessed in any stage.
Click on the + in the center of the page to create a new stage. Stages are chunks of work done by the pipeline job, and each stage can be made up of multiple, sequential steps. For this pipeline, we're going to create a stage for each Molecule command we want to run, for the Ansible playbook run against the VM, and for the Testinfra tests run against the VM.
The Molecule stages will all be shell commands, so click Add Step and select Shell Script. In the box, add the following lines:
```
cd $ROLEDIR
molecule syntax
```
This will make sure we're in the role directory in the local Jenkins working directory before calling Molecule. You can look at the test matrix to see what specific checks you want to run. You won't need to create or destroy any instances, as Jenkins will manage those containers.
Once you've added a few stages, you can hit Save. This will automatically commit the Jenkinsfile to the repository and start the pipeline job. You have the choice of committing to master or a new branch, which means you can test new code without breaking production.
Alternatively, since the Jenkinsfile is committed in the same repository as the rest of our code, you can edit the file directly to duplicate the Molecule stages and use Git to commit the changes from the command line. You can then have Jenkins scan the repository and pick up the new stages.
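To give a sense of what ends up in that file, here is a rough declarative-pipeline sketch with two Molecule stages. The image name, stage names, and agent arguments are assumptions; the Jenkinsfile that Blue Ocean generates for you will differ:

```
pipeline {
    agent {
        docker {
            image 'molecule-base'
            // assumed args so Molecule can reach the host Docker daemon
            args '--privileged -v /var/run/docker.sock:/var/run/docker.sock'
        }
    }
    environment {
        ROLEDIR = 'ansible/roles/webserver'
    }
    stages {
        stage('Molecule syntax') {
            steps {
                sh 'cd $ROLEDIR && molecule syntax'
            }
        }
        stage('Molecule lint') {
            steps {
                sh 'cd $ROLEDIR && molecule lint'
            }
        }
    }
}
```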
For the Ansible stage, we need to make sure we have an entry for the test host in the `inventory` file and a site playbook that includes the role we want to run.
```
# inventory
iac-tgt.example.com ansible_user=root
```

```
# site.yml
---
- hosts: all
  roles:
    - role: webserver
```
The step type for this stage is "Invoke an Ansible playbook." Fill in all the appropriate values. For anything needing a path, like Playbook, use a relative path from the base of the repository, such as `ansible/site.yml`. You can import SSH keys or use an Ansible vault file for credentials.
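In Jenkinsfile form, that stage ends up as something like the following sketch; the credentials ID is an assumption, and the exact parameters depend on the version of the Jenkins Ansible plugin you have installed:

```
stage('Ansible deploy') {
    steps {
        ansiblePlaybook(
            playbook: 'ansible/site.yml',
            inventory: 'ansible/inventory',
            credentialsId: 'deploy-ssh-key'   // assumed SSH credential ID
        )
    }
}
```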
Our last stage is the Testinfra stage, which is also a shell script. To run Testinfra from the command line without invoking Molecule, we'll need to make sure to pass some variables. Testinfra can use Ansible as a connection backend, so we can use the same inventory and credentials from before.
In the Shell Script box, add the following:
```
testinfra --ssh-identity-file=${KEYFILE} --connection=ansible \
  --ansible-inventory=${MOLECULE_INVENTORY_FILE} \
  ${ROLEDIR}/molecule/default/tests/test_default.py
```
In the Settings for the stage, create the following environment variable:
```
MOLECULE_INVENTORY_FILE = ansible/inventory
```
The `KEYFILE` variable is created by a variable binding of a credential. This needs to be done in the Jenkinsfile, as configuring that step isn't yet supported in the interface. It makes the same SSH key configured for the Ansible stage available as a file for the duration of the stage.
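A sketch of what that binding can look like in the Jenkinsfile, assuming the same hypothetical credential ID used in the Ansible stage above:

```
stage('Testinfra') {
    environment {
        MOLECULE_INVENTORY_FILE = 'ansible/inventory'
    }
    steps {
        withCredentials([sshUserPrivateKey(credentialsId: 'deploy-ssh-key',
                                           keyFileVariable: 'KEYFILE')]) {
            sh '''
              testinfra --ssh-identity-file=${KEYFILE} --connection=ansible \
                --ansible-inventory=${MOLECULE_INVENTORY_FILE} \
                ${ROLEDIR}/molecule/default/tests/test_default.py
            '''
        }
    }
}
```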
Through the Jenkinsfile in the example repository and these steps, you should have a working pipeline. And hopefully you've got a grasp of not only how it works, but why it's worth the effort to test our infrastructure the same way our developer colleagues test application code changes. While the examples are simple, you can build a test suite that ensures the infrastructure code deploys a system that the application code can rely on. In the spirit of DevOps, you'll need to work with your developer team to hash out those acceptance tests.