How an infrastructure team starts using CI/CD

Using dev test methods in the ops environment may catch potential infrastructure issues before they become problems.

Image credits: Rikki Endsley, CC BY-SA 4.0

Most operations shops are well down the road to highly automated configuration and provisioning systems. Sometimes this transformation is part of a DevOps transformation, and other times it's because it's the best way to manage change in the environment.

These systems are very good at creating system artifacts that fit our needs, but issues still arise the first time a development team deploys an application to the node. Problems aren't caught until a human gets involved, and troubleshooting is a long, manual process involving checklists and ad-hoc fixes. How do we smooth that rocky road where operations and development artifacts meet?

Enter CI/CD

Continuous integration and delivery (CI/CD) has been a watchword in IT shops for years, but primarily as a development process. CI automates testing of every change made to the code, preventing changes introduced to the codebase from breaking the application. CD is about making sure that any final artifacts are suitable for use in a production environment. Any application build that makes it through integration tests can be identified for easy deployment.

From an operations point of view, our "application" is a fully configured server or container. Our "code" is the individual automation snippets that perform actions. Our "build" is the blueprint for stringing those snippets together to get a working system. All changes to an automation script or file should be tested to ensure they don't cause problems.

Think of the tests of your automation like bench or smoke tests. Most automation doesn't consist of one big blob of code for each system. Rather, we've adopted other patterns, like "don't repeat yourself" (DRY) to build reusable chunks of automation that we can recombine to get the configuration we want. These sorts of tests let us find integration issues between roles, uncover problems before they show up in production, and generally prove that the system is fit for its purpose. A CI/CD pipeline is a tool designed to run and manage these sorts of tests and sign-offs.

Foundations for solid pipelines

We need to agree on a few principles to take advantage of a pipeline in operations:

  • All infrastructure code is in version control
    • Version control is important in a non-pipeline environment but critical to the operations of a pipeline. Ops needs to be aware of what changes "broke the build" and provide clear guidance on what is deployable. This means you can be sure that a container image built and stored in a registry by the pipeline, or a virtual machine provisioned and configured by the automation, will be identical and functional.
  • All infrastructure code changes get tested individually
    • We make small changes to our codebase, and those changes are vetted for basic correctness. That includes syntax checking, functionality, dependencies, etc. This level of testing is like unit testing for an application.
  • All infrastructure code gets tested as a combined system
    • Infrastructure components are made up of discrete, smaller chunks and need to be tested as a whole. These tests are for characteristics and behaviors of what we decide is a "working system." Our automation may be correct and working, but still be incomplete or have conflicting steps in different roles (e.g., we started MySQL but didn't open the firewall, or we locked down the port in a security role).

Concrete Python

This is all abstract, so I'm going to walk through a simple example. The roles and the tests are not production quality, but hopefully, they're functional enough for you to use them as a starting point in your investigations. I'm also going to work with the tools that I'm most familiar with. Your environment will vary, but the concepts should translate between any of the tools in your toolbox. If you'd like to see the example code, you can check out the GitHub repository.

Here's what's in my toolbox:

  • Ansible: A popular automation engine written in Python that I've been using for several years, which I'll use to build a single role for testing
  • Molecule: A newer, role-based testing harness for Ansible that brings some test-driven design concepts to role development
  • Testinfra: A Pytest-based framework for inspecting system states, which I'll use to test the behavior of the role
  • Jenkins Blue Ocean: A pipeline plug-in for Jenkins that provides a modern UI for pipelines and supports Jenkinsfile definitions

Here are some other details about the setup on a Fedora 28 system:

  • Since Ansible, Molecule, and Testinfra are all distributed via PyPI, I've installed them all globally with pip.
  • There's a container for Jenkins with the new UI plugin, so I run that on the same Fedora 28 host.
  • Molecule supports testing inside a container, and Jenkins can use that container as a builder in a pipeline. To get the Jenkins docker plugin in the Jenkins container talking to Docker on the host, I ran that container as `privileged`, mounted the docker socket file, and changed the SELinux context on the host. You will need to determine what's best for your environment, as this won't be the best option for anything beyond this proof of concept.
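
For reference, a `docker run` invocation along those lines might look like the following. The image name and volume name are assumptions for this proof of concept, not recommendations:

```shell
# Run the Jenkins Blue Ocean container with access to the host's Docker daemon.
# WARNING: --privileged plus the mounted Docker socket effectively gives the
# container root on the host -- acceptable only for a throwaway proof of concept.
docker run -d --name jenkins --privileged \
  -p 8080:8080 -p 50000:50000 \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v jenkins_home:/var/jenkins_home \
  jenkinsci/blueocean
```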

Later I'll show you the CentOS 7 base image I built for Molecule that includes all the same dependencies as the Fedora 28 host where we developed the role.

Create the role directory

Let's build a role to install an Apache web server. In the top-level project folder, we'll have our inventory, a site playbook, and a roles directory. In the roles directory, we'll use Molecule to initialize the role directory structure.

molecule init role -r webserver
--> Initializing new role webserver...
Initialized role in /root/iac-ci/blog/ansible/roles/webserver successfully.

In the newly created webserver directory, you'll see something that looks like the result of an ansible-galaxy init command with the addition of a molecule directory. I haven't changed any of the defaults on the command line, which means Molecule will use Docker as a target to run playbooks and Testinfra as the verifier to run tests. You can look at molecule/default/molecule.yml for those details or to change options.
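
For orientation, the generated molecule/default/molecule.yml looks roughly like the sketch below. The exact keys vary between Molecule releases, so treat this as an approximation rather than the literal generated file:

```yaml
---
dependency:
  name: galaxy
driver:
  name: docker          # run playbooks against a Docker container
platforms:
  - name: instance
    image: centos:7     # swapped later for the custom base image built below
provisioner:
  name: ansible
verifier:
  name: testinfra       # run the Python tests in molecule/default/tests/
```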

Write our role

Normally we'd fire up our editor on tasks/main.yml and start writing Ansible tasks. But since we're thinking ahead about tests, let's start there instead (otherwise known as test-driven design). Since we need a running webserver, we have two requirements:

  • is the service running?
  • is there a page to serve?

So we can open the default Python script that Molecule created for Testinfra, molecule/default/tests/test_default.py, and add the following after the existing test.

def test_httpd_running(host):
    httpd = host.service("httpd")
    assert httpd.is_running

def test_index_contents(host):
    index = host.file("/var/www/html/index.html")
    assert index.exists

We're using two built-in modules, Service and File, to check the state of the system after the Ansible role executes. We'll use these same tests for our smoke testing, but in a live environment, you're going to want more sophisticated checks against expected behaviors. 
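
As one example of a deeper check, Testinfra's File module can also assert on the rendered contents, not just existence. The "Welcome" string here is an assumption about what the template renders; adjust it to match your index.html.j2:

```python
def test_index_welcome_text(host):
    index = host.file("/var/www/html/index.html")
    assert index.exists
    # File.contains() greps the file, so we can verify the template rendered
    # the expected text rather than just checking that the file was created.
    assert index.contains("Welcome")
```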

Now we can add our desired tasks and templates to the role to satisfy the requirements. We'll install the package and create the templated index.html in tasks/main.yml; you can see the rest in the repository.

- name: Install Apache
  package:
    name: "{{ item }}"
    state: present
  with_items:
    - httpd

- name: Create index
  template:
    src: templates/index.html.j2
    dest: /var/www/html/index.html
  notify:
    - restart httpd
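
The notify above refers to a handler, so the role also needs a matching entry in handlers/main.yml. A minimal version might look like this:

```yaml
# handlers/main.yml
- name: restart httpd
  service:
    name: httpd
    state: restarted
```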

Running a test

The final step before running any tests with Molecule or Testinfra is to create the base image we need. Not only do we need the dependencies for the framework, we also want to use a container that has an init system. This lets us test for the eventual target of a virtual machine without needing a second available VM.

FROM centos/systemd

RUN yum -y install epel-release && \
    yum -y install gcc python-pip python-devel openssl-devel docker openssh-clients && \
    pip install docker molecule testinfra ansible && \
    yum clean all

Give the container a name you will remember since we'll use it in our Jenkins pipeline.

docker build . -t molecule-base

You can run the tests on the host now before you build the pipeline. From the roles/webserver directory, run molecule test, which will execute its default matrix including the Testinfra tests. You can control the matrix, but when we build our pipeline, we'll opt to run the steps individually instead of using the test command.
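
If you want to mirror what the pipeline will do, Molecule's standard subcommands let you run the matrix one step at a time instead of using molecule test:

```shell
cd roles/webserver
molecule syntax     # check playbook and role syntax
molecule create     # spin up the Docker instance
molecule converge   # run the role against the instance
molecule verify     # run the Testinfra tests
molecule destroy    # tear the instance back down
```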

With our role written and our tests in place, we can create the pipeline to control our build.

Building the pipeline

The Jenkins installation guide shows you how to get the container and how to unlock Jenkins after running. Alternatively, you can use this pipeline tutorial, which will also walk you through connecting your Jenkins instance to GitHub. The pipeline will check your code out every time it runs, so it's critical to have Ansible, Molecule, and Testinfra under source control.

Head to the web UI for Jenkins Blue Ocean at localhost:8080/blue and click on New Pipeline. If you're using a fork of my GitHub repository, Jenkins will detect the existing Jenkinsfile and start running the pipeline right away. You may want to choose a new repository without a Jenkinsfile.

On the new pipeline, you should see a Pipeline Settings column on the right side. Select Docker in the drop-down box and add the name of your base image to the box labeled Image. This will be the base image used for all Docker containers created by this pipeline.

Under Environment, click the blue + symbol and add ROLEDIR under Name and ansible/roles/webserver under Value. We'll use this several times in the pipeline. Setting an environment variable at the top level means it can be accessed in any stage.

Click on the + in the center of the page to create a new stage. Stages are chunks of work done by the pipeline job, and each stage can be made up of multiple, sequential steps. For this pipeline, we're going to create a stage for each Molecule command we want to run, for the Ansible playbook run against the VM, and for the Testinfra tests run against the VM.

The Molecule stages will all be shell commands, so click Add Step and select Shell Script. In the box, add the following lines:

cd $ROLEDIR
molecule syntax

This will make sure we're in the role directory in the local Jenkins working directory before calling Molecule. You can look at the test matrix to see what specific checks you want to run. You won't need to create or destroy any instances, as Jenkins will manage those containers.

Once you've added a few stages, you can hit Save. This will automatically commit the Jenkinsfile to the repository and start the pipeline job. You have the choice of committing to master or a new branch, which means you can test new code without breaking production.

Alternatively, since the Jenkinsfile is committed in the same repository as the rest of our code, you can directly edit the file to duplicate the Molecule stages and use Git to commit the changes from the command line. You can then have Jenkins scan the repository and pick up the new stages.
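
As a rough illustration of what the visual editor produces, a declarative Jenkinsfile with the Docker agent, the environment variable, and a couple of Molecule stages might look like the sketch below. The stage names are my own choices, and the generated file in your repository will differ in detail:

```groovy
pipeline {
  agent {
    docker { image 'molecule-base' }   // the base image built earlier
  }
  environment {
    ROLEDIR = 'ansible/roles/webserver'
  }
  stages {
    stage('Molecule syntax') {
      steps {
        sh 'cd $ROLEDIR && molecule syntax'
      }
    }
    stage('Molecule converge') {
      steps {
        sh 'cd $ROLEDIR && molecule converge'
      }
    }
  }
}
```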

For the Ansible stage, we need to make sure we have an entry for the test host in the inventory file and a site playbook that includes the role we want to run.

# inventory
iac-tgt.example.com ansible_user=root

# site.yml
---
- hosts: all
  roles:
    - role: webserver

The step type for this stage is "Invoke an Ansible playbook." Fill in all the values that are appropriate. For anything needing a path, like Playbook, use a relative path from the base of the repository, like ansible/site.yml. You can import SSH keys or use an Ansible vault file for credentials.

Our last stage is the Testinfra stage, which is also a shell script. To run Testinfra from the command line without invoking Molecule, we'll need to make sure to pass some variables. Testinfra can use Ansible as a connection backend, so we can use the same inventory and credentials from before.

In the Shell Script box, add the following:

testinfra --ssh-identity-file=${KEYFILE} --connection=ansible --ansible-inventory=${MOLECULE_INVENTORY_FILE} ${ROLEDIR}/molecule/default/tests/test_default.py

In the Settings for the stage, create the following environment variable:

MOLECULE_INVENTORY_FILE ansible/inventory

The KEYFILE variable is created by a variable binding of a credential. This needs to be done in the Jenkinsfile, as configuring that step isn't yet supported in the interface. This will make the same SSH key configured for the Ansible stage available as a file for the duration of the stage.
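
In the Jenkinsfile, that binding uses the Credentials Binding plugin's sshUserPrivateKey step wrapped around the Testinfra shell step. The credentialsId below ('ansible-ssh-key') is a placeholder for whatever ID you gave the key when you stored it in Jenkins:

```groovy
stage('Testinfra') {
  environment {
    MOLECULE_INVENTORY_FILE = 'ansible/inventory'
  }
  steps {
    // Bind the stored SSH private key to a temporary file, available only
    // for the duration of this block, and expose its path as KEYFILE.
    withCredentials([sshUserPrivateKey(credentialsId: 'ansible-ssh-key',
                                       keyFileVariable: 'KEYFILE')]) {
      sh 'testinfra --ssh-identity-file=${KEYFILE} --connection=ansible --ansible-inventory=${MOLECULE_INVENTORY_FILE} ${ROLEDIR}/molecule/default/tests/test_default.py'
    }
  }
}
```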

Through the Jenkinsfile in the example repository and these steps, you should have a working pipeline. And hopefully you've got a grasp of not only how it works, but why it's worth the effort to test our infrastructure the same way our developer colleagues test application code changes. While the examples are simple, you can build a test suite that ensures the infrastructure code deploys a system that the application code can rely on. In the spirit of DevOps, you'll need to work with your developer team to hash out those acceptance tests.

About the author

Matt Micene - Matt Micene is an evangelist for Linux and containers at Red Hat. He has over 15 years of experience in information technology, ranging from architecture and system design to data center design. He has a deep understanding of key technologies, such as containers, cloud computing and virtualization. His current focus is evangelizing Red Hat Enterprise Linux, and how the OS relates to the new age of compute environments. He's a strong advocate for open source software and has participated in a...