A sysadmin's guide to Ansible: How to simplify tasks

In my previous article, I discussed how to use Ansible to patch systems and install applications. In this article, I'll show you how to do other things with Ansible that will make your life as a sysadmin easier. First, though, I want to share why I came to Ansible.

I started using Ansible because it made patching systems easier. I could run some ad-hoc commands here and there and some playbooks someone else wrote. I didn't get very in depth, though, because the playbook I was running used a lot of lineinfile modules, and, to be honest, my regex techniques were nonexistent. I was also limited in my capacity due to my management's direction and instructions: "You can run this playbook only and that's all you can do."

After leaving that job, I started working on a team where most of the infrastructure was in the cloud. After getting used to the team and learning how everything works, I started trying to find ways to automate more things. We were spending two to three months deploying virtual machines in large numbers—doing all the work manually, including the lifecycle of each virtual machine, from provision to decommission. Our work often got behind schedule, as we spent a lot of time doing maintenance. When folks went on vacation, others had to take over with little knowledge of the tasks they were doing.

Diving deeper into Ansible

Sharing ideas about how to resolve issues is one of the best things we can do in the IT and open source world, so I went looking for help by submitting issues in Ansible and asking questions in roles others created.

Reading the documentation is the best way to get started learning Ansible.

If you are trying to figure out what you can do with Ansible, take a moment and think about the daily activities you do, the ones that take a lot of time that would be better spent on other things. Here are some examples:

Managing accounts in systems: Creating users, adding them to the correct groups, and adding the SSH keys… these are things that used to take me days when we had a large number of systems to build. Even using a shell script, this process was very time-consuming.
Maintaining lists of required packages: This could be part of your security posture and include the packages required for your applications.
Installing applications: You can use your current documentation and convert application installs into tasks by finding the correct module for the job.
Configuring systems and applications: You might want to change /etc/ssh/sshd_config for different environments (e.g., production vs. development) by adding a line or two, or maybe you want a file to look a specific way in every system you're managing.
Provisioning a VM in the cloud: This is great when you need to launch a few virtual machines that are similar for your applications and you are tired of using the UI.

Now let's look at how to use Ansible to automate some of these repetitive tasks.

Managing users

If you need to create a large list of users and groups with the users spread among the different groups, you can use loops. Let's start by creating the groups:

- name: create user groups
  group:
    name: "{{ item }}"
  loop:
    - postgresql
    - nginx-test
    - admin
    - dbadmin
    - hadoop

You can create users with specific parameters like this:

- name: all users in the department
  user:
    name: "{{ item.name }}"
    group: "{{ item.group }}"
    groups: "{{ item.groups }}"
    uid: "{{ item.uid }}"
    state: "{{ item.state }}"
  loop:
    - { name: 'admin1', group: 'admin', groups: 'nginx-test', uid: '1234', state: 'present' }
    - { name: 'dbadmin1', group: 'dbadmin', groups: 'postgresql', uid: '4321', state: 'present' }
    - { name: 'user1', group: 'hadoop', groups: 'wheel', uid: '1067', state: 'present' }
    - { name: 'jose', group: 'admin', groups: 'wheel', uid: '9000', state: 'absent' }

Looking at the user jose, you may recognize that state: 'absent' deletes this user account, and you may be wondering why you need to include all the other parameters when you're just removing him. It's because this is a good place to keep documentation of important changes for audits or security compliance. By storing the roles in Git as your source of truth, you can go back and look at the old versions in Git if you later need to answer questions about why changes were made.
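
One way to keep that history easy to read in Git is to move the user list out of the task and into the role's variables. The following is a minimal sketch: the variable name department_users is an assumption made for illustration, and it relies on a role's vars/main.yml being loaded automatically.

```yaml
# vars/main.yml: the user list, tracked in Git
# (department_users is a hypothetical variable name)
department_users:
  - { name: 'admin1', group: 'admin', groups: 'wheel', uid: '1234', state: 'present' }
  - { name: 'jose', group: 'admin', groups: 'wheel', uid: '9000', state: 'absent' }
---
# tasks/main.yml: the task only loops over the variable
- name: all users in the department
  user:
    name: "{{ item.name }}"
    group: "{{ item.group }}"
    groups: "{{ item.groups }}"
    uid: "{{ item.uid }}"
    state: "{{ item.state }}"
  loop: "{{ department_users }}"
```

With this layout, a Git diff of vars/main.yml shows exactly which accounts were added or removed, while the task itself rarely changes.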

To deploy SSH keys for some of the users, you can use the same type of looping as in the last example.

- name: copy admin1 and dbadmin1 ssh keys
  authorized_key:
    user: "{{ item.user }}"
    key: "{{ item.key }}"
    state: "{{ item.state }}"
    comment: "{{ item.comment }}"
  loop:
    - { user: 'admin1', key: "{{ lookup('file', '/data/test_temp_key.pub') }}", state: 'present', comment: 'admin1 key' }
    - { user: 'dbadmin1', key: "{{ lookup('file', '/data/vm_temp_key.pub') }}", state: 'absent', comment: 'dbadmin1 key' }

Here, we specify the user, how to find the key by using lookup, the state, and a comment describing the purpose of the key.

Installing packages

Package installation can vary depending on the packaging system you are using. You can use Ansible facts to determine which module to use. Ansible does offer a generic module called package that uses ansible_pkg_mgr and calls the proper package manager for the system. For example, if you're using Fedora, the package module will call the DNF package manager.

The package module will work if you're doing a simple installation of packages. If you're doing more complex work, you will have to use the correct module for your system. For example, if you want to ignore GPG keys and install all the security packages on a RHEL-based system, you need to use the yum module. You will have different options depending on your packaging module, but they usually offer more parameters than Ansible's generic package module.

Here is an example using the package module:

  - name: install a package
    package:
      name: nginx
      state: present

The following uses the yum module to install NGINX, skip the GPG signature check, ignore the repository's certificates, and skip any broken packages that might show up.

  - name: install a package
    yum:
      name: nginx
      state: installed
      disable_gpg_check: yes
      validate_certs: no
      skip_broken: yes

Here is an example using the apt module. It tells Ansible to uninstall NGINX and not update the cache:

  - name: remove a package
    apt:
      name: nginx
      state: absent
      update_cache: no
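
If you need the distribution-specific modules on a mixed fleet, the facts mentioned earlier can select the right one. This is a minimal sketch, assuming fact gathering is enabled (the default) and using the ansible_os_family fact to guard each task:

```yaml
# Each task runs only on hosts whose OS family matches the condition.
- name: install nginx with yum on RHEL-family hosts
  yum:
    name: nginx
    state: present
  when: ansible_os_family == "RedHat"

- name: install nginx with apt on Debian-family hosts
  apt:
    name: nginx
    state: present
    update_cache: yes
  when: ansible_os_family == "Debian"
```

Hosts that do not match a condition report the task as skipped, so one playbook can serve both families.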

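You can use loop when installing packages, but then each package is processed individually. If you pass a list to name instead, the package manager handles all of them in a single operation:

  - name: install a list of packages
    package:
      name:
        - nginx
        - postgresql-server
        - ansible
        - httpd
      state: present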

NOTE: Make sure you know the correct name of the package you want in the package manager you're using. Some names change depending on the package manager.
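
One way to handle names that differ between package managers is a small lookup table keyed by a fact. The sketch below uses the Apache web server, which is packaged as httpd on RHEL-family systems and apache2 on Debian-family systems. The variable name web_server_package is an assumption made for illustration.

```yaml
- name: install the web server under its distribution-specific name
  package:
    name: "{{ web_server_package[ansible_os_family] }}"
    state: present
  vars:
    # Hypothetical mapping; extend it for the platforms you manage.
    web_server_package:
      RedHat: httpd
      Debian: apache2
```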

Starting services

As with packages, Ansible has several modules for managing services. Just as the package module handles general package installation, the service module handles services across init systems, including systemd and Upstart. (Check the module's documentation for a complete list.) Here is an example:

  - name: start nginx
    service:
      name: nginx
      state: started

You can use Ansible's service module if you are just starting and stopping applications and don't need anything more sophisticated. But, as with the yum module, if you need more options, you will need to use the systemd module. For example, if you modify systemd unit files, you need to do a daemon-reload afterward. The service module can't do that; you will have to use the systemd module.

  - name: reload postgresql for new configuration and reload daemon
    systemd:
      name: postgresql
      state: reloaded
      daemon_reload: yes

This is a great starting point, but it can become cumbersome because the service will reload on every run, whether or not anything changed. This is a good place to use a handler.

If you used best practices and created your role using ansible-galaxy init "role name", then you should have the full directory structure. You can include the code above inside the handlers/main.yml and call it when you make a change with the application. For example:

handlers/main.yml

  - name: reload postgresql for new configuration and reload daemon
    systemd:
      name: postgresql
      state: reloaded
      daemon_reload: yes

This is the task that calls the handler:

  - name: configure postgresql
    template:
      src: postgresql.service.j2
      dest: /usr/lib/systemd/system/postgresql.service
    notify: reload postgresql for new configuration and reload daemon

It configures PostgreSQL by changing the systemd unit file, but instead of defining the reload in the tasks (like before), it notifies the handler, which runs at the end of the play. This is a good way to configure your application and keep it idempotent, since the handler runs only when the task reports a change, not in the middle of your configuration.
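
If a later task in the same play depends on the reloaded service, waiting until the end may be too late. A minimal sketch of one option is the meta module's flush_handlers action, which runs any notified handlers at that point in the task list:

```yaml
- name: configure postgresql
  template:
    src: postgresql.service.j2
    dest: /usr/lib/systemd/system/postgresql.service
  notify: reload postgresql for new configuration and reload daemon

# Run any handlers notified so far now, instead of at the end of the play.
- name: run pending handlers
  meta: flush_handlers
```

The handler still runs only if the template task reported a change, so the play stays idempotent.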

The previous example uses the template module and a Jinja2 file. One of the most wonderful things about configuring applications with Ansible is using templates. You can configure a whole file like postgresql.service with the full configuration you require. But, instead of changing every line, you can use variables and define the options somewhere else. This will let you change any variable at any time and be more versatile. For example:

[database]
DB_TYPE  = "{{ gitea_db }}"
HOST     = "{{ ansible_fqdn }}:3306"
NAME     = gitea
USER     = gitea
PASSWD   = "{{ gitea_db_passwd }}"
SSL_MODE = disable
PATH     = "{{ gitea_db_dir }}/gitea.db"

This configures the database options in the file app.ini for Gitea. Writing a template is similar to writing Ansible tasks, even though the result is a configuration file, and it makes it easy to define variables and make changes. This can be expanded further if you are using group_vars, which allows you to define variables for all systems and for specific groups (e.g., production vs. development). This makes it easier to manage variables, and you don't have to specify the same ones in every role.
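
As a rough illustration of the group_vars approach, the sketch below defines the template's variables differently for two inventory groups. The group names, the values, and the vault_gitea_db_passwd variable are assumptions made for this example; only the variable names gitea_db, gitea_db_dir, and gitea_db_passwd come from the template above.

```yaml
# group_vars/production.yml (hypothetical group and values)
gitea_db: mysql
gitea_db_dir: /var/lib/gitea/data
# Assumes the secret is defined elsewhere, e.g., in an Ansible Vault file.
gitea_db_passwd: "{{ vault_gitea_db_passwd }}"
---
# group_vars/development.yml (hypothetical group and values)
gitea_db: sqlite3
gitea_db_dir: /tmp/gitea
gitea_db_passwd: ""
```

Hosts in the production group render app.ini with the first set of values, and hosts in the development group get the second, without any change to the template or the role.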

Provisioning a system

We've gone over several things you can do with Ansible on your system, but we haven't yet discussed how to provision a system. Here's an example of provisioning a virtual machine (VM) with the OpenStack cloud solution.

  - name: create a VM in openstack
    os_server:
      name: cloudera-namenode
      state: present
      cloud: openstack
      region_name: andromeda
      image: 923569a-c777-4g52-t3y9-cxvhl86zx345
      # flavor and flavor_ram are mutually exclusive; to pick a flavor
      # by minimum RAM instead, replace this line with flavor_ram: 20146
      flavor: big
      auto_ip: yes
      volumes: cloudera-namenode

All OpenStack modules start with os, which makes it easier to find them. The above configuration uses the os_server module, which lets you add or remove an instance. It includes the name of the VM, its state, and its cloud options, including how it authenticates to the API: the cloud parameter names an entry in clouds.yaml. More information about clouds.yaml is available in the OpenStack docs, but if you don't want to use clouds.yaml, you can pass a dictionary of credentials with the auth option. If you want to delete the VM, just change state to absent.
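
For the auth option, a minimal sketch might look like the following. The keys shown are the usual ones for password authentication, but the exact set depends on your cloud's identity setup (Keystone v3 often also needs domain settings). The URL and the openstack_* variable names are placeholders, not values from a real deployment.

```yaml
- name: create a VM in openstack with inline credentials
  os_server:
    name: cloudera-namenode
    state: present
    auth:
      # Placeholder endpoint and hypothetical variables; substitute your own.
      auth_url: https://identity.example.com:5000/v3
      username: "{{ openstack_username }}"
      password: "{{ openstack_password }}"
      project_name: "{{ openstack_project }}"
    region_name: andromeda
    image: 923569a-c777-4g52-t3y9-cxvhl86zx345
    flavor: big
```

Keeping the credentials in variables rather than in the task makes it possible to store them encrypted instead of in plain text in the role.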

Say you have a list of servers you shut down because you couldn't figure out how to get the applications working, and you want to start them again. You can use os_server_action to restart them (or rebuild them if you want to start from scratch).

Here is an example that starts the server and tells the module the name of the instance:

  - name: start a stopped server
    os_server_action:
      action: start
      cloud: openstack
      region_name: andromeda
      server: cloudera-namenode

Most OpenStack modules use similar options. Therefore, to rebuild the server, we can use the same options but change the action to rebuild and add the image we want it to use:

  - os_server_action:
      action: rebuild
      cloud: openstack
      region_name: andromeda
      server: cloudera-namenode
      image: 923569a-c777-4g52-t3y9-cxvhl86zx345

Doing other things

There are modules for a lot of system admin tasks, but what should you do if there isn't one for what you are trying to do? Use the shell and command modules, which allow you to run any command just like you do on the command line. Here's an example using the OpenStack CLI:

  - name: run an OpenStack CLI command
    command: "openstack hypervisor list"
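
Because Ansible cannot tell whether an arbitrary command changed anything, the command module reports a change on every run. For a read-only command like this one, a minimal sketch is to capture the output with register and mark the task as never changing. The variable name hypervisor_list is an assumption made for illustration.

```yaml
- name: list hypervisors without reporting a change
  command: openstack hypervisor list
  register: hypervisor_list
  # A read-only query never modifies the system.
  changed_when: false

- name: show the command output
  debug:
    var: hypervisor_list.stdout_lines
```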

There are so many ways you can do daily sysadmin tasks with Ansible. Using this automation tool can transform your hardest task into a simple solution, save you time, and make your workdays shorter and more relaxed.