How to use Ansible to set up a Git server over SSH

Git is the most widely deployed version-control system in the open source community. Part 2 of this series looks at using Ansible to set up Git on your LAN.

In part 1 of this series, I described the basics of the Ansible remote administration tool. I set up the environment, installed the Ansible package on the control machine, set up a basic inventory, and demonstrated basic playbooks. I had no need to back up these simple and easy-to-reproduce playbooks, but as these playbooks begin to serve as the blueprints and documentation of my lab environment, I'll need to consider how to back them up.

In this second part of the series, I'd planned to cover the copy, systemd, service, apt, yum, virt, and user modules, but to keep things focused and to the point, I've decided to move most of that discussion into a subsequent article and tackle another way to use Ansible: setting up a Git SSH server for version control.

The need for version control

Version control is a method of tracking the changes that are made to a file, series of files, or projects you may work on. Typically, a version control system stores basic information such as:

which user made changes,
when the change was made, and
what files were changed.

This can be vital information, especially when combined with the ability to revert changes, and will save you a lot of trouble in the long term.

So what options are available? The most popular version control systems are Mercurial, Subversion, and Git. The pros and cons of each are outside the scope of this series, but Git has become the most widely deployed source-control system in the open source community. There are a few popular Git hosting options to consider: Stash/Bitbucket (from Atlassian), GitHub, GitLab, and Perforce with Git (along with many other corporate implementations). All of them have nice web frontends and are suitable for group collaboration; however, they are also far more involved to set up.

Git does offer another advantage, though. You can run a Git server over secure shell (SSH) on your LAN, which is less complicated to set up and manage and has fairly low hardware resource requirements.

Git SSH server requirements

The hardware requirements for a Git SSH server are minimal. In fact, there are no server specifications provided; any machine with a network connection that is running a distribution of Linux capable of installing Git can provide Git SSH services.

In terms of software, the Git server requires the following setup:

Git package installed on the machine that will act as a Git SSH server
SSH installed and configured to allow connections via SSH keys
A dedicated user on the Git SSH server that provides access to the server over SSH (by convention this is usually the git user)
SSH public keys for all clients installed in the authorized_keys file for the dedicated user in the previous step

My Git SSH server is a CentOS 7.3 VM with 26GB of space in /home, which is where the Git repositories will reside.

Using Ansible to set up the Git server

Using Ansible to set up a Git SSH server may be overkill for a home or lab environment, but it does serve two important functions:

It provides an avenue to expand upon some more advanced Ansible topics.
It provides a repeatable set of steps, which (due to the nature of Ansible) are self-documenting. This means that replication of the Git SSH server should be trivial.

The first step is to copy your SSH key from the Ansible control machine to the Git SSH server. As a refresher, an SSH key can be generated with the ssh-keygen command. Once the SSH key has been created it can be pushed to the remote host via the ssh-copy-id user@host command.

The next step is to install the Git package. Example 1 demonstrates a simple playbook to accomplish this task.

Example 1: Package installation playbook

- hosts: "{{ hostname }}"
  gather_facts: False
  vars:
    - packages: ["git", "nmap"]
  tasks:
    - name: Installing {{ packages }} on {{ hostname }}
      yum:
        name: "{{ item }}"
        state: present
      with_items: "{{ packages }}"

Introduction to Ansible variables

This playbook uses both defined (such as packages) and undefined (such as hostname) variables. There are a few ways that Ansible can initialize arbitrary variables (i.e., those that are not defined when Ansible gathers facts):

With the vars: section of a playbook, as seen in Example 4 in the previous article. Variables defined at this level are in the local scope, which means they are only accessible to the immediate playbook and not subsequent playbooks.
Defining a vars file, which can happen if the Ansible project you are working on has the proper folder structure defined. (I won't go deeper into this here, but you can read the official documentation on roles for more information.)
Running a playbook with the --extra-vars= argument.

The playbook in Example 1 was designed to be run with the --extra-vars= argument. To run the playbook, the full command looks like this:

ansible-playbook install_git.yaml --extra-vars="hostname=git"

This method of passing in variables is the most flexible and, for specific types of playbooks, can be the most desirable.
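
When the list of variables grows, the same values can be kept in a YAML file and handed to --extra-vars by prefixing the file name with @. The sketch below assumes a hypothetical file named git_server_vars.yaml; hostname is the variable from Example 1, and the remaining names are the ones the later playbooks in this article expect.

```yaml
# git_server_vars.yaml (hypothetical file name)
# Values that would otherwise be typed on the command line.
hostname: git
username: git
git_base_dir: /home/git/
project: newgitproject
```

With that file in place, the command would become ansible-playbook install_git.yaml --extra-vars "@git_server_vars.yaml". Variables that a given playbook does not use are simply ignored.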

Variables can be defined with more than just strings; Ansible also supports data structures. A data structure refers to the way in which data can be stored and accessed. The most commonly used types in Ansible are lists and dictionaries.

A list is denoted by a series of values inside square brackets. In Example 1, "git" and "nmap" are defined in a list called packages. A dictionary is defined with curly braces ({}), uses key-value pair notation, and looks like this: { username: jdoe, group: users }. I won't delve into when to use each type of data structure in this series, but it is important to be aware that Ansible supports more than one type.
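
To make the two structures concrete, here is a minimal sketch of a play that defines a list and a dictionary and then reads one value from each. The variable names more_packages and account, and the package names tree and wget, are illustrative assumptions rather than part of the lab setup.

```yaml
# Illustrative play: runs on the control machine only.
- hosts: localhost
  gather_facts: false
  vars:
    # A list in flow style (square brackets), as in Example 1
    packages: ["git", "nmap"]
    # The same kind of list written in block style
    more_packages:
      - tree
      - wget
    # A dictionary in flow style (curly braces)
    account: { username: jdoe, group: users }
  tasks:
    - name: Read the first list element and one dictionary value
      debug:
        msg: "first package is {{ packages[0] }}, user is {{ account.username }}"
```

List elements are addressed by position, while dictionary values are addressed by key, which is usually what decides between the two.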

Note: Nmap was included in Example 1 simply to demonstrate the definition of a list in Ansible.

The final new concept introduced in this small playbook is the with_items: line, which works much like a for-loop: Ansible iterates over each item in a list and takes action based upon it. However, unlike most programming languages, you do not name the loop variable in the loop statement itself; by default, the looping variable is always the word item.
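
Strictly speaking, item is only the default name: Ansible 2.1 and later accept a loop_control section whose loop_var key renames the loop variable for a single task. The sketch below reworks Example 1 under that assumption; the name pkg is an arbitrary choice.

```yaml
- hosts: "{{ hostname }}"
  gather_facts: false
  vars:
    packages: ["git", "nmap"]
  tasks:
    - name: Install one package per iteration
      yum:
        name: "{{ pkg }}"
        state: present
      with_items: "{{ packages }}"
      loop_control:
        loop_var: pkg   # replaces the default name "item" for this task
```

Renaming the variable matters mostly when loops are nested, for example when a looping task includes another file that loops as well.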

As a final note, you may have noticed that it is possible to use some variables in the name of the task. This is helpful for providing descriptive output during an Ansible run. For example, below is the task output from the playbook in Example 1:

TASK [Installing [u'git', u'nmap'] on git] *************************************************************
changed: [git] => (item=[u'git', u'nmap'])

Although the list is simply translated into text in the task name, the person running the playbook can tell what packages are being installed to which host.

Creating users with Ansible

The next step is to create the git user. The following playbook will do so:

Example 2: Creating the Git user

- hosts: "{{ hostname }}"
  gather_facts: false
  tasks:
    - name: create and/or change {{ username }}'s password
      user:
        name: "{{ username }}"
        # The user module expects a crypted hash, not plain text.
        password: "{{ 'MYg1tpassw0rd' | password_hash('sha512') }}"

This playbook also relies on the --extra-vars= parameter when run. As before, run the playbook like this:

ansible-playbook user_setup_with_params.yaml --extra-vars="hostname=git username=git"

This playbook introduces the user module, which has far more options that can be passed in if desired. This module does exactly as you would expect, so there is not much more to discuss. Like the Unix equivalent commands, the user module can be used to create, manage, or remove local users on the system. Generally speaking, this module should not be used if your users are centrally managed (such as in Active Directory or LDAP).
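
As a rough illustration of those additional options, the sketch below sets a few common attributes while creating the account and then removes a second account. The comment text, the shell, and the account name olduser are assumptions made for the example.

```yaml
- hosts: "{{ hostname }}"
  gather_facts: false
  tasks:
    - name: Create {{ username }} with an explicit home, shell, and comment
      user:
        name: "{{ username }}"
        comment: Git repository owner    # free-text description (GECOS field)
        home: "/home/{{ username }}"
        shell: /bin/bash
        state: present

    - name: Remove a hypothetical account that is no longer needed
      user:
        name: olduser          # hypothetical user name
        state: absent
        remove: true           # also delete the account's home directory
```

Both tasks are idempotent: running them again against a host that already matches the description reports no change.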

Enabling an example Git repository

There are two steps remaining to completely stand up a functioning Git SSH server:

Create a sample bare repository for users to collaborate on.
Enable SSH access to the Git user so that people can commit and pull down playbooks.

To create Git repositories, the playbook in Example 3 uses both the command module (which runs a command on the remote host directly, without passing it through a shell) and the file module. Although Ansible has a native module for most tasks (including a git module), there are times when a module may not do exactly what you wish. In these cases, using the command module is acceptable.

Warning: The command module is not idempotent! This means that the command module will run regardless of whether the task was previously completed. In some cases, such as dealing with SSL certificates, there may be unintended side effects of issuing the same command multiple times.
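
One way to soften this limitation is the command module's creates argument, which skips the task when the named path already exists. The sketch below applies it to the repository initialization, using the same project and git_base_dir variables as Example 3 and assuming that the HEAD file at the top of a bare repository is a good marker that git init has already run.

```yaml
- hosts: "{{ hostname }}"
  gather_facts: false
  tasks:
    - name: Initialize {{ project }} only if it does not exist yet
      command: git init --bare {{ project }}
      args:
        chdir: "{{ git_base_dir }}"
        # Skip the task when this file is already present.
        creates: "{{ git_base_dir }}/{{ project }}/HEAD"
```

Ansible still cannot tell what the command does; it only checks for the marker file, so the marker has to be chosen with care.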

Example 3: Creating the initial Git repository

- hosts: "{{ hostname }}"
  gather_facts: False
  tasks:
    - name: git init --bare {{ project }} with the command module
      command: git init --bare {{ project }}
      args:
        chdir: "{{ git_base_dir }}"
      # become_user only takes effect when become is enabled
      become: True
      become_user: git

    - name: Set the permissions on {{ git_base_dir }}/{{ project }}
      file:
        path: "{{ git_base_dir }}/{{ project }}"
        state: directory
        mode: "0755"
        owner: git
        group: git
        recurse: True

Note the args: and the chdir: sections of the first task. These are optional directives given to the command module. As one might expect, chdir: changes the current working directory to the specified location before the command is run. This ensures that the Git repository is created in the desired location.

In Example 3, I also could have used the command module to set the permissions on the directory. (Files created through the command module belong to whichever user the task runs as; if the task runs as root, the repository is owned by root:root instead of the git user.) However, I am trying to demonstrate several different modules and to maintain as much idempotence as possible. The file module, as I am using it, should be self-explanatory. To run the playbook in Example 3, use the following command:

ansible-playbook initialize_git.yaml --extra-vars="hostname=git git_base_dir=/home/git/ project=newgitproject"

Git basics

To use the newly installed Git server, every user who needs to commit code must have their SSH public key installed in the authorized_keys file of the git user on the remote host.

Several methods can be used to manage the propagation of SSH keys.

Example 4: SSH key propagation using authorized_key module with file glob

- hosts: "{{ hostname }}"
  gather_facts: false
  tasks:
    - name: copy ssh key using FILEGLOB
      authorized_key:
        key: "{{ lookup('file', item) }}"
        user: "{{ username }}"
        state: present
        exclusive: False
      with_fileglob: ../files/*.pub

In Example 4, the with_fileglob loop (backed by the fileglob lookup) allows for propagation of all SSH keys in a directory. In this case, the playbook will push every key file whose name ends in .pub. This is useful when you have a lot of SSH keys to push and don't want to list them all.

Example 5: SSH key propagation using authorized_key module using with_items

- hosts: "{{ hostname }}"
  gather_facts: false
  vars:
    ssh_keyfile: [ "user1_ssh_key.pub", "user2_ssh_key.pub" ]
  tasks:  
    - name: copy ssh key using ITEM NAME
      authorized_key:
        key: "{{ lookup('file', '../files/'+item) }}"
        user: "{{ username }}"
        state: present
        exclusive: False
      with_items: "{{ ssh_keyfile }}"

In Example 5, each key is specified by name via the list ssh_keyfile. It uses with_items to loop over each key in the ssh_keyfile list. The advantage of this approach is that each key must be deliberately added instead of pushing every keyfile in a directory. Due to the way the key: attribute is formed inside of the authorized_key module, this method requires all keys to be in the same directory.

Example 6: SSH key propagation using authorized_key module using with_file

- hosts: "{{ hostname }}"
  gather_facts: false
  tasks:
    - name: using with_file
      authorized_key:
        key: "{{ item }}"
        user: "{{ username }}"
        state: present
        exclusive: False
      with_file: 
        - ../files/user1_ssh_key.pub
        - ../files/user2_ssh_key.pub

Example 6 introduces with_file. It is similar to Example 5, except that it does not require the keys to be in the same directory. None of these approaches is inherently better than the others from a technical standpoint. Each method has both security and maintenance implications that should be considered on a case-by-case basis. For a small installation in a home setup, all of these methods are perfectly acceptable. To run any of the above playbooks, use the following command:

ansible-playbook install_ssh_keys.yaml --extra-vars="hostname=git username=git"
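
Because every playbook in this article takes its input from --extra-vars, the individual steps can also be chained into a single run. The following sketch assumes Ansible 2.4 or later (for import_playbook), that all four playbook files sit in the same directory, and a hypothetical wrapper file named site.yaml.

```yaml
# site.yaml (hypothetical file name)
# Runs the playbooks from this article in order.
- import_playbook: install_git.yaml
- import_playbook: user_setup_with_params.yaml
- import_playbook: initialize_git.yaml
- import_playbook: install_ssh_keys.yaml
```

A single command would then rebuild the whole server: ansible-playbook site.yaml --extra-vars="hostname=git username=git git_base_dir=/home/git/ project=newgitproject". Extra variables given on the command line are visible to every imported playbook.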

Now that the SSH keys have been installed on the Git SSH server, your users are ready to check out code and start contributing to the new project. To pull down the newgitproject project from the Git SSH server, issue the following command (as a user whose SSH key has been installed using one of the methods in Examples 4, 5, or 6):

[user@host ~]$ git clone ssh://git@git/home/git/newgitproject

You will receive the following output:

Cloning into 'newgitproject'...
warning: You appear to have cloned an empty repository.

I'll create an empty file to demonstrate the process:

cd newgitproject
touch test.txt

Next, add the file to Git for tracking:

git add test.txt
git commit -m "first commit of test.txt"

You will see output that looks like this:

Committer: somecomment <user@host>
Your name and email address were configured automatically based
on your username and hostname. Please check that they are accurate.
You can suppress this message by setting them explicitly. Run the
following command and follow the instructions in your editor to edit
your configuration file:

    git config --global --edit

After doing this, you may fix the identity used for this commit with:

    git commit --amend --reset-author

 1 file changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 test.txt

Information about configuring your Git client is available in the official documentation for first-time setup. However, you can still push to the Git SSH server without using the Git configuration options. If you use git log, you should see something similar to this:

commit f9bd46c57211fa5c35831bbe0ce1f9c0a34a0eba
Author: somecomment <user@host>
Date:   Wed Jul 12 08:41:20 2017 -0400

    first commit of test.txt

At this point, you should have a functional Git SSH server. As a matter of best practice, every time a change is made to a playbook, it should be checked into the Git server. This both lets you track your changes and gives you an easy rollback point in case of problems.

Next steps

I've walked through some of the skills needed to administer systems from Ansible. In the next article, I'll start setting up monitoring with Prometheus for data collection and Grafana for data visualization. I'll use a variety of Ansible modules to accomplish these tasks, which should provide you with a solid base from which you can create your own playbooks.