Write your own pre-commit hooks

In one of my previous blog posts I described that using pre-commit hooks makes life easier, it helps you writing better code. When you want to commit your changes, you immediately get a result if your code has met the various criteria the owner of the repository has set. When you google around, you will find various scripts that you can make use in your own setup. But what if you can not find that specific one? Then you will need to create your own hook.

I had to do this as well, not because I had a rare case to solve, I just wanted to learn how to write a hook myself so I can easily expand my own pre-commit library. And it seems out to be very simple. 🙂

Use case

My very basic use case: I want to validate a yaml file and make sure that it is properly formatted.

I found a very basic tool that helped me format an yaml file, named “yamlfmt“. I need to use this tool in a bash script, which you can see here:

#!/usr/bin/env bash
# Will pretty print YAML files.

if which yamlfmt &> /dev/null $? != 0 ; then
    echo "yamlfmt must be installed: pip install yamlfmt"
    exit 1
fi

EXIT_CODE=0

for file in $@; do
  yamlfmt ${file} -w
  if [[ $? -ne 0 ]]
    then  EXIT_CODE=1
          echo $file
  fi
done
exit $EXIT_CODE

First we need to make sure that we check that the tool is installed, otherwise we print an statement on how the user can install the tool. We need to have that tool installed, so we immediately exit the script with an exit code of 1. With that, the “git commit” action will also fail and thus the user needs to take action to install the tool before it can try to commit the changes.

The last part of the script is doing a for loop and will run the “yamlfmt” command on each file that is passed as an argument to the script. With each pre-commit hook, all of the files that are part of the commit are passed as an argument to the script (Not entirely, but we will discuss this a bit further on ;-)). We collect the exit code of the “yamlfmt” command and checks if that will not equals to 0. If that happens, then there will be an issue with the yaml file and we print the name of the file to stdout.

But how does the script know that it should only do yaml files? If we have a text file or png file, this will fail!

First, we need to add this script into a repository that contains other pre-commit hooks scripts. If you don’t have one, or don’t have other scripts is also fine. I have a dedicated repository for that, which you can find here https://github.com/dj-wasabi/pre-commit-hooks. This repository contains all the pre-commit hooks that I use in 1 or more (public) available repositories. We just need to have a git repository where we can store this script and most importantly, we need to store a “.pre-commit-hooks.yaml” file. This file will contain some information about the scripts which we can use in the rest of our repositories.

In order to get the above menionted script work, we store this script inside the “bin” directory in the git repository and name the file “verify-yaml.sh“. It doesn’t have to be specific in the “bin” directory, you can also just place it in the “ROOT” or some other directory, whatever you please. But in the “ROOT” of the git repository we will have to create the “.pre-commit-hooks.yaml” file and we will include the following contents:

- id: verify-yaml
  name: Pretty Print YAML files
  description: Checks YAML files and pretty prints them
  entry: bin/verify-yaml.sh
  language: script
  files: \.(ya?ml)$

There are 2 important keys that requires some additional information (I won’t have to tell you that the “entry” key is the location to the script right? Oh wait, I just did. :)) These are the “id” and the “files“.

The “id” is the value that you need to use in your repositories where you want to make use of the pre-commit hooks. Like with the https://github.com/dj-wasabi/dj-wasabi-release/blob/main/.pre-commit-config.yaml you will see the following:

repos:
- repo: https://github.com/dj-wasabi/pre-commit-hooks
  rev: master
  hooks:
  ...
  - id: verify-yaml
  ...

So every time I do a commit in my “dj-wasabi/dj-wasabi-release” repository, it will execute the pre-commit hook with id “verify-yaml“.

The “files” key in the “.pre-commit-hooks.yaml” entry is when the script should be executed. We only want the script to be executed when the file is a yaml file and not with a text or png file. The “\.(ya?ml)$” makes sure that it only affect files with the file extention “.yml” and “.yaml“.

Thats all folks!

It looks very easy right? Yes it sure is and when you start to work on your own hooks, you will create more. Because I am in a writing mood right now, lets take a look at the following script that I wrote:

#!/usr/bin/env bash
# Do not allow commits on provided branches.


EXITCODE=0
while getopts b: flag
do
    case "${flag}" in
        b) BRANCHES=${OPTARG};;
        *) echo "Unsupported option provided."
           exit 1;;
    esac
done

for BRANCH in $(echo ${BRANCHES} | sed 's/,/ /g');
    do
        if git rev-parse --abbrev-ref HEAD | grep -e ${BRANCH} > /dev/null 2>&1
            then  echo "You are not allowed to commit on ${BRANCH} branch."
                  EXITCODE=1
        fi
done
exit ${EXITCODE}

I don’t want to commit my changes on “master” or on “main“. And yes you can configure in most cases on the remote site (Read: the Git Server, like Bitbucket or Github) that it will not allow pushes to “master” or “main“, but I just want to prevent the commiting to actual take place. Everytime when I commit on a branch that the remote server is not accepting, I have to google for “git commit undo” and look for the command to undo my commit (Because for some reason I can not remember the git undo command :)).

Once I have undo my commit, I will create a proper branch and do the commit and push again. So if I can prevent committing to “master” or “main” with a pre-commit hook, my life will get easier because I don’t have to undo my commit etc! (Wait, wasn’t that the title of the blog post that I have written before this blog post? ) 🙂

If you have any question and or remarks, please let me know. If you have written a hook yourself and you want to show it, let me know as well!

May the hooks be with you!

Using pre-commit hooks makes software development life easier

Pull requests, you either love and see that they do provide benefits in the way of working, or you dislike them and see no purpose in them at all. I discussed the Pull Requests process in these “Author“, “Reviewer” and “Process” blogposts, so check these out as well. No matter which side you are on, but I think you want to create quality code and be consistent with each change. And if you are like me, then you would expect this as well from your teammates/co-workers. But how can we make that happen?

Pre-commit hooks can help with that. A pre-commit is one of the several hooks that are available in git that we can use to execute scripts, while running an git command. A full list with hooks can be found on this page https://githooks.com/

With a pre-commit hook, we can execute scripts as part of the “git commit” command. If the exit code of all of the scripts result in a zero (successful execution) then the commit will take place. If 1 or more of the scripts fail, the commit will be unsuccessful and fails to complete. Once that is happening we need to resolve the issue and try again.

When you have a git repository somewhere cloned on your host, you are already able to use pre-commit hooks or any of the other hooks that Git has. In the “.git/hook” directory you will find a bunch of “.sample” scripts. You can use these as an example and if you remove the “.sample” in the filename, then the script is active and triggered when the stage is executed.

$  ls -l .git/hooks
total 112
-rwxr-xr-x  1 wdijkerman  staff   478 Jun 14 21:01 applypatch-msg.sample
-rwxr-xr-x  1 wdijkerman  staff   896 Jun 14 21:01 commit-msg.sample
-rwxr-xr-x  1 wdijkerman  staff  3327 Jun 14 21:01 fsmonitor-watchman.sample
-rwxr-xr-x  1 wdijkerman  staff   189 Jun 14 21:01 post-update.sample
-rwxr-xr-x  1 wdijkerman  staff   424 Jun 14 21:01 pre-applypatch.sample
-rwxr-xr-x  1 wdijkerman  staff  1638 Jun 14 21:01 pre-commit.sample
-rwxr-xr-x  1 wdijkerman  staff   416 Jun 14 21:01 pre-merge-commit.sample
-rwxr-xr-x  1 wdijkerman  staff  1348 Jun 14 21:01 pre-push.sample
-rwxr-xr-x  1 wdijkerman  staff  4898 Jun 14 21:01 pre-rebase.sample
-rwxr-xr-x  1 wdijkerman  staff   544 Jun 14 21:01 pre-receive.sample
-rwxr-xr-x  1 wdijkerman  staff  1492 Jun 14 21:01 prepare-commit-msg.sample
-rwxr-xr-x  1 wdijkerman  staff  3610 Jun 14 21:01 update.sample

But we are focussing on to pre-commit hooks and there is an easier way to maintain the pre-commit hooks. Like you already have seen, we can only use 1 pre-commit file, and we want to use a lot more than just one. Sure you can call other scripts, but that makes the maintenance a bit more complicated. Especially when you have more than just a single git repository. There is a package that helps us to have a more maintainable soultion with regarding to pre-commit hooks and using them.

Installation

We have to install it first and we have to install it on our development workstation/laptop. Make sure you have Python and Pip installed before executing the following command:

pip install pre-commit

This will install the pre-commit application on our system.

“That is all nice, but I have no idea what kind of scripts we can execute as part of a pre-commit?”

– you?

No worries, lets use an example. This Github repository https://github.com/dj-wasabi/dj-wasabi-release/ contains some Python scripts that I use for maintaining my Github repositories. We will not discuss these scripts, but there is 1 important file in this repository and is named “.pre-commit-config.yaml“. You will probably find this file also on other repositories that are on my Github account. This file contains all the scripts that will be executed when we do a “git commit”. Lets take a look at the file.

repos:
- repo: https://github.com/dj-wasabi/pre-commit-hooks
  rev: master
  hooks:
  - id: shellcheck
  - id: markdown-toc
  - id: verify-yaml
  - id: no-commit-on-branch
    args: ['-b master,main']

It contains a “repos” key which is a list, this means you can have multiple entries configured in the file (As this is currently the case). Each of these entries needs the following 3 properties to work properly:

  1. repo: The git location where the scripts are stored;
  2. rev: The version of the repository, can be a tag, branch or git commit;
  3. 1 or more hooks, which has 1 or 2 keys: id (and args).

In the code block above, we see that there is a Github repository mentioned, namely https://github.com/dj-wasabi/pre-commit-hooks and on master we have several scripts available. The pre-commit hook will see that in this repository a file named “.pre-commit-hooks.yaml” exist. This file knows exactly what to do when the pre-commit hook is executed with the provided id’s. In the “.pre-commit-hooks.yaml” file, there is a configuration for the “shellcheck” id.

- id: shellcheck
  name: Shellcheck Bash Linter
  description: Performs linting on bash scripts
  entry: bin/shellcheck.sh
  language: script

The pre-commit hook now knows, that with the “shellcheck” id a script “bin/shellcheck.sh” in the https://github.com/dj-wasabi/pre-commit-hooks reposity needs to be executed. But in all fairness, you are not ready yet. 9 out of 10 times you will need to check the README on the mentioned repository to see if you need to install any other tool to make the script work. Because I am working on a Mac, I do need to make sure that shellcheck is installed:

brew install shellcheck

And I need to make sure that the dependencies that the other scripts in the “.pre-commit-hook.yaml” uses are installed on my Mac.

Once I have done that, we need to tell our “.git/hooks” directory that we want to make use of the pre-commit hooks. We will do that by executing the following command in the root of our git repository:

pre-commit install

This will give us the following output:

$ pre-commit install
pre-commit installed at .git/hooks/pre-commit

You can do now an “ls -l .git/hooks” and compare the output with before. When you execute a “git commit” from now on in this repository, it will execute the pre-commit hooks. Remember that for each git repository you have cloned on your host, it will contains their own configuration.

But why?

“But I still don’t know what kind of scripts I can execute as part of a pre-commit hook?”

– you?

The problem I see and/or have experienced a lot is that when people gets invited to be a reviewer on a PR, is that they focus on the small nitpicky things that don’t really that much. Things like, wrong identation, to many empty lines, missing space or trailing space (well, I do find this one annoying ;-)). They should be focussing on the actual change like I mentioned in one of my earlier blogposts. And 9/10 times you can prevent things like this by executing a linter or formatter. This linter will provide an overview of warnings and errors on what needs to be fixed before the linter is happy, so an ideal candidate to execute it as a pre-commit hook. So with pre-commit hooks we can sort of prevent “garbage” to be committed into a Git repository.

When you write for example Python scripts, like what I do with the https://github.com/dj-wasabi/dj-wasabi-release/ repository, you can make use of the “flake8” tool and with every “git commit”, the linter will be executed and provide errors. And that can be achieved by adding the following few lines in the “.pre-commit-hooks.yaml“:

- repo: https://gitlab.com/pycqa/flake8
  rev: 3.8.4
  hooks:
  - id: flake8
    additional_dependencies: [flake8-typing-imports==1.7.0]

If I now do a “git commit” (or when you just want to execute it without doing an actual commit: “pre-commit run -a“):

$ git commit -am "Some message that I know that it will not make it to the log"
Shellcheck Bash Linter...................................................Passed
Generate Markdown toc....................................................Passed
Pretty Print YAML files..................................................Passed
No commit on master or main..............................................Passed
flake8...................................................................Failed
- hook id: flake8
- exit code: 1

lib/djWasabi/git.py:59:1: E302 expected 2 blank lines, found 1
lib/djWasabi/git.py:79:22: E231 missing whitespace after ','
lib/djWasabi/git.py:80:52: W291 trailing whitespace
lib/djWasabi/git.py:83:53: E231 missing whitespace after ','

Fix End of Files.........................................................Passed
Trim Trailing Whitespace.................................................Failed
- hook id: trailing-whitespace
- exit code: 1
- files were modified by this hook

Fixing lib/djWasabi/git.py

Check for merge conflicts................................................Passed
$ echo $?
1

Look, the pre-commit “Failed” (Or succeeded, depends on how you look at it ;-)) and it prevented to do the actual commit in git. You can see that there where 4 lines in the “lib/djWasabi/git.py” file that needs to be fixed. But you will also see a 2nd script has failed, namely the “Trim Trailing Whitespace“.

A pre-commit script can also update files when needed (which isn’t a bad thing!) and not just fail on when certain criteria is met. The “Trim Trailing Whitespace” script has updated a single file “lib/djWasabi/git.py” and removed the trailing space. When a script updates a file, the script should exit with an exit code of 1 and provide info on what file has been updated. If I would run the git commit again, then you will see that the “Trim Trailing Whitespace” is “Passed” and that the “flake8” script only has 3 errors:

Shellcheck Bash Linter...................................................Passed
Generate Markdown toc....................................................Passed
Pretty Print YAML files..................................................Passed
No commit on master or main..............................................Passed
flake8...................................................................Failed
- hook id: flake8
- exit code: 1

lib/djWasabi/git.py:59:1: E302 expected 2 blank lines, found 1
lib/djWasabi/git.py:79:22: E231 missing whitespace after ','
lib/djWasabi/git.py:83:53: E231 missing whitespace after ','

Fix End of Files.........................................................Passed
Trim Trailing Whitespace.................................................Passed
Check for merge conflicts................................................Passed

Nice right?

Updating files as part of the pre-commit hooks are very common and I do make use of it a lot. When you work with Terraform, you will probably know the “terraform fmt” command as well. And I do too make use of the “terraform fmt” command as part of the pre-commit hook, it will make sure that all my terraform files are formatted in one way. So people that will review my Pull Requests, can not write any comment on formatting issues. 🙂

Before I end this blog post, I want to say one last things that really helps me with pre-commit hooks. I really like writing documentation in Asciidoctor format, i personally thinks it is a bit better that any other “language” and I don’t want to start a flamewar, but just writing on top of the document “:toc: left” and I have my Table of Content on the left side of the page. With Markdown, you’ll have to manually write one and keep the Table of Content up 2 date. Or, you install a tool with this command:

pip install md-toc

Then you can make use of the following line in your markdown file:

<!--TOC-->

And make sure that an id with “markdown-toc” is added to the “.pre-commit-hooks.yaml” file like shown in the beginning of this blog post and you are done! (small note is that you need to commit your changes to see that it is generated :))

I can now see the benefits of using pre-commit hooks and will use them as well!

– you?

I hope so? I hope I showed you how awesome pre-commit hooks are that they help you write better code. Not only that, it also helps the reviewers to focus on a PR that matters instead of looking for the nitpicky things. And with this blog post I only mentioned linting and formatting scripts, but you basically can do anything with a pre-commit hook. One small tip with using pre-commit hooks, don’t go wild on it. My commits will take some seconds to complete, but you can even run unit tests as well but that will result in longer duration of doing a commit. And when you do a “git commit” you basically are waiting on it to finish correctly, so don’t add lots and lots of tests because that will increase the duration of a commit.

But if you do have any questions and or remarks please let me know and I will happily help you.

In this post I will describe how to create your own pre-commit hook.

May the commits be with you!

Jenkins as code, part 2: Setting up the Jenkins job

This is the follow up of the previous blogpost about Jenkins as code, which you can find here. With the previous blogpost we discussed how we have created a Docker image, containing Jenkins and several files needed to correctly install the plugins and configure Jenkins with several yaml files. The yaml configuration files are used by the configuration-as-code plugin to configure the Jenkins environment how we want to have it configured. This means that no manual changes are needed in the UI.

Just like mentioned in the first blogpost:

Before we do anything I just want to remind you that this is just 1 way to achieve a Jenkins as code setup. It does not mean that this is the only or best way, it is just one way. Next to that, these blogposts and the code in the Github repository will help you kickstart your own setup and by no means you can just run it on a production environment and blame me if something is not working fine. During the blogposts I will tell you how I was able to do things, so you can redo it all yourself (and compare it with the code in the Github repository).

Seed All DSL Jobs

With this blogpost, we will continue the job-dsl.yaml file. This is the yaml file which will make sure that once Jenkins is running, a “Seed All DSL jobs” job” is configured and loads groovy files from a specific directory in a git repository. So lets continue with these groovy files. It basically consists of 2 parts, we set some variables and have several class and functions to generate the jobs and list views. 

The example.groovy file top part are some variables we need to configure.

@Field String jenkinsCredentialId = "SSH_GIT_KEY"
@Field String basePath = 'example'
@Field String defaultPollingScm = 'H/5 * * * *'

JobConstructor[] jobList = [
        [
                "example-repo",
                "https://bitbucket.org/wernerdijkerman/this-is-some-test.git",
                defaultPollingScm
        ]
]

The first one is easy, that is the name of our Jenkins credential that we are using for authorizing to the Git server. The basePath is the value which is used for creating a “directory” and a specific “tab” with this name. See the screenshot. Then we have the defaultPollingScm, which we will configure that each job will check once every 5 minutes for changes. Ideally there should be webhooks configured on the Git server, but this only works if Jenkins is available from the Git server (which is in my situation not the case)

The jobList is basically a list with all of the repositories that belong to this “example” group. For my Terraform repositories, I would have a file named terraform.groovy, with a basePath set to ‘terraform‘ and in the jobList variable, all repositories configured related to Terraform. And if you need to add (or remove) a (terraform) repository, you will update this jobList variable by either adding or removing the repository information. Once the change is merged into main (or master), with a max of 15 minutes the new job is added (or deleted) because the Seed DSL job has ran.

The rest of the file contains some specific functions to actual create the Jobs and listviews. You can see on this page https://jenkinsci.github.io/job-dsl-plugin/ with all the possible functions (and the parameters) you can use to make it your own.

Shared Library

A shared library allows you to use code that can be used in Jenkinsfile’s. The problem when you have lets say 50 Spring Boot java microservices, you probably have 50 Jenkinsfiles that are the same, with the exception of the name of the microservice. So that is a lot of duplicate code in the various repositories, especially if these Jenkinsfiles are really large. So the goal is to create small(er) Jenkinsfiles as we still need to have a Jenkinsfile in the repositories. Because we create smaller Jenkinsfile, we can even provide arguments that can be used for correctly generating the Jenkinsfile.

This is how our Jenkinsfile will look like, for our example service:

@Library('djwasabi') _
def run = new com.djwasabi.common.examplePipeline()
def NAME = "Wasabi"
run.pipeline('test-repo', NAME)

With the first line we specify which Shared Library we want to make use. You need to go to “Manage“, “Configure system” and then look for “Global Pipeline Libraries“. When you did a docker compose up from my repository, you would see that there is a Shared Library named “djwasabi“. This shared library is from a git repository and is using the master branch as the version (You could also specify a tag or branch or even an git commit). 

To understand the 2nd line, you will need to go to https://bitbucket.org/wernerdijkerman/jenkins-shared-library/src/master/ (or look into the already earlier provided github repository in the “library” folder), you will see a specific directory structure which is used within the groovy language. You will see an examplePipeline.groovy file, which will be loaded to create a new groovy object.

The 3rd line we set a variable named NAME and set it to “Wasabi“. With the 4th line we will execute the function inside the examplePipeline.groovy file named ‘pipeline‘. This function accepts multiple parameters and we only provide 1 called “NAME“.

Let us take a look at the examplePipeline.groovy file:

package com.djwasabi.common

import com.djwasabi.common.workers.*
import groovy.transform.Field

def pipeline(jobName, name = "world", agentNode = "worker") {
    def command = new Command(this)
    def git = new Git(this)
    def resultsGetter = new ResultsGetter(this)

    node(agentNode) {
        try {
            def lastResult = currentBuild.rawBuild.getPreviousBuild()?.result

            stage("Checkout") {
                git.checkout()
                tagged = git.isCurrentCommitAlreadyTagged()
            }
            if (!tagged || hudson.model.Result.SUCCESS != lastResult) {
                stage("Run command") {
                    command.echo("hello " + name + " via job " + jobName)
                }
            } else {
                resultsGetter.repeatPreviousBuildResult(currentBuild)
            }
        } catch (all) {
            stage('Destroy it') {
                command.echo('lets run when things go wrong.')
            }
        }
    }
}

As you see with the examplePipeline.groovy file, the function “pipeline” accepts multiple arguments. The first one “jobName” does not have a default and thus it is always required, but the other 2 has default (resp. “world” and “worker“). Have you noticed that the “agentNode” parameter contains the value “worker“, which is also the value for our configuration we used for starting the Docker container in the previous blog post? So if you write your own shared library, you can make 1 or more of these node definitions and based on parameters that are set in the Jenkinsfile, certain actions can be executed.

When we run the job in Jenkins, you will see something like the above showing up. Almost at the end of the log, you will see the following line:
hello Wasabi via job test-repo

That is basically logged via this piece of code that was generated in the examplePipeline.groovy file:

                stage("Run command") {
                    command.echo("hello " + name + " via job " + jobName)
                }

Summary

In these 2 parts I helped you to explain what the possibilities are for an Jenkins as code approach. This approach will have everything set into code and when someone wants to make a change in Jenkins, the person should be creating a change into the code. When you have properly setup your Git server, this person will create a branch, make the changes and will create a Pull Request before it is merged into master|main. With everything set into code, people have an audit log of what has been changed (and by whom), so it will force you into having a discussion when things are needed (or when something went wrong).

And what I personally also like, because everything is set into code you can deploy a Jenkins server somewhere else and you can almost immediately use it. Especially when you make the Configuration as Code yaml files configurable with environment variables, you can run multiple Jenkins setups from the same code base. And because everything is in code, you don’t have to worry about backing up Jenkins (It is already on the Git server and part of their backup mechanism).

I hope you enjoyed it and that it showed you that you can too automate the configuration of any Jenkins environment you want to run in your organisation. If you have any improvements and/or remarks, please let me know.

May the force (or source) be with you!

Jenkins as code, part 1: Setting up Jenkins in Docker

I hate doing things manually, I really do.

Log into an UI, do some clicks here and there to be able to have something created or configured. It is error prone (you can easily forget something or make a typo) and it is stupid and/or boring (Especially if you need to do this on a routine basis). If you can change something in a UI, then someone is able to change that as well and even do that without you knowing that it is changed (Or viceversa ;-)). So doing things manually is not the way forward and we should focus on automation. Automation is one of the pillars of doing DevOps, so we should always automate things right?

What people do probably not know, Jenkins is a tool that can be fully automated, you only have to know how. (And based on some posts on for example Reddit, I don’t think people knows that this is even possible).

Jenkins as code

So lets dive into, what I would say as: Jenkins as code.

This will basically be a 2 part blog post where we will discuss the following:

  1. This part where we will create a Docker image, containing Jenkins with its configuration files and plugins;
  2. The next part, where we will create a shared library and use that in a Jenkinsfile, with jobs we load via a “specifications” repository;

Before we do anything I just want to remind you that this is just 1 way to achieve a Jenkins as code setup. It does not mean that this is the only or the best way, it is just one way. Just like that there are 1000 ways to go to Rome. Next to that, these blog posts and the code in the Github repository will help you kickstart your own setup and by no means you can just run it on a production environment and blame me if something is not working fine. I am not a groovy expert and I can do some basic things, so don’t expect a new world wonder. During the blog posts I will tell you how I was able to do things, so you can redo it all yourself (and compare it with the code in the Github repository) and build it on your own terms/setup.

In both blog posts, you will see a lot of code and commands popping up. But no worries, all code is available on my Github in repository https://github.com/dj-wasabi/blog-jenkins-as-code . So lets start with the first part: Setting up Jenkins.

Docker(file)

We will create a Docker image, based on “jenkins/jenkins:lts” Docker image, install Docker and configure it with the configuration-as-code plugin included with several yaml files that are used for configuring Jenkins. And as part of the Github repository, we have a docker-compose.yaml file which we can use to boot our setup.

Lets start with the Dockerfile.

USER root
RUN groupadd docker && \
    curl -fsSL https://get.docker.com -o get-docker.sh && \
    sh get-docker.sh && \
    usermod -aG root jenkins && \
    usermod -aG docker jenkins
USER jenkins

Lets discuss this first, the “jenkins/jenkins:lts” Docker image does not contain the docker application, so we need to install that and make sure that the “jenkins” user is part of the “root” and “docker” group. We need Docker in this image, as each Jenkins job will run in its own Docker container.

ENV CASC_JENKINS_CONFIG=/var/jenkins_home/casc

We need set an environment variable named CASC_JENKINS_CONFIG, which we basically tell Jenkins where the configuration-as-code yaml files can be found.

COPY casc/ /var/jenkins_home/casc
COPY plugins.txt /usr/share/jenkins/plugins.txt
RUN /usr/local/bin/install-plugins.sh < /usr/share/jenkins/plugins.txt

Then we will copy the contents on the casc directory to the earlier mentioned directory and we place the file with all of our plugins into a specific directory. Then we run the install-plugins.sh script so we can download and install all the plugins that we need in our setup.

And that is our Dockerfile, easy right? This will allow us to build a Jenkins Docker image, with all of our files and configuration that will lead to an Jenkins environment we want to have. We can deploy this on some host running Docker or even make some additional changes to make it work in Kubernetes.

Plugins

Lets go to the plugins.txt file, as this one is a bit easier to explain than the casc files.

We need to create a plugins.txt that contains all of the plugins we want to make use, so how do we do that. I manually (oh yes, sorry! :)) started a Jenkins container and followed the installation steps and then I picked several plugins to install during the installation steps. When Jenkins is running and I have finnished nstalling all plugins (don’t forget the “configuration-as-code” plugin), I went to “manage” and then clicked on “Script Console“. There you see an textfield to execute groovy scripts and I used the following script:

def plugins = jenkins.model.Jenkins.instance.getPluginManager().getPlugins()
plugins.each {println "${it.getShortName()}:latest"}

This is a “script” that provides an overview of all plugins that are currenlty installed in Jenkins. I have used the “latest” version of the plugin which is fine for demo purposes, but you could also update the “latest” to ${it.getVersion()}. This will show the actual version of the installed plugin. I would suggest to use the versions in the plugins.txt file. This helps you in the future when someone create’s an PR that it shows you that there is an update in the version of a plugin, which you won’t see when it is using “latest”.

Then you hit the “Run” button and you will see some output appear. Select the output and place that in the plugins.txt file and you are done (I would also sort the contents of the file, so all plugins are order alphabetical).

Configuration as code

Let us first explain how we can get a yaml file. Go to “Manage” and then you will see “Configuration as Code” (And click on it please) in the “System Configuration” lane. There is a button called “Download Configuration” which will download the yaml file and with “View Configuration” you can see the yaml file in your browser. When you have downloaded the yaml configuration file, you can make use of it in your Docker image by placing it in the casc/ directory. I would suggest you split it into seperate files, so you won’t have 1 large file but smaller ones with each a specific set of configuration. For example, create a credentials.yaml file containing all Jenkins credentials. 

But before you commit all your changes, you can also update some values by using environment variables, see the following:

  securityRealm:
    local:
      allowsSignup: false
      enableCaptcha: false
      users:
      - id: "${JENKINS_ADMIN_USERNAME:-admin}"
        name: "${JENKINS_ADMIN_NAME:-Administrator}"
        password: "${JENKINS_ADMIN_PASSWORD}"

This piece of the configuration you see now is responsible for creating an admin user. I don’t want to hardcode the username and definitely not the password in this file, so I use environment variables for that. And this is also the case for the credentials that Jenkins use, see the following example of the Jenkins “credentials”:

credentials:
  system:
    domainCredentials:
      - credentials:
          - basicSSHUserPrivateKey:
              scope: GLOBAL
              id: "SSH_GIT_KEY"
              username: "git"
              description: "SSH Credentials for jenkins"
              privateKeySource:
                directEntry:
                  privateKey: ${JENKINS_SSH_GIT_KEY}

With the above credential configuration I won’t have to hardcode the SSH Private key in the Docker image, but can use it as an environment variable. Nice right? 🙂

When everything is done via code, we can also already configure the Security Matrix and allowing what user can and most importantly can’t do in Jenkins. As my Jenkins is running on premise and don’t allow traffic from outside the environment, I will allow people to start jobs (if they don’t want to wait on the triggering). So I will allow anonymous people to have read, build and cancel rights for the jobs. Why go for all that trouble for letting people authenticate against some source, so we can see that this person has started or cancelled a job? Most importantly, they can’t change anything unless they know the admin password. (But will be undone when Jenkins is restarted! :))

  authorizationStrategy:
    globalMatrix:
      permissions:
      - "Job/Build:anonymous"
      - "Job/Cancel:anonymous"
      - "Job/Read:anonymous"
      - "Overall/Administer:admin"
      - "Overall/Read:anonymous"

Now we are able to fully do Jenkins as code, as we will store the yaml files in the casc/ directory which are loaded when Jenkins is started. But when Jenkins is running, we also need to make sure that we will load the jobs from somewhere. We will do this with a “Seed Job“, which you can see in the “dsl-jobs.yaml” file in the casc/ directory in the Github repository.

            git {
                remote { 
                    url "${JENKINS_JOB_DSL_URL}"
                    credentials 'SSH_GIT_KEY' 
                }
                branch '*/main'
              }
        }
        triggers {
            scm('H/15 * * * *')
        }
        steps {
          dsl {
            external('${JENKINS_JOB_DSL_PATH:-jobs}/*.groovy')
            removeAction('DELETE')
          }
        }
      }

When Jenkins is started, we will automatically create the “Seed all DSL jobs” Jenkins job. And what it does is basically the following (Snippet is incomplete, see for the compete file on Github for full version):

  1. We use the credential ‘SSH_GIT_KEY‘ to checkout the repository mentioned in ${JENKINS_JOB_DSL_URL} (See the docker-compose.yaml file)
  2. We use the ‘main’ branch;
  3. The job is executed every 15 minutes;
  4. In the directory named ${JENKINS_JOB_DSL_PATH} we will find groovy files and if Jenkins has jobs which aren’t configured in these groovy files, we delete the jobs from Jenkins.

Before we finalise the Configuration as code part, we need to discuss one last file (and action). When we have the Jenkins server running, we will run each job in its own Docker container. So the Jenkins server will start a Docker container and do all of its action inside that container and the configuration is what follows:

jenkins:
  clouds:
    - docker:
        name: "docker"
        dockerApi:
          dockerHost:
            uri: "${DOCKER_HOST:-unix:///var/run/docker.sock}"
        templates:
        - connector:
            attach:
              user: "jenkins"
          dockerTemplateBase:
            bindAllPorts: true
            image: "jenkins/agent:latest"
            privileged: true
            environment:
              - "TZ=Europe/Amsterdam"
          instanceCapStr: "99"
          labelString: "worker"
          name: "worker"
          remoteFs: "/home/jenkins/agent"

This is also seen in the file “docker.yaml“. Here we have placed 1 template which we named “worker“, with the “jenkins/agent:latest” Docker image. As you know, this is just an example so you can modify this to your needs and use a Docker image that suits your needs. This Docker image should contain all the tools needed to run your jobs, so the “jenkins/agent:latest” might not be fit for your setup. And do know, as the “templates” key is a list, you can add a lot more templates with a unique name, settings and Docker image. For the dockerHost.uri, you will see the usage of a environment variable “DOCKER_HOST“. This is an variable we use in docker-compose.yaml file and if we don’t provide one, the default unix:///var/run/docker.sock is used.

You can go to “manage“, “Systems configuration” and scroll all the way down until you will see “Cloud“. It provides a link and when clicking on it, you’ll get the page where you can configure the “Cloud” configuration. When you make changes, don’t forget to export the yaml file on the “Configuration as Code” page mentioned earlier.

Build and ship it

So far we have discussed some basics on how we get our configuration, so lets build a Docker image. During the rest of this blog post, I will assume you will have the same layout as my Github repository. So lets go to the directory where we have the “Dockerfile“, “plugins.txt“, “docker-compose.yaml” file and the “casc/” directory. Here we will run the docker build command, to build the new Docker image.

cd server
docker build -t jenkins-as-code . --pull

I named it ‘jenkins-as-code‘ which works locally fine and if you want to push it into a Docker registry, you should prefix it with the correct registry name. If you prefix it with a registry or you named it differently, don’t forget to update the docker-compose.yml file with your new name. The –pull is there so if you already have a “jenkins/jenkins:lts” Docker image, you will get the latest one.

I think it is build now, otherwise we will wait a minute before we continue.

sleep 60 🙂

Ok, the Docker image is build and we can start it. If you see the docker-compose.yaml file, you will notice 2 ‘services’.

  1. socat;
  2. jenkins (The one will just build and want to start).

But lets describe the ‘socat‘ service. The ‘socat‘ service is used to make sure that our docker.sock file from our host can be used with Jenkins for starting the agents. If we do this from the Jenkins container itself and not using this ‘socat‘ service, we will get permission denied errors and Jenkins can not start any new Docker container (I am doing on a Mac, I don’t think people running it on Linux hosts will have issues ). 

The Jenkins service has several environment variables set. Before we start everything, we will need to create an environment variable first that contains the content of a private SSH key. I have used the following command for that:

export EXPORTED_PASSWORD=$(cat ~/.ssh/wd_id_rsa)

So this EXPORTED_PASSWORD contains the private SSH key and this one will be used in Jenkins as the SSH_GIT_KEY credential on multiple places. Also worth to mention is the JENKINS_ADMIN_PASSWORD environment variable, this is what is says: The password for the Admin user, so if you want to use something else here is the moment to change it.

We will start it with the following command:

docker compose up -d

I prefer starting it in the background, so that is why I added the -d argument. Once it is booted we open our favourite browser and go to http://localhost:8080 you will see something like the following:

So that is it for now. We started our newly build Docker image containing Jenkins, with the plugins we need in our environment and our own configuration!

We will go into the “Seed All DSL jobs” job with the next part of the blogpost. So stay tuned! 🙂

2nd blog post you can find here.

Continuous deployment of Ansible Roles

Ansible Logo

There are a lot of articles about Ansible with continuous deployment, but these are only about using Ansible as a tool to do continuous deployment. There is not much (Well, I can’t really find none) about continuous deployment on code changes in Ansible Roles/playbooks itself. But first: Why do you want to do that?

Well, it is very easy to make changes in a role or a playbook and deploy that to a (production) machine(s). I do hope that these changes are commited into the git repository (And pushed) so that changes on the host can be tracked back to the code. Hopefully you didn’t make any errors in the playbook or role so all will be fine during deployment and no unwanted downtime is caused, because nothing is tested.

When you are part of a team, this would be a downside of using Ansible. It is very easy to make changes to a playbook or a role locally and not commit it to the repository, deploy it to a production server and continue like it didn’t happen. You can make agreements on these kinds of procedures on when and how to execute playbooks, but you always have that coworker that don’t (or partly) want to follow procedures or just because of lack of time (“It has to be working this morning!” or just any other lame excuse to not test your code before deployment).

When the team/serverpark grows bigger and/or the company you work for matures and even has an SLA, you can’t just deploy any untested code anymore. You’ll have to make sure that changes you made to code is tested, like any other code. Application developers write unit tests on their code and the application is tested by either automated tests or by using test|qa team. Application development is not any different than writing software for your infrastructure, it all needs to be tested before you use it on production.

What I will describe in this blogpost is just a suggestion on how to do this. This might not be foolproof or maybe there are other or better ways on how todo this or … (Fill in some something other reason). As this is something that works for me, it might help you to create your own pipeline. YMMV.

I haven’t looked at all at Ansible Tower or the open sourced version, so it might be that parts or maybe all of what I am describing here can be done by Tower.

Before we do anyting, I’ll first describe how my Ansible setup looks like so we have some background before we do anything. All of my roles has their own git repository, including documentation and Jenkinsfiles. A Jenkinsfile is the Jenkins job configuration file that contains all steps that Jenkins will execute. Its the .travis.yml (Of Travis CI) file equivalent of Jenkins and we will come back later to this. I also have 1 git repository that contains all ansible data, like host_vars, group_vars and the inventory file.

I have a Jenkins running with the Docker plugin and once a job is started, a Docker container will be started and the job will be executed from this container. Once the Job is done (Succeeded or Failed doesn’t matter which), the container and all data in this container is removed.

Jenkins Jobs

All my Ansible roles has 3 jenkinsfiles stored in the git repository for the following actions:

  1. Molecule Tests
  2. Staging deployment
  3. Production deployment

Molecule Tests

The first job is that the role is tested with Molecule. With Molecule we create 1 or more Docker containers and the role is deployed to these containers. Once that is done, we do an idempotent check and with TestInfra we verify if installation/configuration is done correctly. We can also execute some commands to verify that the deployed service is running correctly. Once these tests are completed, we can successfully deploy the ansible role without any problems. (On this page I have described some information on Molecule.)

How does the Jenkinsfile looks like:

node() {
    try {
        stage ("Get Latest Code") {
            checkout scm
            sh 'git rev-parse HEAD > .git/commit-id'
        }
        stage ("Install Application Dependencies") {
            sh 'sudo pip install --upgrade ansible==${ANSIBLE_VERSION} molecule==${MOLECULE_VERSION} docker'
        }
        stage ("Executing Molecule lint") {
            sh 'molecule lint'
        }
        stage ("Executing Molecule create") {
            sh 'molecule create'
        }
        stage ("Executing Molecule converge") {
            sh 'molecule converge'
        }
        stage ("Executing Molecule idemotence") {
            sh 'molecule idempotence'
        }
        stage ("Executing Molecule verify") {
            sh 'molecule verify'
        }
        stage('Tag git'){
            def commit_id = readFile('.git/commit-id').trim()
            withEnv(["COMMIT_ID=${commit_id}"]){
                sh '''#!/bin/bash
                if [[ $(git tag | grep ${COMMIT_ID} | wc -l) -eq 1 ]]
                    then    echo "Tag already exists"
                    else    echo "Tag will be created"
                            git config user.name "jenkins"
                            git config user.email "jenkins@localhost"
                            git tag -a $COMMIT_ID -m "Added tagging"
                            git push --tags
                fi
                '''
            }
        }
        stage('Start Staging Job') {
            def commit_id = readFile('.git/commit-id').trim()
            withEnv(["COMMIT_ID=${commit_id}"]){
                build job: 'ansible-access-2-staging', wait: false, parameters: [string(name: 'COMMIT_ID', value: "${COMMIT_ID}") ]
            }
        }
    } catch(all) {
        currentBuild.result = "FAILURE"
        throw err
    }
}

First stage of the Job is the checkout of the sourcecode of the git repository, so that we have data in the container. We get the latest git commit id, because I use this id to create a tag in git once the Molecule Tests succeeds.

First Molecule action is the lint. First we do some linting on the role and test files to make sure it is compliant. If it find some errors, it fails quickly and we can fix it. Then it proceeds with the Molecule actions create, converge, idemptence and verify. For those who are familiar with Molecule will notice that I use different stages for each action and not 1 stage which executes molecule test.

Stages overview of Jenkins job.

I use separate stages with single commands so I can quickly see on which part the job fails and focus on that immediately without going to the console output and scrolling down to see where it fails. After the Molecule verify stage, the Tag git stage is executed. This will use the latest commit id as a tag, so I know that this tag was triggered by Jenkins to run a build and was successful.

Last stage in the job is to start the 2nd job in Jenkins. This stage will start the job ansible-access-2-staging with the COMMIT_ID as parameter to the job and in the background (wait: false).

Currently, the Molecule configuration only has 1 “default” scenario. If I had more scenarios than the Jenkinsfile had probably a lot more stages or maybe more Jenkinsfiles.

Staging deployment

The first job was executed correctly and now this job is triggered. As mentioned before, the commit id of the previous job is passed into this job. The goal for this job is to deploy the role to an staging server and validate if everything is still working correctly. In this case we will execute the same tests on the staging staging as we did with Molecule, but we can also create an other test file and use that. In my case, there is only one staging server but it could also be a group of servers.

The Jenkinsfile for this job looks like this:

node() {
    try {
        stage ("Get the Code") {
            checkout scm: [$class: 'GitSCM', branches: [[name: "refs/tags/${params.COMMIT_ID}"]], extensions: [[$class: 'RelativeTargetDirectory', relativeTargetDir: 'ansible-access']], userRemoteConfigs: [[url: 'ssh://git@192.168.1.206:2222/ansible/access.git']]]
            checkout scm: [$class: 'GitSCM', branches: [[name: '*/master']], extensions: [[$class: 'RelativeTargetDirectory', relativeTargetDir: 'environment']], doGenerateSubmoduleConfigurations: false, userRemoteConfigs: [[url: 'ssh://git@192.168.1.206:2222/ansible/environment.git']]]
            sh 'pwd > workspace'
        }
        stage ("Install Application Dependencies") {
            sh 'sudo pip install --upgrade ansible==${ANSIBLE_VERSION} testinfra docker'
        }
        stage ("Execute role on host(s)") {
            dir("environment") {
                sh "ansible-playbook -i hosts -l staging playbooks/ansible-access.yml"
            }
        }
        stage ("Test Role execution") {
            workspace = readFile('workspace').trim()
            withEnv(["WORKSPACE_DIR=${workspace}", "MOLECULE_INVENTORY_FILE=${workspace}/environment/hosts"]){
                dir("environment/") {
                    sh "testinfra --connection=ansible --ansible-inventory=hosts --hosts=staging ${WORKSPACE_DIR}/ansible-access/molecule/default/tests/test_default.py --verbose"
                }
            }
        }
        stage('Tag git'){
            withEnv(["COMMIT_ID=${params.COMMIT_ID}_staging"]){
                dir("ansible-access/") {
                    sh '''#!/bin/bash
                    if [[ $(git tag | grep "${COMMIT_ID}" | wc -l) -eq 1 ]]
                        then    echo "Tag already exists"
                        else    echo "Tag will be created"
                                git config user.name "jenkins"
                                git config user.email "jenkins@localhost"
                                git tag -a $COMMIT_ID -m "Added tagging"
                                git push --tags
                    fi
                    '''
                }
            }
        }
        stage('Start Production Job') {
            build job: 'ansible-access-3-production', wait: false, parameters: [string(name: 'COMMIT_ID', value: "${params.COMMIT_ID}") ]
        }
    } catch(all) {
        currentBuild.result = "FAILURE"
        throw err
    }
}

The first stage is to checkout 2 git repositories: The Ansible Role and the 2nd is my “environment” repository that contains all Ansible data and both are stored in their own sub directory. With the Ansible role we checkout the provided tag refs/tags/${params.COMMIT_ID}. I also had to configure the url for the git repositories. Last step is to create a file that holds the output of the pwd file. We need this location in a later stage.

The 2nd Stage is to install the required applications, so not very interesting. The 3rd stage is to execute the playbook. In my “environment” repository (That holds all Ansible data) there is a playbooks directory and in that directory contains the playbooks for the roles. For deploying the ansible-access role, a playbook named ansible-access.yml is present and will be use to install the role on the host:

---
- hosts: all:!localhost
  become: True
  roles:
    - role: ansible-access

Very basic/simple. The 4th stage is to execute the Testinfra test script from the molecule directory to the staging server to verify the correct installation/configuration. In this case I used the same tests as Molecule, but I could also create a seperate file with some extra or other tests to verify the correct behaviour of the host.

And when all tests are complete, we create a new tag. In this job we create a new tag ${params.COMMIT_ID}_staging and push it so we know that the provided tag is deployed to our staging server.

With the last stage, we start the 3rd and last job, the job to deploy the role on the rest of the servers.

Production deployment

This is the job that deploys the Ansible role to the rest of the servers. This Jenkinsfile looks almost the same as the previous one, but with a few exceptions.

node() {
    try {
        stage ("Get the Code") {
            checkout scm: [$class: 'GitSCM', branches: [[name: "refs/tags/${params.COMMIT_ID}"]], extensions: [[$class: 'RelativeTargetDirectory', relativeTargetDir: 'ansible-access']], userRemoteConfigs: [[url: 'ssh://git@192.168.1.206:2222/ansible/access.git']]]
            checkout scm: [$class: 'GitSCM', branches: [[name: '*/master']], extensions: [[$class: 'RelativeTargetDirectory', relativeTargetDir: 'environment']], doGenerateSubmoduleConfigurations: false, userRemoteConfigs: [[url: 'ssh://git@192.168.1.206:2222/ansible/environment.git']]]
            sh 'pwd > workspace'
        }
        stage ("Install Application Dependencies") {
            sh 'sudo pip install --upgrade ansible==${ANSIBLE_VERSION} testinfra docker'
        }
        stage ("Execute role on host(s)") {
            dir("environment") {
                sh "ansible-playbook -i hosts -l 'all:!localhost:!staging' playbooks/ansible-access.yml"
            }
        }
        stage ("Test Role execution") {
            workspace = readFile('workspace').trim()
            withEnv(["WORKSPACE_DIR=${workspace}", "MOLECULE_INVENTORY_FILE=${workspace}/environment/hosts"]){
                dir("environment") {
                    sh "testinfra --connection=ansible --ansible-inventory=hosts --hosts='all:!localhost:!staging' ${WORKSPACE_DIR}/ansible-access/molecule/default/tests/test_default.py --verbose"
                }
            }
        }
        stage('Tag git'){
            withEnv(["COMMIT_ID=${params.COMMIT_ID}_production"]){
                dir("ansible-access") {
                    sh '''#!/bin/bash
                    if [[ $(git tag | grep "${COMMIT_ID}" | wc -l) -eq 1 ]]
                        then    echo "Tag already exists"
                        else    echo "Tag will be created"
                                git config user.name "jenkins"
                                git config user.email "jenkins@localhost"
                                git tag -a $COMMIT_ID -m "Added tagging"
                                git push --tags
                    fi
                    '''
                }
            }
        }
    } catch(all) {
        currentBuild.result = "FAILURE"
        throw err
    }
}

With the 3rd stage “Execute role on host(s)” we use an different limit. We now use all:!localhost:!staging to deploy to all hosts, but not to localhost and staging. Same is for the 4th stage, for executing the tests. As the last stage in the job, we create a tag ${params.COMMIT_ID}_production and we push it. Once we see this tag in our repository, we know that the changes is installed correctly on all servers.

Keep in mind that this can only be successful if you use proper and correct tests. You’ll really need to be sure that your tests is covering all of the components that is changed by your role. This deployment will fail or succeed with the quality of your tests.

Good luck and if you have suggestions please let me know.

Testing Ansible roles in a cluster setup with Docker and Molecule

ansible_logo_black_square

This page is updated on 2017-05-01.

This is a follow up on the previous 2 blog posts. With the first blog we discussed some steps to test your Ansible role with some basic steps. With the 2nd blog we added some features to extend our test by using CI tooling and group vars. With this post however, we might be configuring Molecule that will not occur very common with testing Ansible roles.

This time we configure Molecule for a role that is installing and configuring a cluster on Docker, like MySQL or MongoDB. We don’t go into a specific role (as there are so many), I only give some information on how to do this. We are only configuring Molecule for this setup, I’m still busy with running some specific TestInfra tests on a specific container.

Keep in mind these actions are only needed when using the {{ ansible_eth0.ipv4.address }} isn’t enough and you need a list with all ips.

For configuring Molecule, we’ll have to change 2 files:

  1. molecule.yml
  2. playbook.yml

molecule.yml

First we update the ‘molecule.yml’ file by configuring 3 (Or more, depends on what you need) docker containers. See the following example:

docker:
  containers:
  - name: node1
    ansible_groups:
      - cluster_service
    image: milcom/centos7-systemd
    image_version: latest
    privileged: True
    port_bindings:
      3306: 3306,
      4444: 4444
  - name: node2
    ansible_groups:
      - cluster_service
    image: milcom/centos7-systemd
    image_version: latest
    privileged: True
    port_bindings:
      3307: 3306,
      4445: 4444
  - name: node3
    ansible_groups:
      - cluster_service
    image: milcom/centos7-systemd
    image_version: latest
    privileged: True
    port_bindings:
      3308: 3306,
      4446: 4444

As you see, I have added the ‘port_bindings’ configuration to the instances, which is different with the examples in the previous blog posts. With the containers, we open the ports on the host (Before the ‘:’) and proxy them to the ports to the docker container (After the ‘:’ ).

In the above example, this ports configuration is used for configuring a MySQL (Or MariaDB) Galera cluster setup. You’ll have to update the ports configuration to your needs.

playbook.yml

Before we specify the roles in the ‘playbook.yml’, we add an ‘pre_tasks’. We add 2 blocks of code and will discuss them one by one. First we add the following in the ‘pre_tasks’ part:

    - name: "Get ip node 1"
      local_action: shell docker inspect --format \{\{.NetworkSettings.IPAddress\}\} node1
      register: node_ip_1
      changed_when: False
    - name: "Get ip node 2"
      local_action: shell docker inspect --format \{\{.NetworkSettings.IPAddress\}\}  node2
      register: node_ip_2
      changed_when: False
    - name: "Get ip node 3"
      local_action: shell docker inspect --format \{\{.NetworkSettings.IPAddress\}\}  node3
      register: node_ip_3
      changed_when: False

We have added 3 tasks, that do the same command but for each container. We execute a ‘docker inspect’ command to get the ip address of the docker container. As the docker inspect command used the ‘{{ .NetworkSettings.IPAddress }}’ format, Ansible will try to replace it because it thinks it is a variable. Luckily, we can make use of ‘{{ raw }} {{ endraw }}’ for this and Ansible will not use this as a variable anymore.

We register a variable, because we need to use the output, because the ‘docker inspect’ outputs the ip address. We also have to add the property ‘changed_when: False’.

What do you mean with the last one?

We have to fool Ansible with the ‘changed_when’ command for the ‘idempotence’ check. As this task will run every time, the ‘idempotence’ check will fail because of it (because it sees that there are tasks with state “Changed”).

Now we add the 2nd block of tasks to the ‘pre_tasks’, just after the first block of code we added earlier:

    - name: "Set fact"
      set_fact:
        node_ip: "{{ node_ip_1.stdout }}"
      when: inventory_hostname  == 'node1'
    - name: "Set fact"
      set_fact:
        node_ip: "{{ node_ip_2.stdout }}"
      when: inventory_hostname  == 'node2'
    - name: "Set fact"
      set_fact:
        node_ip: "{{ node_ip_3.stdout }}"
      when: inventory_hostname  == 'node3'

With this block we add 3 tasks again, 1 task for each container, to create a fact. In this case, I used to create the fact with the name ‘node_ip’, but you may name it differently. In my case when I use the ‘node_ip’ in my Ansible role, it get the actual IP of the host.

You can also create a list with all the ips if you need this in your role:

cluster_host_ips:
  - "{{ node_ip_1.stdout }}"
  - "{{ node_ip_2.stdout }}"
  - "{{ node_ip_3.stdout }}"

(You have to use the corrent name for the list of course 😉 )

I don’t know if this is the correct way, but it works in my case. I have a Jenkins job that validates a role that is configured to run a 3 node setup (Elasticsearch, MariaDB and some others).

If you have a other or a better way for doing this, please let me know!

Extending Ansible Role testing with Molecule by adding group_vars, dependencies and using travis ci

ansible_logo_black_square

This blog post will extend the actions we described on this https://werner-dijkerman.nl/2016/07/10/testing-ansible-roles-with-molecule-testinfra-and-docker/ page. We only configured a very basic configuration with 2 tests in the previously mentioned page, but thats not enough to and on this page we will continue to complete the tests. Github page for Molecule (In case you forgot 😉 )

We will discuss the following in this blogpost:

  • Make the TestInfra tests OS aware
  • Use group_vars
  • Role dependencies
  • Configure Travis

Lets dive into the tasks.

TestInfra OS Aware

Ok, not really molecule related, but very important if our role needs to work on several different operating systems. In the earlier mentioned blogpost we had configured Molecule to use a Ubuntu docker image. Before we can make the tests OS aware, we have to add some docker containers to the configuration which have different operating systems.

We create the following configuration:

docker:
  containers:
  - name: zabbix-agent-centos
    ansible_groups:
      - group1
    image: milcom/centos7-systemd
    image_version: latest
    privileged: True
  - name: zabbix-agent-debian
    ansible_groups:
      - group2
    image: maint/debian-systemd
    image_version: latest
    privileged: True
  - name: zabbix-agent-ubuntu
    ansible_groups:
      - group2
    image: rastasheep/ubuntu-sshd
    image_version: latest
    privileged: True

We have 3 docker containers configured: Ubuntu, Debian and CentOS. I’ve searched for Docker containers that have systemd configured, as the Zabbix Agent uses this. The official docker images of the mentioned OS’es doesn’t have the systemd enabled/configured.

We would like to run the ‘molecule test’ command, but it will fail on the ‘verify’ part.
We currently have this test:

def test_zabbix_package(Package, SystemInfo):
    zabbixagent = Package('zabbix-agent')
    assert zabbixagent.is_installed

    assert zabbixagent.version.startswith("3.0")

This test will validate if Zabbix is installed and the version starts with 3.0 (Thats the default version to be installed). When running this test, it will fail on the Debian and Ubuntu host. Why?They use a slightly different version naming than CentOS.

We have to update the function by adding the SystemInfo class to the function:

def test_zabbix_package(Package, SystemInfo):
    zabbixagent = Package('zabbix-agent')
    assert zabbixagent.is_installed

    if SystemInfo.distribution == 'debian':
        assert zabbixagent.version.startswith("1:3.0")
    if SystemInfo.distribution == 'centos':
        assert zabbixagent.version.startswith("3.0")

Now we have added the SystemInfo class to the function, we can use this to determine which tests we can execute on which OS. In the above example we see that if the distribution ‘debian’ is, we validate if the version starts with ‘1:3.0’. With CentOS we validate if it is ‘3.0’. When we execute the test on all 3 hosts, it will run successfully.

Using group_vars

This part also applies to using ‘host_vars’, just replace the word ‘group_vars’ with ‘host_vars’ 😉 We configure the molecule.yml file by adding some group_vars related data. We configure the 1st block in the molecule.yml like this:

ansible:
  playbook: playbook.yml
  group_vars:
    mysql:
      database_type: mysql
      database_type_long: mysql
    postgresql:
      database_type: pgsql
      database_type_long: postgresql
      postgresql_pg_hba_conf:
        - "host all all 127.0.0.1/32 trust"
        - "host all all ::1/128 trust"
      postgresql_pg_hba_local_ipv4: false
      postgresql_pg_hba_local_ipv6: false

This setup will “create” 2 group_var files: mysql and postgresql. When we run a molecule command, the group_vars will be created in the .molecule directory. Lets run a ‘molecule list’ command. For know we ignore the output, but lets see what is created in the .molecule directory:

[vagrant@localhost ansible-zabbix-server]$ find .molecule -type f | xargs ls -lrt
-rw-rw-r--. 1 501 games   3 Jul 17 20:31 .molecule/state
-rw-rw-r--. 1 501 games 215 Jul 17 20:31 .molecule/group_vars/postgresql
-rw-rw-r--. 1 501 games  51 Jul 17 20:31 .molecule/group_vars/mysql
[vagrant@localhost ansible-zabbix-server]$ cat .molecule/group_vars/mysql
---
database_type: mysql
database_type_long: mysql
[vagrant@localhost ansible-zabbix-server]$ cat .molecule/group_vars/postgresql
---
database_type: pgsql
database_type_long: postgresql
postgresql_pg_hba_conf:
- host all all 127.0.0.1/32 trust
- host all all ::1/128 trust
postgresql_pg_hba_local_ipv4: false
postgresql_pg_hba_local_ipv6: false

Within the .molecule directory a group_vars directory is created and 2 files are present (We will ignore the ‘state’ file). The output of these files are the same as how we configured the molecule.yml file.

Role dependencies

When your role has some dependencies, we really want them to download before we execute our role in Molecule. It will fail if we don’t do this. We have to download them first.

Molecule has a simple configuration for this. Within the molecule.yml file, we add the ‘requirements_file’ option in the Ansible configuration. Our example now looks like this:


dependency:
  name: galaxy
  requirements_file: requirements.yml 
  options:
    ignore-certs: True
    ignore-errors: True

ansible:
  playbook: playbook.yml
  group_vars:
    mysql:
      database_type: mysql
      database_type_long: mysql
    postgresql:
      database_type: pgsql
      database_type_long: postgresql
      postgresql_pg_hba_conf:
        - "host all all 127.0.0.1/32 trust"
        - "host all all ::1/128 trust"
      postgresql_pg_hba_local_ipv4: false
      postgresql_pg_hba_local_ipv6: false

In the root directory of the repository, a file called ‘requirements.yml’ is found which contains the dependencies.

When we run the ‘converge’ subcommand, the roles will be downloaded and after this the role is executed:

[vagrant@localhost ansible-zabbix-server]$ molecule converge
WARNING:vagrant:The Vagrant executable cannot be found. Please check if it is in the system path.
--> Installing role dependencies ...
- downloading role 'apache', owned by geerlingguy
- downloading role from https://github.com/geerlingguy/ansible-role-apache/archive/1.7.2.tar.gz
- extracting geerlingguy.apache to .molecule/roles/geerlingguy.apache
- geerlingguy.apache was installed successfully
- downloading role 'mysql', owned by geerlingguy
- downloading role from https://github.com/geerlingguy/ansible-role-mysql/archive/2.3.0.tar.gz
- extracting geerlingguy.mysql to .molecule/roles/geerlingguy.mysql
- geerlingguy.mysql was installed successfully
- downloading role 'postgresql', owned by galaxyprojectdotorg
- downloading role from https://github.com/galaxyproject/ansible-postgresql/archive/0.9.2.tar.gz
- extracting galaxyprojectdotorg.postgresql to .molecule/roles/galaxyprojectdotorg.postgresql
- galaxyprojectdotorg.postgresql was installed successfully
--> Starting Ansible Run ...

PLAY [all] *********************************************************************

TASK [setup] *******************************************************************
ok: [zabbix-server-centos-mysql]
... 
<snip>

Configure CI

We don’t want to run manually the molecule commands every time a change occurs. To quote Apple: “There is an app for that”. Well almost. I explain 2 ways of running Molecule in a automated way: Using Travis and Using Jenkins.

Travis

A commonly known cloud platform for CI, and it is very simple to make use of it. The Molecule documentation has a very nice example on how to configure the travis setup. We create in the root directory of our role, a file named: .travis.yml

sudo: required
language: python
services:
- docker

before_install:
- sudo apt-get -qq update
- sudo apt-get install -o Dpkg::Options::="--force-confold" --force-yes -y docker-engine

install:
- pip install molecule ansible docker

script:
- molecule test
notifications:
  webhooks: https://galaxy.ansible.com/api/v1/notifications/

Its a very basic travis configuration. First block is to let Travis know that we need to make use of sudo, that the language Python is and we use the ‘docker’ service.

2nd block is to update the Ubuntu’s apt cache and install Docker. When Docker is installed, we install molecule and after that, molecule test is executed. When the job is finished, we send some data via a web hook to the Ansible Galaxy page. The last step is to show the ‘passing’ (Or failing) badge on the Ansible Role page.

Jenkins

This example will show you a basic Jenkinsfile. I don’t do a single action manually in Jenkins. A  Jenkins job is basically code and we should use it as code. With Jenkins 2, we can make use of a Jenkinsfile (Also with Jenkins 1, but was named jenkins-template I believe?)

When you have a Jenkins server running, create a new Pipeline job and configure the correct git repository. (Okay, for this blogpost we do this manually, but I have a playbook for this. So this is on my part automated as well. 😉 )

A basic example:

node(){
    stage 'INFO: Checkout code'
        checkout scm
    stage 'Installing Molecule'
        sh 'sudo pip install molecule'
    stage 'Creating Containers'
        sh 'molecule create'
    stage 'Installing the Role'
        sh 'molecule converge'
    stage 'Idempotence check'
        sh 'molecule idempotence'
    stage 'Verify the application'
        sh 'molecule verify'
    stage 'Destroy the containers'
        sh 'molecule destroy'
}

When the job is executed, it will first download the Jenkinsfile from the configured git repository and the tasks will be executed one by one. This is just to get you started and the Jenkinsfile should be extended with some error handling like e-mail configuration etc.

So this was the follow up on the original blogpost for how to use Molecule with your Ansible role. The guys are really busy with Molecule so Molecule grows really fast. Maybe doing an 3rd blogpost very soon with new features.. 🙂

Testing Ansible roles with Molecule, Testinfra and Docker

ansible_logo_black_square

“On 2017-05-01 I’ve updated this post to the current situation. Some things where outdated and where removed.”

In some earlier posts I’ve described how you can use Test Kitchen for testing Ansible Roles (This one and the one extending it.). Test Kitchen was created for testing Chef Cookbooks and like Chef, Test Kitchen is a Ruby application. On this page we describe an other tool for the same purpose. This tool is what you might see as a Python clone of Test Kitchen, but more specific to Ansible: Molecule (Github)

Molecule isn’t that old, only few years and when I browse the internet it is not yet really known in the community. Unlike Test kitchen with the many different drivers, Molecule supports several backends, like Vagrant, Docker, and OpenStack. With Molecule you can make use of Serverspec (Like Test Kitchen), but you can also make use of ‘Testinfra’. Testinfra is like Serverspec a tool for writing unit tests, but it is written in Python.

Lets dive into Molecule and create some tests for Molecule. On this page, we make use of the Docker backend and if you following this page please install docker.

Installing Molecule is really simple:

pip install molecule docker

Voila, it is installed. With the installation of Molecule, Testinfra is installed to. We had to provide the docker module as well, otherwise molecule doesn’t know how to connect to the docker daemon. We can configure a Ansible Role. I used my ‘zabbix-agent’ role as test case for the Test Kitchen setup, so I will use it again for Molecule.

When you haven’t created an Ansible role yet, instead of using the ansible-galaxy command you can use the following command:

molecule init --driver docker --role role_name

This will create just like the ‘ansible-galaxy’ some default directories and files, but also gives us a starting point with a few extra files for testing this role with Molecule.

When you already have a working module and want to make use of Molecule, please execute the following command:

molecule init --driver docker

This will install several files specific for Molecule. No worries, we can recreate these files manually. Lets do that in a role and see what the files do.

File: <root>/molecule.yml

---
ansible:
  playbook: playbook.yml

driver:
  docker
docker:
  containers:
    - name: zabbix-01
      ansible_groups:
        - group1
      image: debian
      image_version: latest
      privileged: True

verifier:
  name: testinfra

This is the configuration file for Molecule. We specify which playbook Molecule will execute, in this case playbook.yml.
We specify that we want to make use of the Docker driver and that we have a docker container configuration. In this case, we only have 1 docker container specified. We use a Debian docker container with the ‘latest’ tag. We name the container ‘test-01’ and is in the group ‘group1’. And at last we configure molecule to use testinfra as the testtool.

File: <root>/playbook.yml

---
- hosts: all
  roles:
    - role: ansible-zabbix-agent

The playbook that is executed in the Docker container. This is a very basic one, we only have to specify the correct .

File: <root>/tests/inventory

localhost
[group1]
zabbix-01 ansible_connection=docker

The Ansible inventory file. Should be a known file to you 😉

File: <root>/tests/test_default.py


from testinfra.utils.ansible_runner import AnsibleRunner
testinfra_hosts = AnsibleRunner('.molecule/ansible_inventory').get_hosts('all')

def test_hosts_file(File):
    hosts = File('/etc/hosts')
    assert hosts.user == 'root'
    assert hosts.group == 'root' 

(Edit 2016-09-14: As of release 1.9, the first 2 lines should be present in the TestInfra script.)

This is the test infra python file, containing the tests. After the init command, we have 1 test that will check if there is a hosts file, and the user and group of the file belongs to user ‘root’. We discuss this file later on by adding some more tests.

File: <root>/tests/test.yml

---
- hosts: localhost
  remote_user: root
  roles:
    - zabbix-agent-role

Now we have discussed the files.

We add some infra test checks in the ‘test_default.py’ file. We add the following 2 tests:

def test_zabbix_package(Package):
    zabbixagent = Package('zabbix-agent')
    assert zabbixagent.is_installed
    assert zabbixagent.version.startswith("1:3.0")

def test_zabbixagent_running_and_enabled(Service):
    zabbixagent = Service("zabbix-agent")
    # assert zabbixagent.is_running
    assert zabbixagent.is_enabled

These are 2 Python function which are executed with Testinfra. With the first function, we validate if the package ‘zabbix-agent’ is installed. Also we check if the version starts with: 1:3.0. If you have some experience with testing Python code, this might be familiar to you. Test Infra uses ‘PyTest‘ to execute the tests and validate the.

The 2nd function we validate the ‘zabbix-agent’ service. We make sure the service is enabled. As you see, I’ve commented the check if the service is running. When it is enabled, I get this error message:

Failed to get D-Bus connection: Unknown error -1

Strange, because I’ve configured the privileged mode on the docker container. So maybe this is a bug or misconfiguration on my part, but for now I leave it commented and need to find a solution for this.

Within the molecule.yml we have to update the docker container configuration by adding the following property for all the docker containers:

    required: True

Now we added it, we have to do a “molecule destroy” and start again. The container will be recreated and we won’t get an error message about the dbus.

Now we are ready to move on (I’m well aware that these 2 tests that I added will not be enough, I’ll add these myself later on).

Molecule has several subcommands, let run molecule -h and see what is available:

No handlers could be found for logger "vagrant"
Usage:
    molecule [-hv] &amp;amp;lt;command&amp;amp;gt; [&amp;amp;lt;args&amp;amp;gt;...]

Commands:
    check         check playbook syntax
    create        create instances
    converge      create and provision instances
    idempotence   converge and check the output for changes
    test          run a full test cycle: destroy, create, converge, idempotency-check, verify and destroy instances
    verify        create, provision and test instances
    destroy       destroy instances
    status        show status of instances
    list          show available platforms, providers
    login         connects to instance via SSH
    init          creates the directory structure and files for a new Ansible role compatible with molecule

Options:
    -h --help     shows this screen
    -v --version  shows the version

We first start with the ‘check’ command:

[vagrant@localhost ansible-zabbix-agent]$ molecule check
No handlers could be found for logger "vagrant"

playbook: playbook.yml
[vagrant@localhost ansible-zabbix-agent]$ echo $?
0

Seems very well, the check commands validate if the playbook.yml doesn’t have any problems/syntax errors.
We can continue with the next command: create.

[vagrant@localhost ansible-zabbix-agent]$ molecule create
No handlers could be found for logger "vagrant"
 Building ansible compatible image ...
 Step 1 : FROM debian:latest

  ---&amp;amp;gt; 1b088884749b

 Step 2 : RUN bash -c 'if [ -x "$(command -v apt-get)" ]; then apt-get update &amp;amp;amp;&amp;amp;amp; apt-get install -y python sudo; fi'

  ---&amp;amp;gt; Using cache

  ---&amp;amp;gt; 8ef54383599a

 Step 3 : RUN bash -c 'if [ -x "$(command -v yum)" ]; then yum makecache fast &amp;amp;amp;&amp;amp;amp; yum update -y &amp;amp;amp;&amp;amp;amp; yum install -y python sudo; fi'

  ---&amp;amp;gt; Running in 6d3142fa72aa

...
 Finished building molecule_local/debian:latest
 Creating container zabbix-01 with base image debian:latest ...
 Container created.

[vagrant@localhost ansible-zabbix-agent]$

Now we have created a docker container where we can install our Ansible role on to, we do that with the ‘converge’ subcommand.

[vagrant@localhost ansible-zabbix-agent]$ molecule converge
No handlers could be found for logger "vagrant"

PLAY [all] *********************************************************************

TASK [setup] *******************************************************************
ok: [zabbix-01]

TASK [ansible-zabbix-agent : Include OS-specific variables] ********************
ok: [zabbix-01]

TASK [ansible-zabbix-agent : Install the correct repository] *******************
skipping: [zabbix-01]

...
RUNNING HANDLER [ansible-zabbix-agent : restart zabbix-agent] ******************
changed: [zabbix-01]

PLAY RECAP *********************************************************************
zabbix-01                  : ok=12  changed=7    unreachable=0    failed=0

Nice, the role is installed correctly without any issues on the container. With Test Kitchen we had to use BATS to validate if the Role is idempotent, but luckily molecule has just a simple sub command for it: idempotence

Well, it seems that the Role has passed the idempotence test:

[vagrant@localhost ansible-zabbix-agent]$ molecule idempotence
No handlers could be found for logger "vagrant"
Idempotence test in progress (can take a few minutes)...
Idempotence test passed.

[vagrant@localhost ansible-zabbix-agent]$

Testing the role is nicely going on right now, but we are not there yet. Now we need to use the ‘verify’ command to actually validate our role on the Docker container:

[vagrant@localhost ansible-zabbix-agent]$ molecule verify
No handlers could be found for logger "vagrant"
Trailing whitespace found in ./defaults/main.yml on lines: 35
Trailing newline found at the end of ./handlers/main.yml
Trailing whitespace found in ./library/zabbix_host.py on lines: 29
Trailing newline found at the end of ./library/zabbix_hostmacro.py
[vagrant@localhost ansible-zabbix-agent]$

Whoops, it seems it has found some issues. Let me fix that first, probably need to run the verify again after fixing it.

[vagrant@localhost ansible-zabbix-agent]$ molecule verify
No handlers could be found for logger "vagrant"

Executing testinfra tests found in tests/.
============================= test session starts ==============================
platform linux2 -- Python 2.7.5, pytest-2.9.2, py-1.4.31, pluggy-0.3.1
rootdir: /git/ansible/ansible-zabbix-agent/tests, inifile:
plugins: xdist-1.14, testinfra-1.4.0
collected 3 itemss

tests/test_default.py ...

=========================== 3 passed in 0.63 seconds ===========================

No serverspec tests found in spec/.

[vagrant@localhost ansible-zabbix-agent]$

After fixing it, everything seems to work fine. Nice!

Now we are done with the container, so we can execute molecule again, but with the delete sub command and the container will be deleted.

These were the basics for testing an Ansible role with Molecule, Docker and Test Infra. This page uses the ‘Debian’ Docker image, whereas I normally use CentOS for this. I have some issues (Get the same error message when I enable the Test Infra test to validate if the service is running) to make this work on CentOS. So maybe Molecule isn’t mature enough yet, but it is getting there.

I’ll update my Ansible roles so it will use Molecule instead of Test Kitchen (No hard feelings ;-))