Jenkins as code, part 2: Setting up the Jenkins job

This is the follow up of the previous blogpost about Jenkins as code, which you can find here. With the previous blogpost we discussed how we have created a Docker image, containing Jenkins and several files needed to correctly install the plugins and configure Jenkins with several yaml files. The yaml configuration files are used by the configuration-as-code plugin to configure the Jenkins environment how we want to have it configured. This means that no manual changes are needed in the UI.

Just like mentioned in the first blogpost:

Before we do anything I just want to remind you that this is just 1 way to achieve a Jenkins as code setup. It does not mean that this is the only or best way, it is just one way. Next to that, these blogposts and the code in the Github repository will help you kickstart your own setup and by no means you can just run it on a production environment and blame me if something is not working fine. During the blogposts I will tell you how I was able to do things, so you can redo it all yourself (and compare it with the code in the Github repository).

Seed All DSL Jobs

With this blogpost, we will continue the job-dsl.yaml file. This is the yaml file which will make sure that once Jenkins is running, a “Seed All DSL jobs” job” is configured and loads groovy files from a specific directory in a git repository. So lets continue with these groovy files. It basically consists of 2 parts, we set some variables and have several class and functions to generate the jobs and list views. 

The example.groovy file top part are some variables we need to configure.

@Field String jenkinsCredentialId = "SSH_GIT_KEY"
@Field String basePath = 'example'
@Field String defaultPollingScm = 'H/5 * * * *'

JobConstructor[] jobList = [
        [
                "example-repo",
                "https://bitbucket.org/wernerdijkerman/this-is-some-test.git",
                defaultPollingScm
        ]
]

The first one is easy, that is the name of our Jenkins credential that we are using for authorizing to the Git server. The basePath is the value which is used for creating a “directory” and a specific “tab” with this name. See the screenshot. Then we have the defaultPollingScm, which we will configure that each job will check once every 5 minutes for changes. Ideally there should be webhooks configured on the Git server, but this only works if Jenkins is available from the Git server (which is in my situation not the case)

The jobList is basically a list with all of the repositories that belong to this “example” group. For my Terraform repositories, I would have a file named terraform.groovy, with a basePath set to ‘terraform‘ and in the jobList variable, all repositories configured related to Terraform. And if you need to add (or remove) a (terraform) repository, you will update this jobList variable by either adding or removing the repository information. Once the change is merged into main (or master), with a max of 15 minutes the new job is added (or deleted) because the Seed DSL job has ran.

The rest of the file contains some specific functions to actual create the Jobs and listviews. You can see on this page https://jenkinsci.github.io/job-dsl-plugin/ with all the possible functions (and the parameters) you can use to make it your own.

Shared Library

A shared library allows you to use code that can be used in Jenkinsfile’s. The problem when you have lets say 50 Spring Boot java microservices, you probably have 50 Jenkinsfiles that are the same, with the exception of the name of the microservice. So that is a lot of duplicate code in the various repositories, especially if these Jenkinsfiles are really large. So the goal is to create small(er) Jenkinsfiles as we still need to have a Jenkinsfile in the repositories. Because we create smaller Jenkinsfile, we can even provide arguments that can be used for correctly generating the Jenkinsfile.

This is how our Jenkinsfile will look like, for our example service:

@Library('djwasabi') _
def run = new com.djwasabi.common.examplePipeline()
def NAME = "Wasabi"
run.pipeline('test-repo', NAME)

With the first line we specify which Shared Library we want to make use. You need to go to “Manage“, “Configure system” and then look for “Global Pipeline Libraries“. When you did a docker compose up from my repository, you would see that there is a Shared Library named “djwasabi“. This shared library is from a git repository and is using the master branch as the version (You could also specify a tag or branch or even an git commit). 

To understand the 2nd line, you will need to go to https://bitbucket.org/wernerdijkerman/jenkins-shared-library/src/master/ (or look into the already earlier provided github repository in the “library” folder), you will see a specific directory structure which is used within the groovy language. You will see an examplePipeline.groovy file, which will be loaded to create a new groovy object.

The 3rd line we set a variable named NAME and set it to “Wasabi“. With the 4th line we will execute the function inside the examplePipeline.groovy file named ‘pipeline‘. This function accepts multiple parameters and we only provide 1 called “NAME“.

Let us take a look at the examplePipeline.groovy file:

package com.djwasabi.common

import com.djwasabi.common.workers.*
import groovy.transform.Field

def pipeline(jobName, name = "world", agentNode = "worker") {
    def command = new Command(this)
    def git = new Git(this)
    def resultsGetter = new ResultsGetter(this)

    node(agentNode) {
        try {
            def lastResult = currentBuild.rawBuild.getPreviousBuild()?.result

            stage("Checkout") {
                git.checkout()
                tagged = git.isCurrentCommitAlreadyTagged()
            }
            if (!tagged || hudson.model.Result.SUCCESS != lastResult) {
                stage("Run command") {
                    command.echo("hello " + name + " via job " + jobName)
                }
            } else {
                resultsGetter.repeatPreviousBuildResult(currentBuild)
            }
        } catch (all) {
            stage('Destroy it') {
                command.echo('lets run when things go wrong.')
            }
        }
    }
}

As you see with the examplePipeline.groovy file, the function “pipeline” accepts multiple arguments. The first one “jobName” does not have a default and thus it is always required, but the other 2 has default (resp. “world” and “worker“). Have you noticed that the “agentNode” parameter contains the value “worker“, which is also the value for our configuration we used for starting the Docker container in the previous blog post? So if you write your own shared library, you can make 1 or more of these node definitions and based on parameters that are set in the Jenkinsfile, certain actions can be executed.

When we run the job in Jenkins, you will see something like the above showing up. Almost at the end of the log, you will see the following line:
hello Wasabi via job test-repo

That is basically logged via this piece of code that was generated in the examplePipeline.groovy file:

                stage("Run command") {
                    command.echo("hello " + name + " via job " + jobName)
                }

Summary

In these 2 parts I helped you to explain what the possibilities are for an Jenkins as code approach. This approach will have everything set into code and when someone wants to make a change in Jenkins, the person should be creating a change into the code. When you have properly setup your Git server, this person will create a branch, make the changes and will create a Pull Request before it is merged into master|main. With everything set into code, people have an audit log of what has been changed (and by whom), so it will force you into having a discussion when things are needed (or when something went wrong).

And what I personally also like, because everything is set into code you can deploy a Jenkins server somewhere else and you can almost immediately use it. Especially when you make the Configuration as Code yaml files configurable with environment variables, you can run multiple Jenkins setups from the same code base. And because everything is in code, you don’t have to worry about backing up Jenkins (It is already on the Git server and part of their backup mechanism).

I hope you enjoyed it and that it showed you that you can too automate the configuration of any Jenkins environment you want to run in your organisation. If you have any improvements and/or remarks, please let me know.

May the force (or source) be with you!

Jenkins as code, part 1: Setting up Jenkins in Docker

I hate doing things manually, I really do.

Log into an UI, do some clicks here and there to be able to have something created or configured. It is error prone (you can easily forget something or make a typo) and it is stupid and/or boring (Especially if you need to do this on a routine basis). If you can change something in a UI, then someone is able to change that as well and even do that without you knowing that it is changed (Or viceversa ;-)). So doing things manually is not the way forward and we should focus on automation. Automation is one of the pillars of doing DevOps, so we should always automate things right?

What people do probably not know, Jenkins is a tool that can be fully automated, you only have to know how. (And based on some posts on for example Reddit, I don’t think people knows that this is even possible).

Jenkins as code

So lets dive into, what I would say as: Jenkins as code.

This will basically be a 2 part blog post where we will discuss the following:

  1. This part where we will create a Docker image, containing Jenkins with its configuration files and plugins;
  2. The next part, where we will create a shared library and use that in a Jenkinsfile, with jobs we load via a “specifications” repository;

Before we do anything I just want to remind you that this is just 1 way to achieve a Jenkins as code setup. It does not mean that this is the only or the best way, it is just one way. Just like that there are 1000 ways to go to Rome. Next to that, these blog posts and the code in the Github repository will help you kickstart your own setup and by no means you can just run it on a production environment and blame me if something is not working fine. I am not a groovy expert and I can do some basic things, so don’t expect a new world wonder. During the blog posts I will tell you how I was able to do things, so you can redo it all yourself (and compare it with the code in the Github repository) and build it on your own terms/setup.

In both blog posts, you will see a lot of code and commands popping up. But no worries, all code is available on my Github in repository https://github.com/dj-wasabi/blog-jenkins-as-code . So lets start with the first part: Setting up Jenkins.

Docker(file)

We will create a Docker image, based on “jenkins/jenkins:lts” Docker image, install Docker and configure it with the configuration-as-code plugin included with several yaml files that are used for configuring Jenkins. And as part of the Github repository, we have a docker-compose.yaml file which we can use to boot our setup.

Lets start with the Dockerfile.

USER root
RUN groupadd docker && \
    curl -fsSL https://get.docker.com -o get-docker.sh && \
    sh get-docker.sh && \
    usermod -aG root jenkins && \
    usermod -aG docker jenkins
USER jenkins

Lets discuss this first, the “jenkins/jenkins:lts” Docker image does not contain the docker application, so we need to install that and make sure that the “jenkins” user is part of the “root” and “docker” group. We need Docker in this image, as each Jenkins job will run in its own Docker container.

ENV CASC_JENKINS_CONFIG=/var/jenkins_home/casc

We need set an environment variable named CASC_JENKINS_CONFIG, which we basically tell Jenkins where the configuration-as-code yaml files can be found.

COPY casc/ /var/jenkins_home/casc
COPY plugins.txt /usr/share/jenkins/plugins.txt
RUN /usr/local/bin/install-plugins.sh < /usr/share/jenkins/plugins.txt

Then we will copy the contents on the casc directory to the earlier mentioned directory and we place the file with all of our plugins into a specific directory. Then we run the install-plugins.sh script so we can download and install all the plugins that we need in our setup.

And that is our Dockerfile, easy right? This will allow us to build a Jenkins Docker image, with all of our files and configuration that will lead to an Jenkins environment we want to have. We can deploy this on some host running Docker or even make some additional changes to make it work in Kubernetes.

Plugins

Lets go to the plugins.txt file, as this one is a bit easier to explain than the casc files.

We need to create a plugins.txt that contains all of the plugins we want to make use, so how do we do that. I manually (oh yes, sorry! :)) started a Jenkins container and followed the installation steps and then I picked several plugins to install during the installation steps. When Jenkins is running and I have finnished nstalling all plugins (don’t forget the “configuration-as-code” plugin), I went to “manage” and then clicked on “Script Console“. There you see an textfield to execute groovy scripts and I used the following script:

def plugins = jenkins.model.Jenkins.instance.getPluginManager().getPlugins()
plugins.each {println "${it.getShortName()}:latest"}

This is a “script” that provides an overview of all plugins that are currenlty installed in Jenkins. I have used the “latest” version of the plugin which is fine for demo purposes, but you could also update the “latest” to ${it.getVersion()}. This will show the actual version of the installed plugin. I would suggest to use the versions in the plugins.txt file. This helps you in the future when someone create’s an PR that it shows you that there is an update in the version of a plugin, which you won’t see when it is using “latest”.

Then you hit the “Run” button and you will see some output appear. Select the output and place that in the plugins.txt file and you are done (I would also sort the contents of the file, so all plugins are order alphabetical).

Configuration as code

Let us first explain how we can get a yaml file. Go to “Manage” and then you will see “Configuration as Code” (And click on it please) in the “System Configuration” lane. There is a button called “Download Configuration” which will download the yaml file and with “View Configuration” you can see the yaml file in your browser. When you have downloaded the yaml configuration file, you can make use of it in your Docker image by placing it in the casc/ directory. I would suggest you split it into seperate files, so you won’t have 1 large file but smaller ones with each a specific set of configuration. For example, create a credentials.yaml file containing all Jenkins credentials. 

But before you commit all your changes, you can also update some values by using environment variables, see the following:

  securityRealm:
    local:
      allowsSignup: false
      enableCaptcha: false
      users:
      - id: "${JENKINS_ADMIN_USERNAME:-admin}"
        name: "${JENKINS_ADMIN_NAME:-Administrator}"
        password: "${JENKINS_ADMIN_PASSWORD}"

This piece of the configuration you see now is responsible for creating an admin user. I don’t want to hardcode the username and definitely not the password in this file, so I use environment variables for that. And this is also the case for the credentials that Jenkins use, see the following example of the Jenkins “credentials”:

credentials:
  system:
    domainCredentials:
      - credentials:
          - basicSSHUserPrivateKey:
              scope: GLOBAL
              id: "SSH_GIT_KEY"
              username: "git"
              description: "SSH Credentials for jenkins"
              privateKeySource:
                directEntry:
                  privateKey: ${JENKINS_SSH_GIT_KEY}

With the above credential configuration I won’t have to hardcode the SSH Private key in the Docker image, but can use it as an environment variable. Nice right? 🙂

When everything is done via code, we can also already configure the Security Matrix and allowing what user can and most importantly can’t do in Jenkins. As my Jenkins is running on premise and don’t allow traffic from outside the environment, I will allow people to start jobs (if they don’t want to wait on the triggering). So I will allow anonymous people to have read, build and cancel rights for the jobs. Why go for all that trouble for letting people authenticate against some source, so we can see that this person has started or cancelled a job? Most importantly, they can’t change anything unless they know the admin password. (But will be undone when Jenkins is restarted! :))

  authorizationStrategy:
    globalMatrix:
      permissions:
      - "Job/Build:anonymous"
      - "Job/Cancel:anonymous"
      - "Job/Read:anonymous"
      - "Overall/Administer:admin"
      - "Overall/Read:anonymous"

Now we are able to fully do Jenkins as code, as we will store the yaml files in the casc/ directory which are loaded when Jenkins is started. But when Jenkins is running, we also need to make sure that we will load the jobs from somewhere. We will do this with a “Seed Job“, which you can see in the “dsl-jobs.yaml” file in the casc/ directory in the Github repository.

            git {
                remote { 
                    url "${JENKINS_JOB_DSL_URL}"
                    credentials 'SSH_GIT_KEY' 
                }
                branch '*/main'
              }
        }
        triggers {
            scm('H/15 * * * *')
        }
        steps {
          dsl {
            external('${JENKINS_JOB_DSL_PATH:-jobs}/*.groovy')
            removeAction('DELETE')
          }
        }
      }

When Jenkins is started, we will automatically create the “Seed all DSL jobs” Jenkins job. And what it does is basically the following (Snippet is incomplete, see for the compete file on Github for full version):

  1. We use the credential ‘SSH_GIT_KEY‘ to checkout the repository mentioned in ${JENKINS_JOB_DSL_URL} (See the docker-compose.yaml file)
  2. We use the ‘main’ branch;
  3. The job is executed every 15 minutes;
  4. In the directory named ${JENKINS_JOB_DSL_PATH} we will find groovy files and if Jenkins has jobs which aren’t configured in these groovy files, we delete the jobs from Jenkins.

Before we finalise the Configuration as code part, we need to discuss one last file (and action). When we have the Jenkins server running, we will run each job in its own Docker container. So the Jenkins server will start a Docker container and do all of its action inside that container and the configuration is what follows:

jenkins:
  clouds:
    - docker:
        name: "docker"
        dockerApi:
          dockerHost:
            uri: "${DOCKER_HOST:-unix:///var/run/docker.sock}"
        templates:
        - connector:
            attach:
              user: "jenkins"
          dockerTemplateBase:
            bindAllPorts: true
            image: "jenkins/agent:latest"
            privileged: true
            environment:
              - "TZ=Europe/Amsterdam"
          instanceCapStr: "99"
          labelString: "worker"
          name: "worker"
          remoteFs: "/home/jenkins/agent"

This is also seen in the file “docker.yaml“. Here we have placed 1 template which we named “worker“, with the “jenkins/agent:latest” Docker image. As you know, this is just an example so you can modify this to your needs and use a Docker image that suits your needs. This Docker image should contain all the tools needed to run your jobs, so the “jenkins/agent:latest” might not be fit for your setup. And do know, as the “templates” key is a list, you can add a lot more templates with a unique name, settings and Docker image. For the dockerHost.uri, you will see the usage of a environment variable “DOCKER_HOST“. This is an variable we use in docker-compose.yaml file and if we don’t provide one, the default unix:///var/run/docker.sock is used.

You can go to “manage“, “Systems configuration” and scroll all the way down until you will see “Cloud“. It provides a link and when clicking on it, you’ll get the page where you can configure the “Cloud” configuration. When you make changes, don’t forget to export the yaml file on the “Configuration as Code” page mentioned earlier.

Build and ship it

So far we have discussed some basics on how we get our configuration, so lets build a Docker image. During the rest of this blog post, I will assume you will have the same layout as my Github repository. So lets go to the directory where we have the “Dockerfile“, “plugins.txt“, “docker-compose.yaml” file and the “casc/” directory. Here we will run the docker build command, to build the new Docker image.

cd server
docker build -t jenkins-as-code . --pull

I named it ‘jenkins-as-code‘ which works locally fine and if you want to push it into a Docker registry, you should prefix it with the correct registry name. If you prefix it with a registry or you named it differently, don’t forget to update the docker-compose.yml file with your new name. The –pull is there so if you already have a “jenkins/jenkins:lts” Docker image, you will get the latest one.

I think it is build now, otherwise we will wait a minute before we continue.

sleep 60 🙂

Ok, the Docker image is build and we can start it. If you see the docker-compose.yaml file, you will notice 2 ‘services’.

  1. socat;
  2. jenkins (The one will just build and want to start).

But lets describe the ‘socat‘ service. The ‘socat‘ service is used to make sure that our docker.sock file from our host can be used with Jenkins for starting the agents. If we do this from the Jenkins container itself and not using this ‘socat‘ service, we will get permission denied errors and Jenkins can not start any new Docker container (I am doing on a Mac, I don’t think people running it on Linux hosts will have issues ). 

The Jenkins service has several environment variables set. Before we start everything, we will need to create an environment variable first that contains the content of a private SSH key. I have used the following command for that:

export EXPORTED_PASSWORD=$(cat ~/.ssh/wd_id_rsa)

So this EXPORTED_PASSWORD contains the private SSH key and this one will be used in Jenkins as the SSH_GIT_KEY credential on multiple places. Also worth to mention is the JENKINS_ADMIN_PASSWORD environment variable, this is what is says: The password for the Admin user, so if you want to use something else here is the moment to change it.

We will start it with the following command:

docker compose up -d

I prefer starting it in the background, so that is why I added the -d argument. Once it is booted we open our favourite browser and go to http://localhost:8080 you will see something like the following:

So that is it for now. We started our newly build Docker image containing Jenkins, with the plugins we need in our environment and our own configuration!

We will go into the “Seed All DSL jobs” job with the next part of the blogpost. So stay tuned! 🙂

2nd blog post you can find here.

Using Molecule V2 to test Ansible Roles

Its been a few weeks now since Molecule V2 was released. So lets go into some details with Molecule V2 and lets upgrade my dj-wasabi.zabbix-agent role to Molecule V2 during this blogpost.

For those who are unfamiliar with Molecule: Molecule allows you to development and test Ansible Roles. With Molecule, 1 or more Docker containers are created and the Ansible role is executed on these Docker containers (You can also configure Vagrant and some other providers). You can then verify if the role is installed/configured correctly in the container. Is the package installed by Ansible, is the service running, is the configuration file correctly placed with the correct information etc etc.

This would allow you to increase reliability and stability of your Role. For almost all of my publicly available Ansible Roles, I have tests configured. If someone makes an Pull Request on Github with a change, these tests will help me to see if the change won’t break anything and thus the Pull Request can easily been merged. If not, some more attention to the change is needed.

This might be obvious, but if you do not have Molecule installed or already have something installed lets update it to the latest version (As moment of writing 2.0.3):

pip install —upgrade molecule

When we execute the –version:

$ molecule --version
molecule, version 2.0.3

We see the version: 2.0.3

Porting

The people behind Molecule have created a page for porting a role that is already configured with Molecule V1 to port it to Molecule V2. As the page mentions about a python script and doing it manually, we use the manually option for migrating the role to Molecule V2.

I have created a git branch (port_molecule_v2) on my mac and will execute the first command that is described on the porting guide:

(environment) wdijkerman@Werners-MacBook-Pro [ ~/git/ansible/ansible-zabbix-agent -- Tue Sep 05 13:16:00 ]
(port_molecule_v2) $ molecule init scenario -r ansible-zabbix-agent -s default -d docker
--> Initializing new scenario default...
Initialized scenario in /Users/wdijkerman/git/ansible/zabbix-agent/molecule/default successfully.
(environment) wdijkerman@Werners-MacBook-Pro [ ~/git/ansible/ansible-zabbix-agent -- Tue Sep 05 13:16:02 ]
(port_molecule_v2) $

This command will create a “default” scenario. The biggest improvement of using Molecule V2 is using scenarios. You can use 1 “default” scenario or you might want to use 5 scenario’s. Its completely up to you on how you want to test your role.

The init command has created a new directory called “molecule”. This directory will contain all scenario’s:

(environment) wdijkerman@Werners-MacBook-Pro [ ~/git/ansible/ansible-zabbix-agent -- Tue Sep 05 13:22:22 ] (port_molecule_v2) $ tree molecule
molecule
└── default
    ├── Dockerfile.j2
    ├── INSTALL.rst
    ├── create.yml
    ├── destroy.yml
    ├── molecule.yml
    ├── playbook.yml
    └── tests
        └── test_default.py

2 directories, 7 files

Here you see the “default” scenario we just created earlier. This scenario contains several files. We will discuss some files later on this post.

Back to the porting guide. The 2nd option on the porting guide is to move the current testinfra tests to the file molecule/default/tests/test_default.py. So lets move all the tests (And I mean only the tests and not the other testinfra specific code) from one file to the other. Keep the contents of the new test_default.yml in place, as this is needed for Molecule.

The 3rd option on the porting guide is for ServerSpec, as we don’t use this we will skip this and continue with the 4th option. The 4th option on the porting guide is to port the old molecule.yml file to the new one. Now its get interesting.

The current default molecule.yml file in the scenario/default directory:

---
dependency:
  name: galaxy
driver:
  name: docker
lint:
  name: yamllint
platforms:
  - name: instance
    image: centos:7
provisioner:
  name: ansible
  lint:
    name: ansible-lint
scenario:
  name: default
verifier:
  name: testinfra
  lint:
    name: flake8

It will end like this:

---
dependency:
  name: galaxy
driver:
  name: docker
lint:
  name: yamllint

platforms:
  - name: zabbix-agent-centos
    image: milcom/centos7-systemd:latest
    groups:
      - group1
    privileged: True
  - name: zabbix-agent-debian
    image: maint/debian-systemd:latest
    groups:
      - group1
    privileged: True
  - name: zabbix-agent-ubuntu
    image: solita/ubuntu-systemd:latest
    groups:
      - group1
    privileged: True
  - name: zabbix-agent-mint
    image: vcatechnology/linux-mint
    groups:
      - group1
    privileged: True
provisioner:
  name: ansible
  lint:
    name: ansible-lint
scenario:
  name: default
verifier:
  name: testinfra
  lint:
    name: flake8

platforms

The platforms is a generic configuration approach to configure the instances in Molecule V2. With Molecule V1, you’ll had a docker configuration, a vagrant configuration etc etc for configuring the instances, but with V2 you only have platforms.

In the above example I have configured 4 instances, named zabbix-agent-centos, zabbix-agent-debian, zabbix-agent-ubuntu and zabbix-agent-mint. The all have an image configured and I have placed them in the group1 group. I don’t do anything with the groups with this Role, but lets add them anyways. I also added the “privileged: True”, because the role does use systemd and needs a privileged container to execute successfully. Later in this blog post we do something with dependencies and some Ansible configuration, so don’t run away just yet. 😉

The 5th option in the porting guide is to port the existing playbook.yml to the new playbook.yml in the default directory. So I’ll move the contents from one file to an other file.

As 6th and last option in the porting guide is to cleanup the old stuff. So remove the old files and directories and we can continue with the molecule test command.

Lets execute it.

(port_molecule_v2) $ molecule test
--> Test matrix
    
└── default
    ├── destroy
    ├── dependency
    ├── syntax
    ├── create
    ├── converge
    ├── idempotence
    ├── lint
    ├── side_effect
    ├── verify
    └── destroy
--> Scenario: 'default'
--> Action: 'destroy'
    
    PLAY [Destroy] *****************************************************************
    
    TASK [Destroy molecule instance(s)] ********************************************
    changed: [localhost] => (item=(censored due to no_log))
    
    PLAY RECAP *********************************************************************
    localhost                  : ok=1    changed=1    unreachable=0    failed=0
    
    
--> Scenario: 'default'
--> Action: 'dependency'
Skipping, missing the requirements file.
--> Scenario: 'default'
--> Action: 'syntax'
    
    playbook: /Users/wdijkerman/git/ansible/zabbix-agent/molecule/default/playbook.yml
    
--> Scenario: 'default'
--> Action: 'create'

There is a lot of output now which I won’t add now, but just take a look at the beginning which I pasted above this line. During the output it shows which scenario is executing and which task. You can see that the line begins with “–> Scenario: ” and with “–> Action: “.

This is why Molecule V2 is awesome:

Molecule V2 uses Ansible itself to create the instances on which we want to install/test our Ansible role. You can see that by opening the create.yml file in the default directory. If we just place the last task in this blogpost

- name: Create molecule instance(s)
  docker_container:
    name: "{{ item.name }}"
    hostname: "{{ item.name }}"
    image: "molecule_local/{{ item.image }}"
    state: started
    recreate: False
    log_driver: syslog
    command: "{{ item.command | default('sleep infinity') }}"
    privileged: "{{ item.privileged | default(omit) }}"
    volumes: "{{ item.volumes | default(omit) }}"
    capabilities: "{{ item.capabilities | default(omit) }}"
  with_items: "{{ molecule_yml.platforms }}"

This last task in the create.yml file will create the actual Docker instance which we have configured in the molecule.yml file in the “platforms” section, which you can see at the “with_items” option. This is very cool, this means that you can configure the docker container with all the settings that Ansible allows you to use and Molecule will not limit this for you.

You can easily add for example the “oom_killer” option to the create.yml playbook and add it to the platform configuration in molecule.yml, without adding an feature request at Molecule and waiting when the feature is implemented.  Not that the waiting was long, the people behind Molecule are very fast fixing issues and adding features, so kuddo’s to them!

As you have guessed already, the create.yml file is for creating the instanced and destroy.yml will destroy those instances. You can override this if you don’t like the names.

This is an example if you really want to use other names for the playbooks (Or if you want to share playbooks when you have multiple scenario’s):

provisioner:
  name: ansible
  options:
    vvv: True
  playbooks:
    create: ../playbook/create-instances.yml
    converge: playbook.yml
    destroy:../playbook/destroy-instances.yml

Back to the molecule test command. The molecule test command fails on my first run during the lint action. (I will not show all output, as the list is very long!)

--> Scenario: 'default'
--> Action: 'lint'
--> Executing Yamllint on files found in /Users/wdijkerman/git/ansible/zabbix-agent/...
    /Users/wdijkerman/git/ansible/zabbix-agent/defaults/main.yml
      7:37      warning  too few spaces before comment  (comments)
      10:17     warning  truthy value is not quoted  (truthy)
      15:81     error    line too long (120 > 80 characters)  (line-length)
      21:81     error    line too long (106 > 80 characters)  (line-length)
      30:30     warning  truthy value is not quoted  (truthy)
      31:26     warning  truthy value is not quoted  (truthy)
    
    /Users/wdijkerman/git/ansible/zabbix-agent/handlers/main.yml
      8:11      warning  truthy value is not quoted  (truthy)
      8:14      error    no new line character at the end of file  (new-line-at-end-of-file)

Some of these messages is something I can work with, some I actually do not care. The output shows you every “failing” rule with the file. So the first file, defaults/main.yml has 6 failing rules. Per rule it shows you the following:

  • which line and character position
  • Type of error (warning or error)
  • The message

In my output of the lint actions, I see a lot of “line too long” messages. Personally I find the 80 characters limit a little bit to small these days, so lets update it to something higher. We first have to update the molecule.yml file and we have to update the lint section. First the lint section looked like this:

lint:
  name: yamllint

Now configure it like so it looks like this:

lint:
  name: yamllint
  options:
    config-file: molecule/default/yaml-lint.yml

We specify the yamllint by configuring a configuration file. Lets create the file yaml-lint.yml in the default directory and add something like this:

---

extends: default

rules:
  line-length:
    max: 120
    level: warning

We extend the current yaml-lint configuration by adding some of our own rules to overwrite the defaults. In this case, we overwrite the “line-length” rule to set the max to 120 characters and we set the level to warning (It was error). Every rule that results in an error will fail the lint action and in this case I don’t want to fail the tests because the line length was 122 characters.

When we run it again (I have fixed some other linting issues now, so output is a little different)

--> Scenario: 'default'
--> Action: 'lint'
--> Executing Yamllint on files found in /Users/wdijkerman/git/ansible/zabbix-agent/...
    /Users/wdijkerman/git/ansible/zabbix-agent/defaults/main.yml
      15:81     warning  line too long (120 > 80 characters)  (line-length)
      21:81     warning  line too long (106 > 80 characters)  (line-length)
    
    /Users/wdijkerman/git/ansible/zabbix-agent/molecule/default/create.yml
      9:81      warning  line too long (87 > 80 characters)  (line-length)
      10:81     warning  line too long (85 > 80 characters)  (line-length)
      16:81     warning  line too long (116 > 80 characters)  (line-length)
      30:81     warning  line too long (92 > 80 characters)  (line-length)
      33:81     warning  line too long (124 > 80 characters)  (line-length)

It keeps showing the “line too long”,  but as a warning and the lint action continues working. After this, the verify works too and the test is done!

Well, now I can commit my changes and push them to GitHub and lets Travis verify that it works. (Will not discuss that here).

group_vars

The zabbix-agent role doesn’t have any group_vars configured, but some of my other roles have group_vars configured in Molecule. Lets give a basic example of configuring the pizza property in the group_vars.

We have to update the provisioner section in molecule.yml:

provisioner:
  name: ansible
  lint:
    name: ansible-lint
  inventory:
    group_vars:
      group1:
        pizza: "Yes Please"

Here we “add” a property named pizza for all hosts that are in the group “group1”. If we had configured this with earlier on with the zabbix-agent role, all of the configured instances had access to the pizza property.

What if we have multiple scenarios and all use the same group_vars? We can create in the git_root directory of the role a directory named inventory and this has 1 or 2 subdirectories: group_vars and host_vars (if needed). To make the pizza property work, we create a file inventory/group_vars/group1 and add

---
pizza: "Yes Please"

Then we update the provisioner section in molecule.yml:

provisioner:
  name: ansible
  inventory:
    links:
      group_vars: ../../../inventory/group_vars/
      host_vars: ../../../inventory/host_vars/

Is this awesome or not?

Dependencies

This is almost the same as with Molecule V1, but with Molecule V2 the file should be present in the specific scenario directory (In my case molecule/default/) and should have the name requirement.yml.

The file requirement.yml is still in the same format as how it was (As this is specific to Ansible and not Molecule ;-))

---
- src: geerlingguy.apache
- src: geerlingguy.mysql
- src: geerlingguy.postgresql

If you want to add some options, you can do that by changing the dependency section of molecule.yml:

dependency:
  name: galaxy
  options:
    ignore-certs: True
    ignore-errors: True

With Molecule V1, there was a possibility to point to a requirements file, with Molecule V2 not.

ansible.cfg

This file is not needed anymore, we can all do this with the provisioner section in molecule.yml. So we don’t have to store the ansible.cfg and point it to the molecule.yml file like how it was with Molecule V1.

Lets say we have an ansible.cfg with the following contents:

[defaults]
library = Library

[ssh_connection]
scp_if_ssh = True

We can easily do this by updating the provisioner section to this:

provisioner:
  name: ansible
  config_options:
    defaults:
      library: Library
    ssh_connection:
      scp_if_ssh: True

TL;DR

Just upgrade to Molecule V2 and have fun! This is just awesome.

@Molecule coders: Thank you for this awesome version!

Setting up a secure Vault with a Consul backend

vault_logo

With this blogpost we continue working with a secure Consul environment: We are configuring a secure Vault setup with Consul as backend. YMMV, but this is what I needed to configure to make it work.

Environment

We should have an working Consul Cluster environment. If you don’t have one, please take a look at here for creating one. With this blogpost we expect a secure Consul cluster with SSL certificates and using ACL’s.

In this blogpost we make use of the wdijkerman/vault container. This container is created by myself and is running Vault (At moment of writing release 0.6.4) on Alpine (running on 3.5). Vault is running as user ‘vault’ and the container can be configured to use SSL certificates.

prerequisites

We have to create SSL certificates for the vault service. In this blogpost we use the domain ‘dj-wasabi.local’, as Consul is already running with this domain configuration so we have to create ssl certificates for the FQDN: ‘vault.service.dj-wasabi.local’.

On my host where my OpenSSL CA configuration is stored, I execute the following commands:

openssl genrsa -out private/vault.service.dj-wasabi.local.key 4096

Generate the key.

openssl req -new -extensions usr_cert -sha256 -subj "/C=NL/ST=Utrecht/L=Nieuwegin/O=dj-wasabi/CN=vault.service.dj-wasabi.local" -key private/vault.service.dj-wasabi.local.key -out csr/vault.service.dj-wasabi.local.csr

Create a signing request file and then sign it with the CA.

openssl ca -batch -config /etc/pki/tls/openssl.cnf -notext -in csr/vault.service.dj-wasabi.local.csr -out certs/vault.service.dj-wasabi.local.crt

We copy the ‘vault.service.dj-wasabi.local.key’, ‘vault.service.dj-wasabi.local.crt’ and the caroot certificate file to the hosts which will be running the Vault container into the directory /data/vault/ssl. Hashicorp advises to run vault on hosts where Consul Agents are running, not Consul Servers. This has probably todo with that for most use cases they see is that Consul is part of large networks and thus the servers will handle a lot of request (High load). As the Consul Servers will be very busy, it would then be wise to not run anything else on those servers.

But this is my own versy small environment (With 10 machines) so I will run Vault on the hosts running the Consul Server.

ACL

Before we do anything on these hosts, we create a ACL in Consul. We have to make sure that Vault can create keys in the key/value store and we have to allow that Vault may create a service in Consul named vault.

So our (Client) ACL will look like this:

key "vault/" {
  policy = "write"
}
service "vault" {
  policy = "write"
}

We use this in the ui on the Consul Server and create the ACL. In my case, the ACL is created with id ’94c507b4-6be8-9132-ea15-3fc5b196ea29′. This ID is needed later on when we configure Vault. Also check your ACL for the ‘Anonymous token’. Please make sure you have set the following rule if the Consul default policy is set to deny:

service "vault" {
  policy = "read"
}

With this, we make sure the service is resolvable via dns. In my case this is for ‘vault.service.dj-wasabi.local’.

Configuration

We have to configure the vault docker container. We have to create a directory that will be mounted in the container. First we have to create an user on the host and then we create the directory: /data/vault/config and own it to the just created user.

useradd -u 994 vault
mkdir /data/vault/config
chown vault:vault /data/vault/config

The container is using a user named vault and has UID 994 and we have to make sure that everything is in sync with names and id. Now we create a config.hcl file in the earlier mentioned directory:

backend "consul" {
  address = "vserver-202.dc1.dj-wasabi.local:8500"
  check_timeout = "5s"
  path = "vault/"
  token = "94c507b4-6be8-9132-ea15-3fc5b196ea29"
  scheme = "https"
  tls_skip_verify = 0
  tls_key_file = "/vault/ssl/vault.service.dj-wasabi.local.key"
  tls_cert_file = "/vault/ssl/vault.service.dj-wasabi.local.crt”
  tls_ca_file = "/vault/ssl/dj-wasabi.local.pem"
}

listener "tcp" {
  address = "0.0.0.0:8200"
  tls_disable = 0
  tls_key_file = "/vault/ssl/vault.service.dj-wasabi.local.key"
  tls_cert_file = "/vault/ssl/vault.service.dj-wasabi.local.crt"
  cluster_address = "0.0.0.0:8201"
}

disable_mlock = false

First we configure a backend for Vault. As we use Consul, we use the Consul backend. Because the Consul is running on https and is using certificates, we have to use the fqdn of the Consul node as the address (same as how we did in configuring Registratror in this post). We also have to configure the options ‘tls_key_file’, ‘tls_cert_file’ and ‘tls_ca_file’, these are the ssl certificates needed for accessing the secure Consul via SSL. Because of this, we have to set the ‘scheme’ to ‘https’ and we have to specify the token for the ACL we created earlier and add the value to the the token option.

Next we configure the listener for Vault. We configure the listener that it listens on all ips on port 8200. We also make sure we configure the earlier created SSL certificates by using them in the ‘tls_key_file’ and ‘tls_cert_file’ options.

The last option is to make sure that Vault can not swap data to the local disk.

Starting Vault

Now we are ready to start the docker container. We use the following command for this:

docker run -d -h vserver-202 --name vault \
--dns=172.17.0.2 --dns-search=service.dj-wasabi.local \
--cap-add IPC_LOCK -p 8200:8200 -p 8201:8201 \
-v /data/vault/ssl:/vault/ssl:ro \
-e VAULT_ADDR=https://vault.service.dj-wasabi.local:8200 \
-e VAULT_CLUSTER_ADDR=https://192.168.1.202:8200 \
-e VAULT_REDIRECT_ADDR=https://192.168.1.202:8200 \
-e VAULT_ADVERTISE_ADDR=https://192.168.1.202:8200 \
-e VAULT_CACERT=/vault/ssl/dj-wasabi.local.pem \
wdijkerman/vault

We have the SSL certificates stored in the /data/vault/ssl and we mount these as read only on /vault/ssl. With the VAULT_ADDR we specifiy on which url the vault service is available on, this is the url which Consul provides like any other server. With the VAULT_CACERT we specify on which location the CA Certificate file of our domain. The other 3 environment variables are needed for a High Available Vault environment and is to make sure how other vault instances can contact it.

When Vault is started, we will see something like this with the docker logs vault command:

==> Vault server configuration:

Backend: consul (HA available)
Cgo: disabled
Cluster Address: https://192.168.1.202:8200
Listener 1: tcp (addr: "0.0.0.0:8200", cluster address: "0.0.0.0:8201", tls: "enabled")
Log Level: info
Mlock: supported: true, enabled: true
Redirect Address: https://192.168.1.202:8200
Version: Vault v0.6.4
Version Sha: f4adc7fa960ed8e828f94bc6785bcdbae8d1b263

==> Vault server started! Log data will stream in below:

But where are not done yet. When Vault is started, it is in a sealed state and because this is the first vault in the cluster we have to initialise it to. Also when you check the ui of Consul, you’ll see that the vault is in an error state. Why? When Vault starts, it automatically creates a service in Consul and add health checks. These health checks will check if a vault instance is sealed or not.

Initialise

As vault is running in the container, we open a terminal to the container:

docker exec -it vault bash

Now we have a bash shell running and we going to initialise vault. First we have to make sure we set the ‘VAULT_ADDR’ to this container, by executing the following command:

export VAULT_ADDR='https://127.0.0.1:8200'

Every time we want to do something with the vault instance, we have to set the ‘VAULT_ADDR’ to localhost. If we won’t do that, we will send the commands directly against the cluster.

As this is the first vault instance in the environment, we have to initialise it and we do that by executing the following command:

vault init -tls-skip-verify
Unseal Key 1: hemsIyJD+KQSWtKp0fQ0r109fOv8TUBnugGUKVl5zjAB
Unseal Key 2: lIiIaKI1F6pJ11Jw/g1CiLyZurpfhCM9AYIylrG/SKUC
Unseal Key 3: 298bn4H8bLbJRsPASOl3R+RPuDKIt6i5fYzqxQ3wL4ED
Unseal Key 4: W4RUiOU3IzQSZ8GD2z8jBEg2wK/q17ldr3zJipFjzKQE
Unseal Key 5: FNPHf8b+WCiS9lAzbdsWyxDgwic95DLZ03IR2S0sq4AF
Initial Root Token: ed220674-24da-d446-375d-bbd0334bcb31

Vault initialized with 5 keys and a key threshold of 3. Please
securely distribute the above keys. When the Vault is re-sealed,
restarted, or stopped, you must provide at least 3 of these keys
to unseal it again.

Vault does not store the master key. Without at least 3 keys,
your Vault will remain permanently sealed.

As we set the ‘VAULT_ADDR’ to ‘https://127.0.0.1:8200&#8217;, we have to add the ‘-tls-skip-verify’ option to the vault command. If we don’t do that, it will complain the it can not validate the certificate that matches the configured url ‘vault.service.dj-wasabi.local.

After executing the command, we see some output appear. This output is very important and needs to be saved somewhere on a secure location. The output provides us 5 unseal keys and the root token. Every time a vault instance is (re)started, the instance will be in a sealed state and needs to be unsealed. 3 of the 5 tokens needs to be used when you need to unseal a vault instance.

bash-4.3$ vault unseal -tls-skip-verify
Key (will be hidden):
Sealed: true
Key Shares: 5
Key Threshold: 3
Unseal Progress: 1
bash-4.3$ vault unseal -tls-skip-verify
Key (will be hidden):
Sealed: true
Key Shares: 5
Key Threshold: 3
Unseal Progress: 2
bash-4.3$ vault unseal -tls-skip-verify
Key (will be hidden):
Sealed: false
Key Shares: 5
Key Threshold: 3
Unseal Progress: 0

We have executed 3 times the unseal command and now this Vault instance is unsealed. You can see the ‘Unseal Progress’ changing after we enter an unseal key. We can verify that state of the vault instance by executing the vault status command:

bash-4.3$ vault status -tls-skip-verify
Sealed: false
Key Shares: 5
Key Threshold: 3
Unseal Progress: 0
Version: 0.6.4
Cluster Name: vault-cluster-7e01e371
Cluster ID: b9446acf-4551-e4c2-fa5f-03bd1bcf872f

High-Availability Enabled: true
Mode: active
Leader: https://192.168.1.202:8200

We see that this vault instance is not sealed and that the mode of this node is active. You can also see that the leader of the vault instance is in my case the current host. (Not strange as this is the first Vault instance of the environment.) If we want to add a 2nd and more, we have to execute the same commands as before. With the exception of the vault init command, as we already have an initialised environment.

As we are still logged in on the node, lets create a simple entry.

bash-4.3$ export VAULT_TOKEN=ed220674-24da-d446-375d-bbd0334bcb31
bash-4.3$ vault write secret/password value=secret
Success! Data written to: secret/password

We first set the ‘VAULT_TOKEN’ variable, this value of this variable is the value of the ‘Initial root token’. After that, we created a simple entry in the database. Key ‘secret/password’ is created and had the value ‘secret’.

It took some time to investigate how to setup a High Available Vault environment with Consul, not much information can be found on the internet. So maybe this page will help you setting one up yourself. If you do have improvements please let me know.

Configuring Access Control Lists in Consul

consul_logo

This is the 2nd post in securing Consul and this is about using ACLs in Consul. The first post (this one) we configured a Consul cluster by using gossip encryption and using SSL|TLS certificates. Now we cover the basics about Consul ACL’s (Access Control List) and configuring them in our cluster.

Master Token

First we have to create a master token. This is the token that has all rights (Thats why its called the master), sort of the ‘root’ token. We have to generate it first and we can use the uuidgen command in Linux (or Mac) for this. We use this output of the uuidgen command and place it in the following file: /data/consul/config/master-token.json

{
  "acl_master_token":"d9f1928e-1f84-407c-ab50-9579de563df5",
  "acl_datacenter":"dc1",
  "acl_default_policy":"deny",
  "acl_down_policy":"deny"
}

We have to store/configure this file on all Consul Servers. You’ll see that we set the default policy to “deny”, so we block everything and only enable the things we want. When we have created the file, we have to restart all Consul Servers to make the ACL’s active.

If you may recall what we did configure the Consul Server in the previous blogpost, we have configured the Consul Servers with this property:

"verify_incoming": true,

We have to open the ui on the Consul Server and because we have the property above configured, we need to load a SSL client certificate in our browser. (Or for now, you can also remove the property and restart Consul. But make sure you add it again when you are done!)

Now open the ui on the server and click on the right button (Settings). You’ll see something like this:

consul_settings

We enter the token we placed in the file in the field we see in our browser. Now we click on the button “ACL” (Token is saved automatically in your browser) and we see something like this:

consul_acl

This is an overview of all tokens available in Consul. You’ll see that 2 tokens exists in Consul right now:

  • Anonymous Token
  • Master Token

Anonymous Token

The anonymous token is used when you didn’t configure a Token in the settings page or didn’t supply it when using 3rd party software. You’ll only see the “consul” service, but won’t see anything else. If we would create a key in the key/value store, it will fail because the Anonymous token can’t do anything (Because of the property “acl_default_policy”:”deny”).

Master token

The master token is the token we just filled in the settings tab and the one configured in the json file in the beginning of this blogpost and is sort of the root token. The one token to rule them all.

So what do you need when you want to create an ACL? There are 3 types of policies that can be used:

  • read
  • write
  • deny

Might be obvious that the “read” policy is for reading data, “write” policy is for reading and writing data and “deny” is for NOT reading or writing data to Consul.

The ACL is written in the HCL language (HCL stands for HashiCorp Language) and we will create an ACL via the ui. You can also do that via the Consul API and automatically maintain them with for example Ansible, but that is out of the scope for this blogpost. In the ui we see on the right side of the page “New ACL”.

In the “name” field we enter for now “test” and select “client” as type. In the “Rules” field we enter the following:

key "" {
  policy = "read"
}
key "foo/" {
  policy = "write"
}

When we click on “create”, the ACL will be created. With this ACL, we choose the type “client” instead of the “management” type. When you have selected “management” as ACL type, the users/services which will use this ACL can also create/update/delete this and other ACL’s in the cluster. As we don’t want that, we select the “client” type.

We created 2 rules, both are for the key/value store. The first “key” rule specifies that all keys in the key/value store can be read with the ACL. With the 2nd “key” we specify that all keys in the “foo/” directory can be read and written. When we use this ACL, we can create the key “foo/bar”, but not the key “foobar”.

Next for using “key” in the rules, you can also configure “service”, “event” and “query” rules. It has the same format as the “key” example above and uses the same policies. With this you can easily give each application (or user) the correct rights.

Registrator

With registrator we can easily add docker containers as services into Consul. Now we have configured a default ACL policy to “deny” we have to update our configuration for the registrator. Registrator will attempt to sent the data to Consul for creating the services and registrator will think this is done, but Consul will deny because of the default policy. We can create a ACL specific to registrator.

Let’s create one via the UI. We enter the name “Registrator” and select “client” type. There are 2 possibilities to proceed regarding the “Rules”:

We can add a rule that will be used for all services the registry will add:

service "" {
  policy = "write"
}

Or we mention each service independently:

service "kibana" {
  policy = "write"
}
service "jenkins" {
  policy = "write"
}

Both have their pros and cons. With the first rule we allow that registrator can add all services into Consul and requires not much “maintenance”, it is a little bit to “open”. The 2nd rule requires more maintenance by adding all services but is more secure. With this, not all containers are added automatically and thus no rogue containers will be available in Consul.

We click on “create” to create the ACL. Now we have an token id and use that token in our docker run command. Our command to start registrator will look like this now:

docker run -h vserver-201 \
-v /var/run/docker.sock:/tmp/docker.sock \
-v /data/consul/config/ssl:/consul:ro \
-e CONSUL_CACERT=/consul/dj-wasabi.local.pem \
-e CONSUL_TLSCERT=/consul/vserver-201.dc1.dj-wasabi.local.crt \
-e CONSUL_TLSKEY=/consul/vserver-201.dc1.dj-wasabi.local.key \
-e CONSUL_HTTP_TOKEN=5c7d6559-cd90-d244-bbed-14d459a74bd2 \
gliderlabs/registrator:master \
-ip=192.168.1.201 consul-tls://vserver-201.dc1.dj-wasabi.local:8500

We had to add the -e CONSUL_HTTP_TOKEN variable with the token id as value. When I start the “kibana” container it will be added to Consul and we see the service is created.

We covered the basics for creating and using ACL’s in Consul. Using ACL’s in Consul will help securing Consul more by only allowing settings that is needed for the container purpose. Hopefully this will help you configuring ACLs in your environment.

Setting up a secure Consul cluster with docker

consul_logo

This post is the first of 2 blog items about setting up a secure Consul environment.

With the first post – which is this one – we will discuss how we setup a secure Consul environment. We will use a docker container and configure it with SSL certificates to secure the traffic from and to Consul. The 2nd post (This one), we will dive into ACLs and how we can make use of ACLs in Consul.

We will use the ‘wdijkerman/consul’ docker container to setup a secure environment. For now we create a Consul cluster with 2 hosts, named ‘vserver-201′ and ‘vserver-202′. ‘vserver-201′ will be the Consul Agent and ‘vserver-202′ will be the Consul Server. There is no specific need to use this container, you can also make this work with other Consul (containers) or installations.

Before we are going to setup the environment, we will briefly discuss the used docker container first.

wdijkerman/consul

This is a docker container created by myself which has Consul installed and configured. This container holds some basic Consul configuration and we can easily add some new configuration options by either supplying them to the command line or by creating a configuration json file. This container is running Consul 0.7.2 (Which is the latest version at moment of writing) and is running Alpine 3.5 (Also latest version at moment of writing). The most important thing is is that Consul isn’t running as user root, it is running as user ‘consul’ (with a fixed UID).

Before we start anything with the container, we going to add a user with that UID on the hosts running Consul.

useradd -u 995 consul

After this, we have to create 2 directories on the hosts running Consul. We use the following 2 directories:

mkdir -p /data/consul/data /data/consul/config
chown consul /data/consul/data /data/consul/config

The first directory is where Consul will store the Consul data and is only needed for the host running the Consul Server. The 2nd directory is where Consul will look for configuration files in which we create some files further in this post. On the host running the Consul Agent (In my case the host ‘vserver-201′) we only have to create the /data/consul/config directory. After the creation of the directories, we make sure these directories are owned by the earlier created user consul.

Before we are going to create some configuration files, take a look at the following json file. This json file is already present in the Consul docker container (So we don’t have to create it ourself) and is the default configuration of Consul:

{
  "data_dir": "/consul/data",
  "ui_dir": "/consul/ui",
  "log_level": "INFO",
  "client_addr": "0.0.0.0",
  "ports": {
    "dns": 53
  },
  "recursor": "8.8.8.8",
  "disable_update_check": true
}

As you see, this is a very basic configuration and we need to add some options to make it secure.

encrypt

We are going to expand our configuration by adding a new file in the /data/consul/config directory. With this file we are going to encrypt all of our internal Consul gossip traffic. This file should be placed on all of the hosts running Consul that will be/is part of this cluster.

Lets create a string with the following command:

docker run --rm --entrypoint consul wdijkerman/consul keygen

We use the output of this command and place it in the following file: /data/consul/config/encrypt.json

{
  "encrypt": "iuwMf/cScjTvKUKDC77kJA=="
}

We make sure that the rights of the file is set to 0400 and owned by the user consul.

chown consul:consul /data/consul/config/encrypt.json
chmod 0400 /data/consul/config/encrypt.json

All of the Consul nodes (Server and Agent) need this file, so make sure your Ansible (or Puppet, Chef or Saltstack) is configured to place this file on all of your nodes.

ssl

As all requests to and from Consul are done via http, we need to configure Consul that it listens on https instead of http. Before we do anything with Consul, we need access to a ssl crt, key and ca file first.

Before we execute a openssl command, we have to make sure that our CA SSL configuration is correct. Consul (Well, actually the go language: https://github.com/golang/go/issues/7423) requires some extra configuration specifically for using extentions in certificates. We have to add (or update) the property ‘extendedKeyUsage’ in the SSL CA configuration file so that the following values are added:

serverAuth,clientAuth

The usr_cert configuration in the CA openssl configuration file will look something like this:

[ usr_cert ]

basicConstraints=CA:FALSE
nsComment = "OpenSSL Generated Certificate"
subjectKeyIdentifier=hash
authorityKeyIdentifier=keyid,issuer
extendedKeyUsage = critical,timeStamping,serverAuth,clientAuth

(I have no idea why critical and timeStamping are there, so I just keep them there. :-))

We have to create the certificates now, the FQDN for this is:

<name_of_node>.<datacenter>.<domain>

In my case, my nodes are ‘vserver-201′ and ‘vserver-201′ , my domain is ‘dj-wasabi.local’ and have the default ‘dc1′ as datacenter. I need to create a crt and key for the host ‘vserver-201.dc1.dj-wasabi.local’ and ‘vserver-202.dc1.dj-wasabi.local’.

So on the host where my ‘dj-wasabi.local’ CA is configured, I need to execute the following set of commands:

cd /etc/pki/CA
openssl genrsa -out private/vserver-202.dc1.dj-wasabi.local.key 4096

We first generate the SSL key.

openssl req -new -extensions usr_cert -sha256 -subj "/C=NL/ST=Utrecht/L=Nieuwegin/O=dj-wasabi/CN=vserver-202.dc1.dj-wasabi.local" -key private/vserver-202.dc1.dj-wasabi.local.key -out csr/vserver-202.dc1.dj-wasabi.local.csr

We generate the csr file from the earlier created key.

openssl ca -batch -config /etc/pki/tls/openssl.cnf -notext -in csr/vserver-202.dc1.dj-wasabi.local.csr -out certs/vserver-202.dc1.dj-wasabi.local.crt

And now we will create a crt by signing the csr via the OpenSSL CA.

(And I do the same for host vserver-201.dc1.dj-wasabi.local)

Now we have to copy these files (Including the CA certificate file) to the servers and make sure these files are stored in the /data/consul/config directory, owned and only available by user consul. I create a ssl directory and places all the ssl files in this directory.

Now we have to create a configuration file, so Consul knows that it has SSL certificates. First we configure the Consul Server, in my case it is running on the ‘vserver-202′ host. We create the file /data/consul/config/ssl.json with the following content:

{
  "ca_file": "/consul/config/ssl/dj-wasabi.local.pem",
  "cert_file": "/consul/config/ssl/vserver-202.dc1.dj-wasabi.local.crt",
  "key_file": "/consul/config/ssl/vserver-202.dc1.dj-wasabi.local.key",
  "verify_incoming": true,
  "verify_outgoing": true
}

(Keep in mind that /data/consul/config is mounted in the container as /consul/config).

With the ‘verify_incoming‘ and ‘verify_outgoing‘ we make sure that all traffic to and from the Server is encrypted. If we would start the container right now, you can only access the ui if you have have created client ssl certificates and loaded it in your browser.

For the Consul agent, we use the same ssl.conf configuration file as mentioned above, but without the ‘verify_incoming‘ option.

ports

Before we start the container, we have to do 1 small thing. With a default configuration which we currently have, port 8500 is used for http. We create a new configuration file and assign the http listener to a different port number, so we can configure port 8500 to be https.

We create the file: /data/consul/config/ports.json with the following content:

{
  "ports": {
    "http": 8501,
    "https": 8500
  }
}

We have to specifiy the http port and give this a port number, otherwise it will be set default to 8500. When we start the container with the next step, we only configure port 8500 to be opened and not port 8501 and thus we have a https enabled Consul container.

Start Consul

Now we are able to start the Consul Server on the Consul server ‘vserver-202‘. We execute the following command:

docker run -h vserver-202 --name consul \
-v /data/consul/cluster:/consul/data \
-v /data/consul/config:/consul/config \
-p 8300:8300 -p 8301:8301 -p 8301:8301/udp \
-p 8302:8302 -p 8302:8302/udp -p 8400:8400 \
-p 8500:8500 -p 8600:53/udp wdijkerman/consul \
-server -ui -ui-dir /consul/ui -bootstrap-expect=1 \
-advertise 192.168.1.202 -domain dj-wasabi.local \
-recursor=8.8.8.8 -recursor=8.8.4.4

The following output appears:

[root@vserver-202 config]# docker logs consul
==> WARNING: BootstrapExpect Mode is specified as 1; this is the same as Bootstrap mode.
==> WARNING: Bootstrap mode enabled! Do not enable unless necessary
==> Starting Consul agent...
==> Starting Consul agent RPC...
==> Consul agent running!
Version: 'v0.7.2'
Node name: 'vserver-202'
Datacenter: 'dc1'
Server: true (bootstrap: false)
Client Addr: 0.0.0.0 (HTTP: 8501, HTTPS: 8500, DNS: 53, RPC: 8400)
Cluster Addr: 192.168.1.202 (LAN: 8301, WAN: 8302)
Gossip encrypt: true, RPC-TLS: true, TLS-Incoming: true
Atlas: <disabled>

==> Log data will now stream in as it occurs:

Most important in this output are these 2 lines:

Client Addr: 0.0.0.0 (HTTP: 8501, HTTPS: 8500, DNS: 53, RPC: 8400)
Gossip encrypt: true, RPC-TLS: true, TLS-Incoming: true

First line we can see that port 8500 is used for HTTPS and port 8501 is used for HTTP.
2nd line we see that the parameter encrypt is active (Is set to true) and both the ‘verify_incoming’ and ‘verify_outgoing’ are also set to true.

Now we can start Consul on the ‘vserver-201′ (Consul Agent):

docker run -h vserver-201 --name consul \
-v /data/consul/config:/consul/config \
-p 8300:8300 -p 8301:8301 -p 8301:8301/udp \
-p 8302:8302 -p 8302:8302/udp -p 8400:8400 \
-p 8500:8500 -p 8600:53/udp wdijkerman/consul \
-join 192.168.1.202 -advertise 192.168.1.201 \
-domain dj-wasabi.local

The Consul Agent will connect to the Consul Server and we can open the ui on the Agent with url https://vserver-201.dc1.dj-wasabi.local:8500. In my case it complains that the certificate is not validated (I’m using a self-signed CA certficate), but I’m able to access the ui and see the service ‘consul’. I do have an issue with opening the ui on the Consul Server. Why?

We have added the following property in the file /data/consul/config/ssl.json

"verify_incoming": true,

This means that ALL traffic to the Consul Server should be done via SSL certificates. If we really want to access the ui on the Consul Server (And we do want that, ACL’s ;-)) we have to create a client SSL certificate, load it in the browser and try opening the ui again.

Registrator

I use registrator in my environment and have to make sure that it can work with SSL to. For registrator, we have to configure 3 environment variables which are used for the locations of the ssl crt, key and ca file. To do this, we also have to mount the ssl directory in the registrator container so it has access to theses files.

Next, we have to use the consul-tls:// option instead of the consul:// when starting registrator.
Our command looks like this now:

docker run -h vserver-201 \
-v /var/run/docker.sock:/tmp/docker.sock \
-v /data/consul/config/ssl:/consul:ro \
-e CONSUL_CACERT=/consul/dj-wasabi.local.pem \
-e CONSUL_TLSCERT=/consul/vserver-201.dc1.dj-wasabi.local.crt \
-e CONSUL_TLSKEY=/consul/vserver-201.dc1.dj-wasabi.local.key \
gliderlabs/registrator:master \
-ip=192.168.1.201 consul-tls://vserver-201.dc1.dj-wasabi.local:8500

After executing the above command, new docker containers will be added automatically in Consul as a service via tls.

We successfully created a secure Consul environment where all traffic from and to Consul are encrypted. Even with the registrator tool we add new services via TLS connections.

Next blog item we will discuss the ACLs in Consul to make sure that not everyone can create/update/delete keys in the k/v store and or create/add/delete services.

Extending Ansible Role testing with Molecule by adding group_vars, dependencies and using travis ci

ansible_logo_black_square

This blog post will extend the actions we described on this https://werner-dijkerman.nl/2016/07/10/testing-ansible-roles-with-molecule-testinfra-and-docker/ page. We only configured a very basic configuration with 2 tests in the previously mentioned page, but thats not enough to and on this page we will continue to complete the tests. Github page for Molecule (In case you forgot 😉 )

We will discuss the following in this blogpost:

  • Make the TestInfra tests OS aware
  • Use group_vars
  • Role dependencies
  • Configure Travis

Lets dive into the tasks.

TestInfra OS Aware

Ok, not really molecule related, but very important if our role needs to work on several different operating systems. In the earlier mentioned blogpost we had configured Molecule to use a Ubuntu docker image. Before we can make the tests OS aware, we have to add some docker containers to the configuration which have different operating systems.

We create the following configuration:

docker:
  containers:
  - name: zabbix-agent-centos
    ansible_groups:
      - group1
    image: milcom/centos7-systemd
    image_version: latest
    privileged: True
  - name: zabbix-agent-debian
    ansible_groups:
      - group2
    image: maint/debian-systemd
    image_version: latest
    privileged: True
  - name: zabbix-agent-ubuntu
    ansible_groups:
      - group2
    image: rastasheep/ubuntu-sshd
    image_version: latest
    privileged: True

We have 3 docker containers configured: Ubuntu, Debian and CentOS. I’ve searched for Docker containers that have systemd configured, as the Zabbix Agent uses this. The official docker images of the mentioned OS’es doesn’t have the systemd enabled/configured.

We would like to run the ‘molecule test’ command, but it will fail on the ‘verify’ part.
We currently have this test:

def test_zabbix_package(Package, SystemInfo):
    zabbixagent = Package('zabbix-agent')
    assert zabbixagent.is_installed

    assert zabbixagent.version.startswith("3.0")

This test will validate if Zabbix is installed and the version starts with 3.0 (Thats the default version to be installed). When running this test, it will fail on the Debian and Ubuntu host. Why?They use a slightly different version naming than CentOS.

We have to update the function by adding the SystemInfo class to the function:

def test_zabbix_package(Package, SystemInfo):
    zabbixagent = Package('zabbix-agent')
    assert zabbixagent.is_installed

    if SystemInfo.distribution == 'debian':
        assert zabbixagent.version.startswith("1:3.0")
    if SystemInfo.distribution == 'centos':
        assert zabbixagent.version.startswith("3.0")

Now we have added the SystemInfo class to the function, we can use this to determine which tests we can execute on which OS. In the above example we see that if the distribution ‘debian’ is, we validate if the version starts with ‘1:3.0’. With CentOS we validate if it is ‘3.0’. When we execute the test on all 3 hosts, it will run successfully.

Using group_vars

This part also applies to using ‘host_vars’, just replace the word ‘group_vars’ with ‘host_vars’ 😉 We configure the molecule.yml file by adding some group_vars related data. We configure the 1st block in the molecule.yml like this:

ansible:
  playbook: playbook.yml
  group_vars:
    mysql:
      database_type: mysql
      database_type_long: mysql
    postgresql:
      database_type: pgsql
      database_type_long: postgresql
      postgresql_pg_hba_conf:
        - "host all all 127.0.0.1/32 trust"
        - "host all all ::1/128 trust"
      postgresql_pg_hba_local_ipv4: false
      postgresql_pg_hba_local_ipv6: false

This setup will “create” 2 group_var files: mysql and postgresql. When we run a molecule command, the group_vars will be created in the .molecule directory. Lets run a ‘molecule list’ command. For know we ignore the output, but lets see what is created in the .molecule directory:

[vagrant@localhost ansible-zabbix-server]$ find .molecule -type f | xargs ls -lrt
-rw-rw-r--. 1 501 games   3 Jul 17 20:31 .molecule/state
-rw-rw-r--. 1 501 games 215 Jul 17 20:31 .molecule/group_vars/postgresql
-rw-rw-r--. 1 501 games  51 Jul 17 20:31 .molecule/group_vars/mysql
[vagrant@localhost ansible-zabbix-server]$ cat .molecule/group_vars/mysql
---
database_type: mysql
database_type_long: mysql
[vagrant@localhost ansible-zabbix-server]$ cat .molecule/group_vars/postgresql
---
database_type: pgsql
database_type_long: postgresql
postgresql_pg_hba_conf:
- host all all 127.0.0.1/32 trust
- host all all ::1/128 trust
postgresql_pg_hba_local_ipv4: false
postgresql_pg_hba_local_ipv6: false

Within the .molecule directory a group_vars directory is created and 2 files are present (We will ignore the ‘state’ file). The output of these files are the same as how we configured the molecule.yml file.

Role dependencies

When your role has some dependencies, we really want them to download before we execute our role in Molecule. It will fail if we don’t do this. We have to download them first.

Molecule has a simple configuration for this. Within the molecule.yml file, we add the ‘requirements_file’ option in the Ansible configuration. Our example now looks like this:


dependency:
  name: galaxy
  requirements_file: requirements.yml 
  options:
    ignore-certs: True
    ignore-errors: True

ansible:
  playbook: playbook.yml
  group_vars:
    mysql:
      database_type: mysql
      database_type_long: mysql
    postgresql:
      database_type: pgsql
      database_type_long: postgresql
      postgresql_pg_hba_conf:
        - "host all all 127.0.0.1/32 trust"
        - "host all all ::1/128 trust"
      postgresql_pg_hba_local_ipv4: false
      postgresql_pg_hba_local_ipv6: false

In the root directory of the repository, a file called ‘requirements.yml’ is found which contains the dependencies.

When we run the ‘converge’ subcommand, the roles will be downloaded and after this the role is executed:

[vagrant@localhost ansible-zabbix-server]$ molecule converge
WARNING:vagrant:The Vagrant executable cannot be found. Please check if it is in the system path.
--> Installing role dependencies ...
- downloading role 'apache', owned by geerlingguy
- downloading role from https://github.com/geerlingguy/ansible-role-apache/archive/1.7.2.tar.gz
- extracting geerlingguy.apache to .molecule/roles/geerlingguy.apache
- geerlingguy.apache was installed successfully
- downloading role 'mysql', owned by geerlingguy
- downloading role from https://github.com/geerlingguy/ansible-role-mysql/archive/2.3.0.tar.gz
- extracting geerlingguy.mysql to .molecule/roles/geerlingguy.mysql
- geerlingguy.mysql was installed successfully
- downloading role 'postgresql', owned by galaxyprojectdotorg
- downloading role from https://github.com/galaxyproject/ansible-postgresql/archive/0.9.2.tar.gz
- extracting galaxyprojectdotorg.postgresql to .molecule/roles/galaxyprojectdotorg.postgresql
- galaxyprojectdotorg.postgresql was installed successfully
--> Starting Ansible Run ...

PLAY [all] *********************************************************************

TASK [setup] *******************************************************************
ok: [zabbix-server-centos-mysql]
... 
<snip>

Configure CI

We don’t want to run manually the molecule commands every time a change occurs. To quote Apple: “There is an app for that”. Well almost. I explain 2 ways of running Molecule in a automated way: Using Travis and Using Jenkins.

Travis

A commonly known cloud platform for CI, and it is very simple to make use of it. The Molecule documentation has a very nice example on how to configure the travis setup. We create in the root directory of our role, a file named: .travis.yml

sudo: required
language: python
services:
- docker

before_install:
- sudo apt-get -qq update
- sudo apt-get install -o Dpkg::Options::="--force-confold" --force-yes -y docker-engine

install:
- pip install molecule ansible docker

script:
- molecule test
notifications:
  webhooks: https://galaxy.ansible.com/api/v1/notifications/

Its a very basic travis configuration. First block is to let Travis know that we need to make use of sudo, that the language Python is and we use the ‘docker’ service.

2nd block is to update the Ubuntu’s apt cache and install Docker. When Docker is installed, we install molecule and after that, molecule test is executed. When the job is finished, we send some data via a web hook to the Ansible Galaxy page. The last step is to show the ‘passing’ (Or failing) badge on the Ansible Role page.

Jenkins

This example will show you a basic Jenkinsfile. I don’t do a single action manually in Jenkins. A  Jenkins job is basically code and we should use it as code. With Jenkins 2, we can make use of a Jenkinsfile (Also with Jenkins 1, but was named jenkins-template I believe?)

When you have a Jenkins server running, create a new Pipeline job and configure the correct git repository. (Okay, for this blogpost we do this manually, but I have a playbook for this. So this is on my part automated as well. 😉 )

A basic example:

node(){
    stage 'INFO: Checkout code'
        checkout scm
    stage 'Installing Molecule'
        sh 'sudo pip install molecule'
    stage 'Creating Containers'
        sh 'molecule create'
    stage 'Installing the Role'
        sh 'molecule converge'
    stage 'Idempotence check'
        sh 'molecule idempotence'
    stage 'Verify the application'
        sh 'molecule verify'
    stage 'Destroy the containers'
        sh 'molecule destroy'
}

When the job is executed, it will first download the Jenkinsfile from the configured git repository and the tasks will be executed one by one. This is just to get you started and the Jenkinsfile should be extended with some error handling like e-mail configuration etc.

So this was the follow up on the original blogpost for how to use Molecule with your Ansible role. The guys are really busy with Molecule so Molecule grows really fast. Maybe doing an 3rd blogpost very soon with new features.. 🙂

Testing Ansible roles with Molecule, Testinfra and Docker

ansible_logo_black_square

“On 2017-05-01 I’ve updated this post to the current situation. Some things where outdated and where removed.”

In some earlier posts I’ve described how you can use Test Kitchen for testing Ansible Roles (This one and the one extending it.). Test Kitchen was created for testing Chef Cookbooks and like Chef, Test Kitchen is a Ruby application. On this page we describe an other tool for the same purpose. This tool is what you might see as a Python clone of Test Kitchen, but more specific to Ansible: Molecule (Github)

Molecule isn’t that old, only few years and when I browse the internet it is not yet really known in the community. Unlike Test kitchen with the many different drivers, Molecule supports several backends, like Vagrant, Docker, and OpenStack. With Molecule you can make use of Serverspec (Like Test Kitchen), but you can also make use of ‘Testinfra’. Testinfra is like Serverspec a tool for writing unit tests, but it is written in Python.

Lets dive into Molecule and create some tests for Molecule. On this page, we make use of the Docker backend and if you following this page please install docker.

Installing Molecule is really simple:

pip install molecule docker

Voila, it is installed. With the installation of Molecule, Testinfra is installed to. We had to provide the docker module as well, otherwise molecule doesn’t know how to connect to the docker daemon. We can configure a Ansible Role. I used my ‘zabbix-agent’ role as test case for the Test Kitchen setup, so I will use it again for Molecule.

When you haven’t created an Ansible role yet, instead of using the ansible-galaxy command you can use the following command:

molecule init --driver docker --role role_name

This will create just like the ‘ansible-galaxy’ some default directories and files, but also gives us a starting point with a few extra files for testing this role with Molecule.

When you already have a working module and want to make use of Molecule, please execute the following command:

molecule init --driver docker

This will install several files specific for Molecule. No worries, we can recreate these files manually. Lets do that in a role and see what the files do.

File: <root>/molecule.yml

---
ansible:
  playbook: playbook.yml

driver:
  docker
docker:
  containers:
    - name: zabbix-01
      ansible_groups:
        - group1
      image: debian
      image_version: latest
      privileged: True

verifier:
  name: testinfra

This is the configuration file for Molecule. We specify which playbook Molecule will execute, in this case playbook.yml.
We specify that we want to make use of the Docker driver and that we have a docker container configuration. In this case, we only have 1 docker container specified. We use a Debian docker container with the ‘latest’ tag. We name the container ‘test-01’ and is in the group ‘group1’. And at last we configure molecule to use testinfra as the testtool.

File: <root>/playbook.yml

---
- hosts: all
  roles:
    - role: ansible-zabbix-agent

The playbook that is executed in the Docker container. This is a very basic one, we only have to specify the correct .

File: <root>/tests/inventory

localhost
[group1]
zabbix-01 ansible_connection=docker

The Ansible inventory file. Should be a known file to you 😉

File: <root>/tests/test_default.py


from testinfra.utils.ansible_runner import AnsibleRunner
testinfra_hosts = AnsibleRunner('.molecule/ansible_inventory').get_hosts('all')

def test_hosts_file(File):
    hosts = File('/etc/hosts')
    assert hosts.user == 'root'
    assert hosts.group == 'root' 

(Edit 2016-09-14: As of release 1.9, the first 2 lines should be present in the TestInfra script.)

This is the test infra python file, containing the tests. After the init command, we have 1 test that will check if there is a hosts file, and the user and group of the file belongs to user ‘root’. We discuss this file later on by adding some more tests.

File: <root>/tests/test.yml

---
- hosts: localhost
  remote_user: root
  roles:
    - zabbix-agent-role

Now we have discussed the files.

We add some infra test checks in the ‘test_default.py’ file. We add the following 2 tests:

def test_zabbix_package(Package):
    zabbixagent = Package('zabbix-agent')
    assert zabbixagent.is_installed
    assert zabbixagent.version.startswith("1:3.0")

def test_zabbixagent_running_and_enabled(Service):
    zabbixagent = Service("zabbix-agent")
    # assert zabbixagent.is_running
    assert zabbixagent.is_enabled

These are 2 Python function which are executed with Testinfra. With the first function, we validate if the package ‘zabbix-agent’ is installed. Also we check if the version starts with: 1:3.0. If you have some experience with testing Python code, this might be familiar to you. Test Infra uses ‘PyTest‘ to execute the tests and validate the.

The 2nd function we validate the ‘zabbix-agent’ service. We make sure the service is enabled. As you see, I’ve commented the check if the service is running. When it is enabled, I get this error message:

Failed to get D-Bus connection: Unknown error -1

Strange, because I’ve configured the privileged mode on the docker container. So maybe this is a bug or misconfiguration on my part, but for now I leave it commented and need to find a solution for this.

Within the molecule.yml we have to update the docker container configuration by adding the following property for all the docker containers:

    required: True

Now we added it, we have to do a “molecule destroy” and start again. The container will be recreated and we won’t get an error message about the dbus.

Now we are ready to move on (I’m well aware that these 2 tests that I added will not be enough, I’ll add these myself later on).

Molecule has several subcommands, let run molecule -h and see what is available:

No handlers could be found for logger "vagrant"
Usage:
    molecule [-hv] &amp;amp;lt;command&amp;amp;gt; [&amp;amp;lt;args&amp;amp;gt;...]

Commands:
    check         check playbook syntax
    create        create instances
    converge      create and provision instances
    idempotence   converge and check the output for changes
    test          run a full test cycle: destroy, create, converge, idempotency-check, verify and destroy instances
    verify        create, provision and test instances
    destroy       destroy instances
    status        show status of instances
    list          show available platforms, providers
    login         connects to instance via SSH
    init          creates the directory structure and files for a new Ansible role compatible with molecule

Options:
    -h --help     shows this screen
    -v --version  shows the version

We first start with the ‘check’ command:

[vagrant@localhost ansible-zabbix-agent]$ molecule check
No handlers could be found for logger "vagrant"

playbook: playbook.yml
[vagrant@localhost ansible-zabbix-agent]$ echo $?
0

Seems very well, the check commands validate if the playbook.yml doesn’t have any problems/syntax errors.
We can continue with the next command: create.

[vagrant@localhost ansible-zabbix-agent]$ molecule create
No handlers could be found for logger "vagrant"
 Building ansible compatible image ...
 Step 1 : FROM debian:latest

  ---&amp;amp;gt; 1b088884749b

 Step 2 : RUN bash -c 'if [ -x "$(command -v apt-get)" ]; then apt-get update &amp;amp;amp;&amp;amp;amp; apt-get install -y python sudo; fi'

  ---&amp;amp;gt; Using cache

  ---&amp;amp;gt; 8ef54383599a

 Step 3 : RUN bash -c 'if [ -x "$(command -v yum)" ]; then yum makecache fast &amp;amp;amp;&amp;amp;amp; yum update -y &amp;amp;amp;&amp;amp;amp; yum install -y python sudo; fi'

  ---&amp;amp;gt; Running in 6d3142fa72aa

...
 Finished building molecule_local/debian:latest
 Creating container zabbix-01 with base image debian:latest ...
 Container created.

[vagrant@localhost ansible-zabbix-agent]$

Now we have created a docker container where we can install our Ansible role on to, we do that with the ‘converge’ subcommand.

[vagrant@localhost ansible-zabbix-agent]$ molecule converge
No handlers could be found for logger "vagrant"

PLAY [all] *********************************************************************

TASK [setup] *******************************************************************
ok: [zabbix-01]

TASK [ansible-zabbix-agent : Include OS-specific variables] ********************
ok: [zabbix-01]

TASK [ansible-zabbix-agent : Install the correct repository] *******************
skipping: [zabbix-01]

...
RUNNING HANDLER [ansible-zabbix-agent : restart zabbix-agent] ******************
changed: [zabbix-01]

PLAY RECAP *********************************************************************
zabbix-01                  : ok=12  changed=7    unreachable=0    failed=0

Nice, the role is installed correctly without any issues on the container. With Test Kitchen we had to use BATS to validate if the Role is idempotent, but luckily molecule has just a simple sub command for it: idempotence

Well, it seems that the Role has passed the idempotence test:

[vagrant@localhost ansible-zabbix-agent]$ molecule idempotence
No handlers could be found for logger "vagrant"
Idempotence test in progress (can take a few minutes)...
Idempotence test passed.

[vagrant@localhost ansible-zabbix-agent]$

Testing the role is nicely going on right now, but we are not there yet. Now we need to use the ‘verify’ command to actually validate our role on the Docker container:

[vagrant@localhost ansible-zabbix-agent]$ molecule verify
No handlers could be found for logger "vagrant"
Trailing whitespace found in ./defaults/main.yml on lines: 35
Trailing newline found at the end of ./handlers/main.yml
Trailing whitespace found in ./library/zabbix_host.py on lines: 29
Trailing newline found at the end of ./library/zabbix_hostmacro.py
[vagrant@localhost ansible-zabbix-agent]$

Whoops, it seems it has found some issues. Let me fix that first, probably need to run the verify again after fixing it.

[vagrant@localhost ansible-zabbix-agent]$ molecule verify
No handlers could be found for logger "vagrant"

Executing testinfra tests found in tests/.
============================= test session starts ==============================
platform linux2 -- Python 2.7.5, pytest-2.9.2, py-1.4.31, pluggy-0.3.1
rootdir: /git/ansible/ansible-zabbix-agent/tests, inifile:
plugins: xdist-1.14, testinfra-1.4.0
collected 3 itemss

tests/test_default.py ...

=========================== 3 passed in 0.63 seconds ===========================

No serverspec tests found in spec/.

[vagrant@localhost ansible-zabbix-agent]$

After fixing it, everything seems to work fine. Nice!

Now we are done with the container, so we can execute molecule again, but with the delete sub command and the container will be deleted.

These were the basics for testing an Ansible role with Molecule, Docker and Test Infra. This page uses the ‘Debian’ Docker image, whereas I normally use CentOS for this. I have some issues (Get the same error message when I enable the Test Infra test to validate if the service is running) to make this work on CentOS. So maybe Molecule isn’t mature enough yet, but it is getting there.

I’ll update my Ansible roles so it will use Molecule instead of Test Kitchen (No hard feelings ;-))

Docker containers for Zabbix Server and Zabbix Web

dockerZabbix Logo

Since a few weeks I started using Docker and building containers and this is really fun to do. So one of my first public docker containers had to be something with Zabbix. 🙂

So I have created 2 docker containers;

  • zabbix-server
  • zabbix-web

So, here follows an description about the 2 containers.

Zabbix-Server

This container will run an zabbix-server. Jeah!

Its an Debian based container (As Debian is one of the smaller ones) and will only run the Zabbix Server. No database is running in this container, it is configured to use an MySQL database as backend. Before you can make use of this container, you’ll have to have an MySQL Server running somewhere in your environment. It will install Zabbix 3.0.1

How do we use this container? First we have to download it:

docker pull wdijkerman/zabbix-server

And this is how we start it:

docker run  -p 10051:10051 --name zabbix-server \
            -v /data/zabbix:/zabbix \
            -e ROOTPASSWORD=secretpassword \
            -e DBHOST=192.168.1.153 -e DBUSER=zabbix \
            -e DBPASSWORD="zabbix-pass" \
            -e DBPORT=3306 -e DBNAME=zabbix wdijkerman/zabbix-server

This docker container make use of an volume, mentioned with the -v parameter. This will mount the ‘/data/zabbix’ directory in the docker container as ‘/zabbix’. This directory contains the directories which are used for storing SSL (configuration) files, modules and scripts. With the -p option, we open the port on the host (10051) and forward it to the port to the docker container (10051).

The -e values which you see are environment settings that are passed into the docker container. These environment settings are the actual Zabbix Server configuration options but in uppercase. As you might see, the settings in the example are used for connecting to the database ‘zabbix’ on host ‘192.168.1.153’ with username ‘zabbix’ and password ‘zabbix-pass’. I also specified the ‘ROOTPASSWORD’ setting, this is the password for the MySQL root user. When this is supplied, it will create the database (DBNAME) and create the user (DBUSER). If you don’t specify it (Which can of course) an database and user should already be created.

With this in mind, if we want to set the StartPollers parameter to 10, we have to update the run command by adding the following:

-e STARTPOLLERS=10

Now you can configure the Zabbix Server exactly like you want, just by adding some environment parameters in the command line before starting it.

But this is only the Zabbix Server, not the frontend.

Zabbix Web

This container contains only the Zabbix Web part, or the ‘frontend’. (docker hub)

Like the Zabbix Server, this is also an Debian based docker container and will only work with MySQL as database. It is running Apache 2.4.

How do we use this container? First we have to download it:

docker pull wdijkerman/zabbix-web

And this is how we start it:

docker run  -p 80:80 --name zabbix-web \
            -e ZABBIXURL=zabbix.example.com \
            -e ZBXSERVERNAME=vserver-151 \
            -e ZBXSERVER=192.168.1.151 \
            -e DBHOST=192.168.1.153 -e DBUSER=zabbix \
            -e DBPASSWORD="zabbix-pass" \
            -e DBPORT=3306 -e DBNAME=zabbix wdijkerman/zabbix-web

The DB* settings are the same as for the Zabbix Server container, so I won’t describe them again. With the Zabbix Web container we open port 80 on the host and forward it to port 80 on the docker container.

With the ZABBIXURL setting, we specify the url on which the web interface is available. In this case, when we open ‘zabbix.example.com’ we get the login page of Zabbix. (Well, if you have access to the zabbix.example.com domain 😉 ) With the ZBXSERVERNAME setting we specify the name of the Zabbix Server and with ZBXSERVER we let the Zabbix Web know where it can find the Zabbix Server.

Please let me know if you find any issues with configuring it or encounter an bug. Also if you have improvements, please create an PR on github! 🙂

Links:

Using Test Kitchen with Docker and serverspec to test Ansible roles

ansible_logo_black_square

Lets write some tests for our Ansible roles. When we are testing our roles, we can validate the quality of the role. With this blog item we are using the test kitchen framework and serverspec for the Ansible role: dj-wasabi.zabbix-agent.

With test kitchen we can start an vagrant box or an docker image and our Ansible role will be executed on this instance. There is an whole list of Test Kitchen drivers which can be found here. (If you have an more recent up2date list, please let me know and I update the link). When the Ansible role is installed, serverspec will be executed so we can verify if the installation and configuration is done correctly. Ideally you want to execute this every time when an change is done for the Role, so the best way is to do everything with Jenkins.

We will make use of the following tools:

  • Test Kitchen
  • docker
  • Serverspec

Installation of Jenkins is out of scope for this blog item, same as installation of docker. You’ll need to check these websites for installing Jenkins and  docker on your machine.

Before we even continue, we need to install test kitchen. We do this with the following command:

wdijkerman@curiosity [ ~/git/ansible/ansible-zabbix-agent ] (13:02:01 - Thu Aug 20)
 (master) > gem install test-kitchen

This is easy, it only take around 10 seconds to install. We continue with the test kitchen setup by executing the next command:

wdijkerman@curiosity [ ~/git/ansible/ansible-zabbix-agent ] (13:03:05 - Thu Aug 20)
 (master) > kitchen init --create-gemfile --driver=kitchen-docker
      create  .kitchen.yml
      create  chefignore
      create  test/integration/default
      create  .gitignore
      append  .gitignore
      append  .gitignore
      create  Gemfile
      append  Gemfile
      append  Gemfile
You must run `bundle install' to fetch any new gems.

The command creates some files and directories at default which we almost all will be using. We remove the chefignore file, as we don’t use chef in this case.

We update the Gemfile by adding the next line at the end of the file:

gem "kitchen-ansible"

Now we run the “bundle install” command, it will install the kitchen-docker and kitchen-ansible gems with their dependencies.

wdijkerman@curiosity [ ~/git/ansible/ansible-zabbix-agent ] (13:03:55 - Thu Aug 20)
 (master) > bundle install
Using multipart-post 2.0.0
Using faraday 0.9.1
Using highline 1.7.3
Using thor 0.19.1
Using librarian 0.1.2
Using librarian-ansible 1.0.6
Using mixlib-shellout 2.1.0
Using net-ssh 2.9.2
Using net-scp 1.2.1
Using safe_yaml 1.0.4
Using test-kitchen 1.4.2
Using kitchen-ansible 0.0.23
Using kitchen-docker 2.3.0
Using bundler 1.6.2
Your bundle is complete!
Use `bundle show [gemname]` to see where a bundled gem is installed.
wdijkerman@curiosity [ ~/git/ansible/ansible-zabbix-agent ] (13:03:59 - Thu Aug 20)
 (master) > 

We can also set some version restrictions in this file, an example looks like this:

gem 'test-kitchen', '>= 1.4.0'
gem 'kitchen-docker', '>= 2.3.0'
gem 'kitchen-ansible'

With the above example, you’ll install test-kitchen with version 1.4.0 or higher and version 2.3.0 or higher for kitchen-docker. But for now, we use it without the versions.

We have an basic .kitchen.yml file, like this:

---
driver:
  name: docker

provisioner:
  name: chef_solo

platforms:
  - name: ubuntu-14.04
  - name: centos-7.1

suites:
  - name: default
    run_list:
    attributes:

We are changing the file, so it will look like this:

---
driver:
  name: docker
  provision_command: sed -i '/tsflags=nodocs/d' /etc/yum.conf

provisioner:
  name: ansible_playbook
#  ansible_yum_repo: "http://mirror.logol.ru/epel/6/x86_64/epel-release-6-8.noarch.rpm"
  hosts: localhost
#  requirements_path: requirements.yml

platforms:
  - name: centos-6.6

verifier:
  ruby_bindir: '/usr/bin'

suites:
  - name: default

What does it do:

We use the docker driver to run our playbook and tests. It will start an docker image and executes the playbook and after this, we execute serverspec tests to validate of everything should work as expected.

We use the “ansible_playbook” as provisioner. I have 2 lines commented in this .kitchen.yml file. My Ansible role doesn’t have any dependencies. With the “ansible_yum_repo” we can point it to for example the epel-release.rpm file. When the docker image is started, it will download and install this epel repository file so if the role needs some packages from Epel, it will succeed. Same as for the “requirements_path”. This is an yml file which can be used for downloading the role dependencies.

The Zabbix Agent role doesn’t have any dependencies, but for the Zabbix Server role, it has 3 dependencies. You could use it like this:

wdijkerman@curiosity [ ~/git/ansible/ansible-zabbix-agent ] (13:07:22 - Thu Aug 20)
 (master) > cat requirements.yml
---
- src: geerlingguy.apache
- src: geerlingguy.mysql
- src: galaxyprojectdotorg.postgresql

There is only 1 platform in this test, centos-6.6. Same as for the “suites” there is only one. You can  specify more platforms and suits. The following example is the “suites” part of the dj-wasabi.zabbix-server role:

suites:
  - name: zabbix-server-mysql
    provisioner:
        name: ansible_playbook
        playbook: test/integration/zabbix-server-mysql.yml
  - name: zabbix-server-pgsql
    provisioner:
        name: ansible_playbook
        playbook: test/integration/zabbix-server-pgsql.yml

There are 2 suits with their own playbooks. In the above case, there is an playbook which will be executed with the “MySQL” as backend and there is an playbook with the “PostgreSQL” as backend. Both playbooks will be executed in their own docker instance. So it will start with for example the  ‘zabbix-server-mysql’ suits and when this is finished successfully, it continues with the suit ‘zabbix-server-pgsql’.

You can find on these pages some more information on the configuration of test kitchen, ansible and docker parts:

We now create our only playbook:

wdijkerman@curiosity [ ~/git/ansible/ansible-zabbix-agent ] (13:07:25 - Thu Aug 20)
 (master) > vi test/integration/default.ym
--- 
- hosts: localhost
  roles:
    - role: ansible-zabbix-agent

We have configured the playbook which will be executed against the docker image on ‘localhost’.

But we are not there yet, we will need create some serverspec tests to. With these tests we validate if the execution of the Ansible role went successful.

First we have to create the following directory: test/integration/default/serverspec/localhost

We create the spec_helper.rb file:

wdijkerman@curiosity [ ~/git/ansible/ansible-zabbix-agent ] (13:07:41 - Thu Aug 20)
 (master) > vi test/integration/default/serverspec/spec_helper.rb
require 'serverspec'
set :backend, :exec

We will create the serverspec file:

wdijkerman@curiosity [ ~/git/ansible/ansible-zabbix-agent ] (13:08:22 - Thu Aug 20)
 (master) > vi test/integration/default/serverspec/localhost/ansible_zabbix_agent_spec.rb

The name is not that important, but it has to end with: _spec.rb. We add the following content to the file:

require 'serverspec'
require 'spec_helper'

describe 'Zabbix Agent Packages' do
    describe package('zabbix-agent') do
        it { should be_installed }
    end
end

describe 'Zabbix Agent Configuration' do
    describe file('/etc/zabbix/zabbix_agent.conf') do
        it { should be_file}
        it { should be_owned_by 'zabbix'}
        it { should be_grouped_into 'zabbix'}
    end
end

With the first “describe” we print ‘Zabbix Agent Packages’ and this is an block. When you have some rspec experience with Puppet, you would for example add the params like this: let(:params) { {:server => ‘192.168.1.1’} }
In our case, this is nothing more than printing the text.

Now we proceed with the 2nd describe: package(‘zabbix-agent’). All actions till the firsy “end” is related to this package. For the package we only have 1: it should be_installed. So, when this spec file is executed it will check if the ‘zabbix-agent’ package is installed. If not, you’ll see an error message.

We proceed with the 4th describe, file(‘/etc/zabbix/zabbix_agent.conf’). We have several checks for this file:

  • It Should be an file. (You could also choose for link, directory or even device)
  • The owner of the file needs to be user zabbix
  • The group of the file needs to be group zabbix

There are a lot of other options and checks to use in your spec file, but if we explain it here this is really gonna be an long post.

Only thing that we need to do is to run kitchen. So we execute the following command:

wdijkerman@curiosity [ ~/git/ansible/ansible-zabbix-agent ] (13:09:13 - Thu Aug 20)
 (master) > kitchen test

If everything goes fine, you’ll see a lot of output. At the end of the run, serverspec will be executed:

       Zabbix Agent Packages
         Package "zabbix-agent"
           should be installed
      
       Zabbix Agent Configuration
         File "/etc/zabbix/zabbix_agent.conf"
           should be file (FAILED - 1)
           should be owned by "zabbix" (FAILED - 2)
           should be grouped into "zabbix" (FAILED - 3)
      
       Failures:
      
         1) Zabbix Agent Configuration File "/etc/zabbix/zabbix_agent.conf" should be file
            Failure/Error: it { should be_file}
              expected `File "/etc/zabbix/zabbix_agent.conf".file?` to return true, got false
              /bin/sh -c test\ -f\ /etc/zabbix/zabbix_agent.conf
             
            # /tmp/verifier/suites/serverspec/localhost/ansible_zabbix_agent_spec.rb:12:in `block (3 levels) in <top (required)>
      
         2) Zabbix Agent Configuration File "/etc/zabbix/zabbix_agent.conf" should be owned by "zabbix"
            Failure/Error: it { should be_owned_by 'zabbix'}
              expected `File "/etc/zabbix/zabbix_agent.conf".owned_by?("zabbix")` to return true, got false
              /bin/sh -c stat\ -c\ \%U\ /etc/zabbix/zabbix_agent.conf\ \|\ grep\ --\ \\\^zabbix\\\$
             
            # /tmp/verifier/suites/serverspec/localhost/ansible_zabbix_agent_spec.rb:13:in `block (3 levels) in <top (required)>'
      
         3) Zabbix Agent Configuration File "/etc/zabbix/zabbix_agent.conf" should be grouped into "zabbix"
            Failure/Error: it { should be_grouped_into 'zabbix'}
              expected `File "/etc/zabbix/zabbix_agent.conf".grouped_into?("zabbix")` to return true, got false
              /bin/sh -c stat\ -c\ \%G\ /etc/zabbix/zabbix_agent.conf\ \|\ grep\ --\ \\\^zabbix\\\$
             
            # /tmp/verifier/suites/serverspec/localhost/ansible_zabbix_agent_spec.rb:14:in `block (3 levels) in <top (required)>'
      
       Finished in 0.11823 seconds (files took 0.37248 seconds to load)
       4 examples, 3 failures

Whoops, it seems that my serverspec file was expecting something else. I made an typo, the file should be /etc/zabbix/zabbix_agentd.conf with an d! 🙂

We can see the docker image is created with the “kitchen list” command, the “Last Action” is “Set Up”:

wdijkerman@curiosity [ ~/git/ansible/ansible-zabbix-agent ] (13:14:51- Thu Aug 20)
 (master) > kitchen list
Instance          Driver  Provisioner      Verifier  Transport  Last Action
default-centos-66  Docker  AnsiblePlaybook  Busser    Ssh        Set Up

There are 2 ways to proceed when we fix the typo:

  • We run ‘kitchen test’ again, but it will destroy our docker image and starts again from the start.
  • We run “kitchen verify’ and we only run the serverspec tests. (A lot quicker!)

We use the “kitchen verify” command:

wdijkerman@curiosity [ ~/git/ansible/ansible-zabbix-agent ] (13:15:12 - Thu Aug 20)
 (master) >  kitchen verify
-----> Starting Kitchen (v1.4.2)
-----> Verifying <default-centos-66>...
$$$$$$ Running legacy verify for 'Docker' Driver
       Preparing files for transfer
       Removing /tmp/verifier/suites/serverspec
       Transferring files to <default-centos-66>
-----> Running serverspec test suite
       /opt/rh/ruby193/root/usr/bin/ruby -I/tmp/verifier/suites/serverspec -I/tmp/verifier/gems/gems/rspec-support-3.3.0/lib:/tmp/verifier/gems/gems/rspec-core-3.3.2/lib /tmp/verifier/gems/bin/rspec --pattern /tmp/verifier/suites/serverspec/\*\*/\*_spec.rb --color --format documentation --default-path /tmp/verifier/suites/serverspec
      
       Zabbix Agent Packages
         Package "zabbix-agent"
           should be installed
      
       Zabbix Agent Configuration
         File "/etc/zabbix/zabbix_agentd.conf"
           should be file
           should be owned by "zabbix"
           should be grouped into "zabbix"
      
       Finished in 0.10183 seconds (files took 0.33263 seconds to load)
       4 examples, 0 failures
      
       Finished verifying <default-centos-66> (0m1.12s).
-----> Kitchen is finished. (0m1.17s)

As you see, we only run the serverspec tests and everything is ok. 4 examples with 0 failures.
We can continue with creating serverspec tests and rerun the “kitchen verify” command till we are satisfied.

In theory, you’ll create the tests first before creating the playbook. With practice ….

At the end when you are ready, you’ll create an Jenkins job which pulls for changes from your git repository. You’ll create an job which has 2 “Execute shells” steps:

  • Install bundler and test-kitchen and run bundle install
  • Execute kitchen test

Installation of the gem step:s:

gem install bundler --no-rdoc --no-ri
gem install test-kitchen --no-rdoc --no-ri
bundle install

And 2nd build step:

kitchen test

Whoot! 🙂