Testing Ansible roles in a cluster setup with Docker and Molecule


This page is updated on 2017-05-01.

This is a follow-up to the previous 2 blog posts. In the first post we discussed some basic steps to test your Ansible role. In the 2nd post we extended our tests by using CI tooling and group vars. In this post, however, we configure Molecule for a setup that is not very common when testing Ansible roles.

This time we configure Molecule for a role that installs and configures a cluster on Docker, like MySQL or MongoDB. We don’t go into a specific role (as there are so many); I only give some information on how to do this. We only configure Molecule for this setup; I’m still busy with running specific Testinfra tests on a specific container.

Keep in mind that these actions are only needed when using {{ ansible_eth0.ipv4.address }} isn’t enough and you need a list with all IPs.

For configuring Molecule, we’ll have to change 2 files:

  1. molecule.yml
  2. playbook.yml


First we update the ‘molecule.yml’ file by configuring 3 (or more, depending on what you need) Docker containers. See the following example:

  - name: node1
    ansible_groups:
      - cluster_service
    image: milcom/centos7-systemd
    image_version: latest
    privileged: True
    port_bindings:
      3306: 3306
      4444: 4444
  - name: node2
    ansible_groups:
      - cluster_service
    image: milcom/centos7-systemd
    image_version: latest
    privileged: True
    port_bindings:
      3307: 3306
      4445: 4444
  - name: node3
    ansible_groups:
      - cluster_service
    image: milcom/centos7-systemd
    image_version: latest
    privileged: True
    port_bindings:
      3308: 3306
      4446: 4444

As you can see, I have added the ‘port_bindings’ configuration to the instances, which differs from the examples in the previous blog posts. For each container, we open a port on the host (before the ‘:’) and proxy it to a port in the Docker container (after the ‘:’).

In the above example, the ports are configured for a MySQL (or MariaDB) Galera cluster setup. You’ll have to update the ports configuration to your needs.
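The host-to-container port pattern used above (each node gets the next free host port, while the port inside the container stays the same) can be sketched in a few lines of Python. This is only an illustration of the numbering scheme; the function name `port_bindings` and the 0-based node index are my own assumptions and are not part of Molecule.

```python
def port_bindings(node_index, container_ports=(3306, 4444)):
    """Map host ports to container ports for the node with the given 0-based index.

    Hypothetical helper: the host port is the container port plus the node index,
    which is the scheme used in the molecule.yml example.
    """
    return {port + node_index: port for port in container_ports}


if __name__ == "__main__":
    for index in range(3):
        # Prints one mapping per node, host port first.
        print("node%d" % (index + 1), port_bindings(index))
```

For the second node (index 1) this gives host ports 3307 and 4445, matching the ‘node2’ entry in the example.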


Before we specify the roles in the ‘playbook.yml’, we add a ‘pre_tasks’ section. We add 2 blocks of code and discuss them one by one. First we add the following to the ‘pre_tasks’ part:

    - name: "Get ip node 1"
      local_action: shell docker inspect --format '{% raw %}{{ .NetworkSettings.IPAddress }}{% endraw %}' node1
      register: node_ip_1
      changed_when: False
    - name: "Get ip node 2"
      local_action: shell docker inspect --format '{% raw %}{{ .NetworkSettings.IPAddress }}{% endraw %}' node2
      register: node_ip_2
      changed_when: False
    - name: "Get ip node 3"
      local_action: shell docker inspect --format '{% raw %}{{ .NetworkSettings.IPAddress }}{% endraw %}' node3
      register: node_ip_3
      changed_when: False

We have added 3 tasks that run the same command, once for each container. We execute a ‘docker inspect’ command to get the IP address of the Docker container. As the docker inspect command uses the ‘{{ .NetworkSettings.IPAddress }}’ format, Ansible will try to replace it because it thinks it is a variable. Luckily, we can wrap it in ‘{% raw %}’ and ‘{% endraw %}’ and Ansible will no longer treat it as a variable.
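To see what the task actually runs, here is a rough Python sketch of the same ‘docker inspect’ call outside of Ansible. The helper names `inspect_command` and `container_ip` are made up for this illustration. Because the arguments are passed as a list and nothing templates them, the Go template needs no escaping here.

```python
import subprocess


def inspect_command(container):
    """Build the 'docker inspect' argv that prints a container's IP address."""
    return [
        "docker", "inspect",
        "--format", "{{ .NetworkSettings.IPAddress }}",
        container,
    ]


def container_ip(container):
    """Run the command and return the IP address as text.

    Needs the docker CLI and a running container, so it is not called here.
    """
    return subprocess.check_output(inspect_command(container)).decode().strip()


if __name__ == "__main__":
    # Only show the command; running it requires Docker.
    print(" ".join(inspect_command("node1")))
```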

We register a variable because we need the output of the ‘docker inspect’ command, which is the IP address. We also have to add the property ‘changed_when: False’.

What do you mean by the last one?

We have to fool Ansible with the ‘changed_when’ property for the ‘idempotence’ check. As this task runs every time and would report “changed”, the ‘idempotence’ check would fail because of it (because it sees that there are tasks with state “changed”).
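The idea behind an idempotence check can be sketched as: run the playbook a second time and fail if any host still reports changes. The Python below only illustrates that idea by reading ‘PLAY RECAP’ lines; it is not how Molecule implements the check, and the function names are my own.

```python
import re


def changed_count(recap_line):
    """Return the 'changed=' counter from one PLAY RECAP line."""
    match = re.search(r"changed=(\d+)", recap_line)
    if match is None:
        raise ValueError("no changed= counter in: %r" % recap_line)
    return int(match.group(1))


def is_idempotent(recap_lines):
    """A second run counts as idempotent when no host reports a change."""
    return all(changed_count(line) == 0 for line in recap_lines)


if __name__ == "__main__":
    second_run = ["node1 : ok=5 changed=0 unreachable=0 failed=0"]
    print(is_idempotent(second_run))
```

A task that always reports “changed”, like an unguarded ‘shell’ task, would make the counter non-zero on every run, which is why ‘changed_when: False’ is needed.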

Now we add the 2nd block of tasks to the ‘pre_tasks’, just after the first block of code we added earlier:

    - name: "Set fact"
      set_fact:
        node_ip: "{{ node_ip_1.stdout }}"
      when: inventory_hostname == 'node1'
    - name: "Set fact"
      set_fact:
        node_ip: "{{ node_ip_2.stdout }}"
      when: inventory_hostname == 'node2'
    - name: "Set fact"
      set_fact:
        node_ip: "{{ node_ip_3.stdout }}"
      when: inventory_hostname == 'node3'

With this block we again add 3 tasks, 1 task for each container, to create a fact. In this case I create a fact named ‘node_ip’, but you may name it differently. When I use ‘node_ip’ in my Ansible role, it gets the actual IP of the host.

You can also create a list with all the IPs if you need this in your role:

  - "{{ node_ip_1.stdout }}"
  - "{{ node_ip_2.stdout }}"
  - "{{ node_ip_3.stdout }}"

(You have to use the correct name for the list of course 😉 )

I don’t know if this is the correct way, but it works in my case. I have a Jenkins job that validates a role that is configured to run a 3 node setup (Elasticsearch, MariaDB and some others).

If you have another or a better way of doing this, please let me know!

Testing Ansible roles with Molecule, Testinfra and Docker


“On 2017-05-01 I’ve updated this post to the current situation. Some things were outdated and were removed.”

In some earlier posts I’ve described how you can use Test Kitchen for testing Ansible roles (this one and the one extending it). Test Kitchen was created for testing Chef cookbooks and, like Chef, Test Kitchen is a Ruby application. On this page we describe another tool for the same purpose. This tool is what you might see as a Python clone of Test Kitchen, but more specific to Ansible: Molecule (Github).

Molecule isn’t that old, only a few years, and when I browse the internet it is not yet really known in the community. Where Test Kitchen has many different drivers, Molecule supports a few backends, like Vagrant, Docker and OpenStack. With Molecule you can make use of Serverspec (like Test Kitchen), but you can also make use of ‘Testinfra’. Testinfra is, like Serverspec, a tool for writing unit tests, but it is written in Python.

Let’s dive into Molecule and create some tests for our role. On this page we make use of the Docker backend, so if you are following along, please install Docker.

Installing Molecule is really simple:

pip install molecule docker

Voila, it is installed. With the installation of Molecule, Testinfra is installed too. We have to provide the docker module as well, otherwise Molecule doesn’t know how to connect to the Docker daemon. Now we can configure an Ansible role. I used my ‘zabbix-agent’ role as the test case for the Test Kitchen setup, so I will use it again for Molecule.

When you haven’t created an Ansible role yet, instead of using the ansible-galaxy command you can use the following command:

molecule init --driver docker --role role_name

Just like ‘ansible-galaxy’, this creates some default directories and files, but it also gives us a starting point with a few extra files for testing this role with Molecule.

When you already have a working role and want to make use of Molecule, please execute the following command:

molecule init --driver docker

This will create several files specific to Molecule. No worries, we can also create these files manually. Let’s do that in a role and see what the files do.

File: <root>/molecule.yml

ansible:
  playbook: playbook.yml

driver:
  name: docker

docker:
  containers:
    - name: zabbix-01
      ansible_groups:
        - group1
      image: debian
      image_version: latest
      privileged: True

verifier:
  name: testinfra

This is the configuration file for Molecule. We specify which playbook Molecule will execute, in this case playbook.yml.
We specify that we want to make use of the Docker driver and that we have a Docker container configuration. In this case, we only have 1 Docker container specified. We use a Debian Docker container with the ‘latest’ tag. We name the container ‘zabbix-01’ and it is in the group ‘group1’. Finally, we configure Molecule to use Testinfra as the test tool.

File: <root>/playbook.yml

- hosts: all
  roles:
    - role: ansible-zabbix-agent

The playbook that is executed in the Docker container. This is a very basic one; we only have to specify the correct role.

File: <root>/tests/inventory

zabbix-01 ansible_connection=docker

The Ansible inventory file. Should be a known file to you 😉

File: <root>/tests/test_default.py

from testinfra.utils.ansible_runner import AnsibleRunner
testinfra_hosts = AnsibleRunner('.molecule/ansible_inventory').get_hosts('all')

def test_hosts_file(File):
    hosts = File('/etc/hosts')
    assert hosts.user == 'root'
    assert hosts.group == 'root' 

(Edit 2016-09-14: As of release 1.9, the first 2 lines should be present in the TestInfra script.)

This is the Testinfra Python file, containing the tests. After the init command, we have 1 test that checks the hosts file and verifies that the user and group of the file are ‘root’. We come back to this file later on by adding some more tests.

File: <root>/tests/test.yml

- hosts: localhost
  remote_user: root
  roles:
    - zabbix-agent-role

Now we have discussed the files.

We add some infra test checks in the ‘test_default.py’ file. We add the following 2 tests:

def test_zabbix_package(Package):
    zabbixagent = Package('zabbix-agent')
    assert zabbixagent.is_installed
    assert zabbixagent.version.startswith("1:3.0")

def test_zabbixagent_running_and_enabled(Service):
    zabbixagent = Service("zabbix-agent")
    # assert zabbixagent.is_running
    assert zabbixagent.is_enabled

These are 2 Python functions which are executed with Testinfra. With the first function, we validate that the package ‘zabbix-agent’ is installed. We also check that the version starts with: 1:3.0. If you have some experience with testing Python code, this might be familiar to you. Testinfra uses ‘pytest’ to execute the tests and validate the results.

The 2nd function validates the ‘zabbix-agent’ service. We make sure the service is enabled. As you see, I’ve commented out the check that the service is running. When that check is enabled, I get this error message:
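The ‘1:’ in front of the version is the Debian epoch, which is why the test checks for ‘1:3.0’ and not just ‘3.0’. As a small illustration (not part of Testinfra), the sketch below splits such a version string. The function name and the sample version string are made up for this example.

```python
def split_epoch(version):
    """Split a Debian version string into (epoch, rest); the epoch defaults to 0."""
    epoch, sep, rest = version.partition(":")
    if not sep:
        # No colon means no explicit epoch.
        return 0, version
    return int(epoch), rest


if __name__ == "__main__":
    # Hypothetical version string, only used to show the format.
    epoch, upstream = split_epoch("1:3.0.8-1+jessie")
    print(epoch, upstream.startswith("3.0"))
```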

Failed to get D-Bus connection: Unknown error -1

Strange, because I’ve configured the privileged mode on the Docker container. So maybe this is a bug or a misconfiguration on my part, but for now I leave it commented out until I find a solution for this.

Within the molecule.yml we have to update the docker container configuration by adding the following property for all the docker containers:

    required: True

Now that we have added it, we have to run “molecule destroy” and start again. The container will be recreated and we won’t get an error message about D-Bus.

Now we are ready to move on (I’m well aware that the 2 tests I added will not be enough; I’ll add more myself later on).

Molecule has several subcommands; let’s run molecule -h and see what is available:

No handlers could be found for logger "vagrant"
    molecule [-hv] <command> [<args>...]

    check         check playbook syntax
    create        create instances
    converge      create and provision instances
    idempotence   converge and check the output for changes
    test          run a full test cycle: destroy, create, converge, idempotency-check, verify and destroy instances
    verify        create, provision and test instances
    destroy       destroy instances
    status        show status of instances
    list          show available platforms, providers
    login         connects to instance via SSH
    init          creates the directory structure and files for a new Ansible role compatible with molecule

    -h --help     shows this screen
    -v --version  shows the version

We first start with the ‘check’ command:

[vagrant@localhost ansible-zabbix-agent]$ molecule check
No handlers could be found for logger "vagrant"

playbook: playbook.yml
[vagrant@localhost ansible-zabbix-agent]$ echo $?

Looks good. The check command validates that the playbook.yml doesn’t have any problems or syntax errors.
We can continue with the next command: create.

[vagrant@localhost ansible-zabbix-agent]$ molecule create
No handlers could be found for logger "vagrant"
 Building ansible compatible image ...
 Step 1 : FROM debian:latest

  ---> 1b088884749b

 Step 2 : RUN bash -c 'if [ -x "$(command -v apt-get)" ]; then apt-get update && apt-get install -y python sudo; fi'

  ---> Using cache

  ---> 8ef54383599a

 Step 3 : RUN bash -c 'if [ -x "$(command -v yum)" ]; then yum makecache fast && yum update -y && yum install -y python sudo; fi'

  ---> Running in 6d3142fa72aa

 Finished building molecule_local/debian:latest
 Creating container zabbix-01 with base image debian:latest ...
 Container created.

[vagrant@localhost ansible-zabbix-agent]$

Now we have created a Docker container on which we can install our Ansible role. We do that with the ‘converge’ subcommand.

[vagrant@localhost ansible-zabbix-agent]$ molecule converge
No handlers could be found for logger "vagrant"

PLAY [all] *********************************************************************

TASK [setup] *******************************************************************
ok: [zabbix-01]

TASK [ansible-zabbix-agent : Include OS-specific variables] ********************
ok: [zabbix-01]

TASK [ansible-zabbix-agent : Install the correct repository] *******************
skipping: [zabbix-01]

RUNNING HANDLER [ansible-zabbix-agent : restart zabbix-agent] ******************
changed: [zabbix-01]

PLAY RECAP *********************************************************************
zabbix-01                  : ok=12  changed=7    unreachable=0    failed=0

Nice, the role is installed correctly without any issues on the container. With Test Kitchen we had to use BATS to validate that the role is idempotent, but luckily Molecule has a simple subcommand for it: idempotence

Well, it seems that the Role has passed the idempotence test:

[vagrant@localhost ansible-zabbix-agent]$ molecule idempotence
No handlers could be found for logger "vagrant"
Idempotence test in progress (can take a few minutes)...
Idempotence test passed.

[vagrant@localhost ansible-zabbix-agent]$

Testing the role is going nicely, but we are not there yet. Now we need to use the ‘verify’ command to actually validate our role on the Docker container:

[vagrant@localhost ansible-zabbix-agent]$ molecule verify
No handlers could be found for logger "vagrant"
Trailing whitespace found in ./defaults/main.yml on lines: 35
Trailing newline found at the end of ./handlers/main.yml
Trailing whitespace found in ./library/zabbix_host.py on lines: 29
Trailing newline found at the end of ./library/zabbix_hostmacro.py
[vagrant@localhost ansible-zabbix-agent]$

Whoops, it seems it has found some issues. Let me fix that first, probably need to run the verify again after fixing it.

[vagrant@localhost ansible-zabbix-agent]$ molecule verify
No handlers could be found for logger "vagrant"

Executing testinfra tests found in tests/.
============================= test session starts ==============================
platform linux2 -- Python 2.7.5, pytest-2.9.2, py-1.4.31, pluggy-0.3.1
rootdir: /git/ansible/ansible-zabbix-agent/tests, inifile:
plugins: xdist-1.14, testinfra-1.4.0
collected 3 items

tests/test_default.py ...

=========================== 3 passed in 0.63 seconds ===========================

No serverspec tests found in spec/.

[vagrant@localhost ansible-zabbix-agent]$

After fixing it, everything seems to work fine. Nice!

Now we are done with the container, so we can execute molecule again, but with the ‘destroy’ subcommand, and the container will be deleted.

These were the basics for testing an Ansible role with Molecule, Docker and Testinfra. This page uses the ‘Debian’ Docker image, whereas I normally use CentOS. I have some issues making this work on CentOS (I get the same error message when I enable the Testinfra test that validates if the service is running). So maybe Molecule isn’t mature enough yet, but it is getting there.

I’ll update my Ansible roles so they use Molecule instead of Test Kitchen (No hard feelings ;-))

Extending the Ansible Test Kitchen tests with BATS tests


In one of the previous blog posts (this one) I described how you can test your Ansible roles with Test Kitchen and Serverspec. With this setup we were able to execute an Ansible role in a Docker container and validate the installation with Serverspec. Serverspec is a little bit limited, as we only tested the installation in a sort of technical way: a process is running, a port is open, a file is created and owned by user x, etc. Sometimes this isn’t enough to validate your setup; BATS is the answer.

“Bats is a TAP-compliant testing framework for Bash. It provides a simple way to verify that the UNIX programs you write behave as expected.
A Bats test file is a Bash script with special syntax for defining test cases. Under the hood, each test case is just a function with a description.”
So what does this look like? Let’s dive into this example:
#!/usr/bin/env bats

@test "Validate status code for login page" {
  run curl -s -o /dev/null -w "%{http_code}" http://zabbix.example.com/index.php
  [[ $output = "200" ]]
}

First we let the script know this is a Bats script. The 2nd line is the start of a test, and this line starts with the @ sign. Each test has a description, in this case: Validate status code for login page. The next line is the actual test: we run the curl command. The command needs to start with run, so Bats knows that an actual command should be executed. In this case, the output of the curl command is the HTTP status code, and this is checked in the 4th line. The $output variable contains the output of the command and in this case it will contain 200 (or something else, but then the test fails).

We can also check whether some string is found in the output of the test command; see the following example:

@test "Validate login page and search for \"Username\"" {
  run curl -s http://zabbix.example.com/index.php
  [[ $output =~ ">Username<" ]]
}

With this test, we curl a page and check if we can find the string “>Username<” in the output. If it is found, the test passes; otherwise it fails.

Just to be clear, you don’t have to use this output check for each check. You can also rely on the exit codes of the command. See this paragraph:

“Test cases consist of standard shell commands. Bats makes use of Bash’s errexit (set -e) option when running test cases. If every command in the test case exits with a 0 status code (success), the test passes. In this way, each line is an assertion of truth.”

(Source: https://github.com/sstephenson/bats)
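To make the difference between checking ‘$output’ and relying on the exit code a bit more concrete, here is a loose Python analogue of what the Bats ‘run’ helper gives you: an exit status and the combined output of a command. This is only a sketch to illustrate the concept; the function name `run` mirrors Bats, but the implementation is my own and is not taken from Bats.

```python
import subprocess
import sys


def run(argv):
    """Run a command and return (status, output).

    stdout and stderr are combined, similar to what Bats stores in $output.
    """
    proc = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return proc.returncode, proc.stdout.strip()


if __name__ == "__main__":
    # A stand-in command that prints 200, instead of a real curl call.
    status, output = run([sys.executable, "-c", "print(200)"])
    print(status, output)
```

A test can then assert on the output (like ‘[[ $output = "200" ]]’), on the status, or on both.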

Please check the Github page; there are some nice examples of how to write your tests. But the goal of this blog post is to use it with our Test Kitchen setup, so how do we continue?

As you might recall from the earlier mentioned blog post, we created the directory structure: test/integrations/default. In this directory we created a directory named serverspec. In this “default” directory we also create a directory named bats. Inside it we create the file with the extension .bats.

Now we are all set. 😃 Now we can execute “kitchen test” and once the Ansible role is installed, the bats suites will begin:

-----> Running bats test suite
        ✓ Validate status code for login page
        ✓ Validate login page and search for "Username"
        ✓ Validate if we can login with default credentials via API

The above example shows 3 tests and each test passed, as you can see from the check marks in front of them (otherwise we would see an ‘x’). Right after this, the Serverspec tests are executed.

Have fun! 😃

Installing zabbix-server with ansible


Not only do I have a Puppet module which can be freely used from the Forge, I also have some Ansible roles for Zabbix. This page describes installing the zabbix-server with the dj-wasabi.zabbix-server role. If you want to know how to install the zabbix-agent, please check this page.

You can find the role and some information on this page: https://galaxy.ansible.com/list#/roles/2070

This role works on the 3 main Linux operating systems:

  • RedHat
  • Debian
  • Ubuntu

So, if your server has one of these operating systems, you can continue. If you have another operating system and some Ansible knowledge, please add some improvements and create a Pull Request on Github. I always accept Pull Requests related to the Ansible roles.

When you want to install this role, you only have to execute the following command:

ansible-galaxy install dj-wasabi.zabbix-server

Now we need to set up everything, but before we do anything we need to know what kind of database server is going to be used. Zabbix Server can work with several different databases as backend. This Ansible role only works with the following databases:

  • PostgreSQL
  • MySQL

Before we see the examples, there is one main parameter which is always needed: zabbix_url

This is the URL on which the Zabbix interface is available and it should be an FQDN. By default the role creates an Apache Virtual Host configuration file with this FQDN as ServerName. If you set this parameter like this:

zabbix_url: zabbix.example.com

the web interface will be available at: http://zabbix.example.com


By default PostgreSQL is used as backend. Before we can use this role, we need to find and download an Ansible role for PostgreSQL that works on your operating system. In this example we use the following role: ‘galaxyprojectdotorg.postgresql’

The following is an example of a playbook for installing the ‘zabbix-server’ with a PostgreSQL database:

- hosts: zabbix-server
  roles:
    - role: galaxyprojectdotorg.postgresql
      postgresql_pg_hba_conf:
        - "host all all trust"
        - "host all all ::1/128 trust"
      postgresql_pg_hba_local_ipv4: false
      postgresql_pg_hba_local_ipv6: false
    - role: dj-wasabi.zabbix-server
      zabbix_url: zabbix.example.com
      zabbix_version: 2.4
      server_dbuser: zabbix-server
      server_dbpassword: zabbix-server

This is the minimum configuration for using this role with PostgreSQL as database. What might help to secure everything is to use a more difficult to guess password for the ‘server_dbuser’ 😉


Let’s use MySQL as backend now. The following example is used with the following role: ‘geerlingguy.mysql’:

- hosts: localhost
  roles:
    - role: geerlingguy.mysql
    - role: ansible-zabbix-server
      zabbix_url: zabbix.example.com
      zabbix_version: 2.4
      database_type: mysql
      database_type_long: mysql
      server_dbuser: zabbix-server
      server_dbpassword: zabbix-server

Same as for the example with PostgreSQL, use a different value for the server_dbpassword.

Other configurations

Don’t think that what you just saw is everything this role can configure. There are a lot of other configuration parameters that can be set. Keep in mind that all configuration options you’ll normally find in the ‘zabbix_server.conf’ configuration file can also be set with this role.

Let’s give an example:

When we need to set the StartPollers to value 10, we can update the MySQL playbook to look like this:

- hosts: localhost
  roles:
    - role: geerlingguy.mysql
    - role: ansible-zabbix-server
      zabbix_url: zabbix.example.com
      zabbix_version: 2.4
      database_type: mysql
      database_type_long: mysql
      server_dbuser: zabbix-server
      server_dbpassword: zabbix-server
      server_startpollers: 10

When the role is executed on the ‘zabbix-server’, we see the following in the configuration file:

### option: startpollers
#	number of pre-forked instances of pollers.

Keep in mind: lowercase the property name and prefix it with ‘server_’, and you have the variable for this Ansible role.

As this Ansible role isn’t perfect, please let me know if you encounter any problems by creating an issue. Pull Requests for bugs or new features are always welcome!
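The naming rule can be written down as a one-line function. This is only an illustration of the rule as described here; the function name `role_variable` is my own and the role itself does not ship such a helper.

```python
def role_variable(option):
    """Turn a zabbix_server.conf option name into the matching role variable name."""
    return "server_" + option.lower()


if __name__ == "__main__":
    print(role_variable("StartPollers"))  # server_startpollers
```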

Using Librarian-Ansible to install Ansible roles from Gitlab


I have some Ansible roles which I try to keep up to date, and these are on Github and on my personal Gitlab instance. Sometimes this takes a little bit longer than I want, but other projects need some attention too.

For my own personal environment I use Ansible too, and this lives in a separate git repository on my Gitlab server (Repository: environment/ansible.git). There is one thing that bugs me: my Ansible roles differ from the ones used in my personal Ansible setup. At the moment of writing, the ‘dj-wasabi.zabbix-agent’ role is at tag 0.2.1, but I use ‘0.0.2’ in my own Ansible setup (Oh, really that old?? 🙂 ).

There should be a solution for this. But before we continue, the solution should meet my goals:

  • All Ansible roles should have their own git repository in Gitlab,
  • All Ansible roles have their own Jenkins job, documentation and test cases,
  • I want to make use of tags or versions.

With this I can create specific tags/versions of a role and run some tests via Jenkins, like ‘testkitchen’. With ‘testkitchen’ we run the role on a Vagrant box or Docker container and see if everything runs fine. But for now, ‘testkitchen’ is out of scope for this.

I first looked at ‘ansible-galaxy’. It can use a ‘requirements.yml’ file which holds all information, like the location and even a version, so we can specify the correct role. After some testing, it turned out this only works when the repository is at Github.com.
The repository should also exist on the Galaxy itself. So for the Zabbix roles this could work, but I also have some roles created just for my own environment. These are specific and there is no need to upload them to the Galaxy or Github, so ‘ansible-galaxy’ will not work for me.

I found ‘librarian-ansible’. This could be something that works for me, but I didn’t find much information on the web. Yes, I did find that the Ansible creator Michael DeHaan isn’t a very big fan of it (https://groups.google.com/forum/#!msg/ansible-project/TawjChwaV08/3p6Zv24rMWgJ). So let’s try it anyway, maybe it makes a fan out of me ;-).

Installation is very simple, we have to install 1 gem:

wdijkerman@curiosity [ ~/git/environment/ansible ] (14:03:56 - Sat Aug 15)
 (master) > sudo gem install librarian-ansible
Successfully installed librarian-ansible-1.0.6
1 gem installed

Now we have to create the Ansiblefile. The Ansiblefile is used for declaring the roles and where these roles can be found. We can do this with the following command:

wdijkerman@curiosity [ ~/git/environment/ansible ] (14:24:19 - Sat Aug 15)
 (master) > librarian-ansible init
      create  Ansiblefile

It creates the “Ansiblefile” in the current directory. It already has some basic roles specified, but I don’t use them.

#!/usr/bin/env ruby
#^syntax detection

site "https://galaxy.ansible.com/api/v1";

role "kunik.deploy-upstart-scripts";

role "pgolm.ansible-playbook-monit",
  github: "pgolm/ansible-playbook-monit";

The default Ansiblefile shows that you can also make use of the Ansible Galaxy. The site is configured to use the Ansible Galaxy API, and when you only specify the “role” (in this case “kunik.deploy-upstart-scripts”), it will be downloaded from the Galaxy.

The 2nd example downloads the git repository “pgolm/ansible-playbook-monit”, which will be installed with the role name “pgolm.ansible-playbook-monit”. Nice, but I only need the “git” option. I start with the following:

#!/usr/bin/env ruby
#^syntax detection

role "zabbix-javagateway",
    git: "git@gitlab.dj-wasabi.local:ansible/zabbix-javagateway.git",
    ref: "0.1.0"

When I run ‘librarian-ansible install’ it clones the git repository and checks out the tag “0.1.0”. The role is now installed in the ‘librarian_roles/’ directory with the name “zabbix-javagateway”. But I want it in my roles directory, so I have to run the ‘librarian-ansible’ command again, but with the config option:

librarian-ansible config path roles --global

This sets the path to my “roles” directory. This directory is specified in my ansible.cfg and I want the roles there. So I run the ‘librarian-ansible install’ command again and the role is installed again. But what I didn’t know (or didn’t read on the very few sites that exist about librarian-ansible) is that it deletes the content of the directory. So all my roles which were in the ‘roles/’ directory were deleted. So: ‘git checkout roles/’, move all roles to their own git repository and start again! 🙂

Maybe I should add the “roles” directory to my .gitignore file. We don’t want to store all the roles in this repository too.

I think I’m going to be a fan of librarian-ansible. 🙂


Installing zabbix-agent with Ansible


Not only do I have a Puppet module for installing Zabbix, I also have some Ansible roles for this. At the moment there are 4 roles:

In this blog item, we talk about the “zabbix-agent” role. The latest version is 0.2.0.


Installing this role is very easy:

ansible-galaxy install dj-wasabi.zabbix-agent

It will be installed in your roles directory. The default is “/etc/ansible/roles” or whatever you have configured in the ansible.cfg file. After installation there is only 1 parameter (or 2 when you make use of active items) needed to make this role work:

agent_server: <IP_-_FQDN_OF_ZABBIX_SERVER>
agent_serveractive: <IP_-_FQDN_OF_ZABBIX_SERVER>

These need the IP address or the FQDN of the “zabbix-server”.


This role works on several operating systems/families:

  • RedHat
  • Debian
  • Ubuntu
  • OpenSuse

If you have an operating system/family which isn’t in the list above, you can create an issue on the Github page describing the request. I can’t guarantee that it will come, but I can try. Or if you do have some Ansible skills, please create a Pull Request and I would be happy to accept it. 🙂


So, what does the playbook look like? Like this:

- hosts: all
  sudo: yes
  roles:
   - role: dj-wasabi.zabbix-agent
     agent_server: <IP_-_FQDN_OF_ZABBIX_SERVER>
     agent_serveractive: <IP_-_FQDN_OF_ZABBIX_SERVER>

As you can see it is very basic and does the job very well. This only installs the agent on the specific server and configures the configuration file. But we really want to automate everything, right?


A few weeks ago I found a pull request for the “ansible-modules-extra” repository. This pull request had a few Ansible modules that let you use the Zabbix API to create or update host configuration. In the pull request there were something like 5 modules, but this Ansible role only uses 3 of them. With this role, you can create the following:

  • Host groups
  • The host itself
  • Macros for the host

For now, when the host is created, it will only create the “zabbix interface”. Maybe with the next release I’ll make sure you can also create SNMP, JMX and IPMI interfaces.

How do we have to configure it? Something like this. You will have to change it to your environment.

- hosts: wdserver00
  roles:
     - role: zabbix-agent
       zabbix_url: http://zabbix.example.com
       zabbix_api_use: true
       zabbix_api_user: Admin
       zabbix_api_pass: Zabbix
       zabbix_create_host: present
       zabbix_host_groups:
         - Linux servers
       zabbix_link_templates:
         - Template OS Linux
       zabbix_macros:
         - macro_key: apache_type
           macro_value: reverse_proxy

I’ll skip the first 2 parameters, as these are described earlier on this page.

zabbix_url: The URL on which the Zabbix web interface is available.
zabbix_api_user: The username which will connect to the API.
zabbix_api_pass: The password for the “zabbix_api_user” user.
zabbix_create_host: present if we want to create the host, absent if we want to delete it.
zabbix_host_groups: List of host groups this host belongs to.
zabbix_link_templates: List of templates which will be linked to the host.
zabbix_macros: List of key/value pairs of macros that will be used by the host.

When we run Ansible, we will see at the end of the run:

.. <skip> ..
TASK: [zabbix-agent | Create hostgroups] **************************************
ok: [wdserver00 ->]

TASK: [zabbix-agent | Create a new host or update an existing host's info] ****
changed: [wdserver00 ->]

TASK: [zabbix-agent | Updating host configuration with macros] ****************
changed: [wdserver00 ->] => (item={'macro_key': 'apache_type', 'macro_value': 'reverse_proxy'})

Nice! If you check the Web interface, you’ll see that the host is created with the correct host groups and templates. If not, you’ll see some error messages in the Ansible output which will say what went wrong.

This role isn’t perfect, so if you encounter a bug or have an enhancement, please create a pull request on GitHub and I’ll accept it. We can all make this role better. 🙂

Side note:

There are more parameters which can be overridden, please check the “defaults/main.yml” file or the README.

Ansible executing puppet agent


I manage my own environment with Ansible, which is really great! The YAML format for describing what you want to do is easy to read, easy to understand and even easy to maintain. Whether you want to automate a specific action or simply execute commands one by one, you can do it with Ansible.

So in my own home environment, I have to execute the puppet agent command a few times. My CI for the wdijkerman-zabbix environment consists of a few steps. One of those steps is executing the puppet agent command on a specific host. (Maybe I will describe my CI process in a blog item later.. 🙂 )

When you try to combine them, you’ll notice that every Ansible run that executes the puppet agent command fails. (No worries, I’ve been there before .. 🙂 ) A puppet agent run can end with different exit codes. Normally when a script, program or command ends successfully, it has an exit code of 0. Ansible uses this to determine if an action is ok, changed or failed. But puppet uses exit codes slightly differently.

According to the puppet agent man page, the --detailed-exitcodes option does the following:

Provide transaction information via exit codes. If this is enabled, an exit code of ‘2’ means there were changes, an exit code of ‘4’ means there were failures during the transaction, and an exit code of ‘6’ means there were both changes and failures.

With this in mind, we now have the following 2 tasks in Ansible:

  - name: "Start puppet agent"
    shell: /usr/bin/puppet agent --test --verbose --detailed-exitcodes
    register: puppet_agent
    changed_when: puppet_agent.rc == 2
    failed_when: puppet_agent.rc != 2 and puppet_agent.rc != 0

  - name: "puppet output"
    debug: var=puppet_agent.stdout_lines
    when: puppet_agent|failed

The first task is the most important one. We register a variable, which is used in the same task for checking the exit code. We let Ansible know that if the exit code of the puppet agent command is 2, the task is “changed”. If it is anything other than 0 or 2, the task is failed. That’s all!
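
The same check can also be written as a membership test, which may read a little easier when more exit codes have to be accepted. This is only an equivalent sketch of the first task, not a different behaviour: 0 and 2 are still the only exit codes that do not fail the task.

```yaml
  - name: "Start puppet agent"
    shell: /usr/bin/puppet agent --test --verbose --detailed-exitcodes
    register: puppet_agent
    # 2 means puppet applied changes
    changed_when: puppet_agent.rc == 2
    # anything other than 0 (no changes) or 2 (changes) counts as a failure
    failed_when: puppet_agent.rc not in [0, 2]
```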

The 2nd task only shows us some information when the first task has failed. I only want to see the output when the puppet agent run fails for some reason. You don’t have to use this task, as it only prints some information.
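
One thing to keep in mind: when a task fails, Ansible normally stops running further tasks on that host, so the “puppet output” task may never get the chance to run. A possible variation, given here as an untested sketch, is to add ignore_errors to the first task and fail the play explicitly afterwards. The third task and its message are assumptions for illustration and are not part of the original playbook.

```yaml
  - name: "Start puppet agent"
    shell: /usr/bin/puppet agent --test --verbose --detailed-exitcodes
    register: puppet_agent
    changed_when: puppet_agent.rc == 2
    failed_when: puppet_agent.rc != 2 and puppet_agent.rc != 0
    # assumption: keep going after a failure so the next task can print the output
    ignore_errors: yes

  - name: "puppet output"
    debug: var=puppet_agent.stdout_lines
    when: puppet_agent.rc != 2 and puppet_agent.rc != 0

  # hypothetical extra task: mark the run as failed after the output was shown
  - name: "Fail when the puppet agent run failed"
    fail:
      msg: "puppet agent exited with code {{ puppet_agent.rc }}"
    when: puppet_agent.rc != 2 and puppet_agent.rc != 0
```

The conditions compare the exit code directly, so they do not depend on how a particular Ansible version spells the “failed” check.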

Output of the Ansible playbook when everything is ok:

[puppet-zabbix-nightly-provision] $ /bin/sh -xe /tmp/hudson5840383976762038524.sh
+ cd /opt/jenkins/environment-ansible
+ ansible-playbook -i hosts -l vserver-142 playbook/puppet-run.yml

PLAY [vserver-142] ************************************************************ 

GATHERING FACTS ***************************************************************
ok: [vserver-142]

TASK: [Start puppet agent] ****************************************************
changed: [vserver-142]

TASK: [puppet output] *********************************************************
skipping: [vserver-142]

PLAY RECAP ********************************************************************
vserver-142                : ok=2    changed=1    unreachable=0    failed=0   

[puppet-zabbix-nightly-provision] $

Everything looks good, as I expected. Now an example of when something goes wrong:

[puppet-zabbix-nightly-provision] $ /bin/sh -xe /tmp/hudson1324121987798922302.sh
+ cd /opt/jenkins/environment-ansible
+ ansible-playbook -i hosts -l vserver-142 playbook/puppet-run.yml

PLAY [vserver-142] ************************************************************ 

GATHERING FACTS ***************************************************************
ok: [vserver-142]

TASK: [Start puppet agent] ****************************************************
failed: [vserver-142] => {"changed": false, "cmd": "/usr/bin/puppet agent --test --verbose --detailed-exitcodes", "delta": "0:00:04.745918", "end": "2015-01-31 15:08:06.708110", "failed": true, "failed_when_result": true, "rc": 1, "start": "2015-01-31 15:08:01.962192", "stdout_lines": ["\u001b[0;32mInfo: Retrieving pluginfacts\u001b[0m", "\u001b[0;32mInfo: Retrieving plugin\u001b[0m", "\u001b[0;32mInfo: Loading facts\u001b[0m"], "warnings": []}
stderr: Error: Could not retrieve catalog from remote server: Error 400 on SERVER: unrecognized database type for server. at /etc/puppet/environments/master/modules/zabbix/manifests/web.pp:161 on node vserver-142.dj-wasabi.local
Warning: Not using cache on failed catalog
Error: Could not retrieve catalog; skipping run
stdout: Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Loading facts

FATAL: all hosts have already failed -- aborting

PLAY RECAP ********************************************************************
           to retry, use: --limit @/var/lib/jenkins/puppet-run.retry

vserver-142                : ok=1    changed=0    unreachable=0    failed=1

Ah, I made an error in my manifest.

Nice isn’t it? 🙂