Installing zabbix-agent with Ansible


Not only do I have a Puppet module for installing Zabbix, I also have some Ansible roles for it. At the moment there are 4 roles.

In this blog item, we talk about the “zabbix-agent” role. The latest version is 0.2.0.

Defaults

Installing this role is very easy:


ansible-galaxy install dj-wasabi.zabbix-agent

It will be installed in your roles directory. The default is “/etc/ansible/roles”, or whatever you have configured in the ansible.cfg file. After installation, only 1 parameter (or 2, if you make use of active items) is needed to make this role work:

agent_server: <IP_-_FQDN_OF_ZABBIX_SERVER>
agent_serveractive: <IP_-_FQDN_OF_ZABBIX_SERVER>

Both parameters take the IP address or the FQDN of the “zabbix-server”.
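
If you would rather not repeat these values in every playbook, one option is to set them once as group variables. A minimal sketch, assuming a hypothetical group_vars/all.yml file next to your playbook; 192.0.2.10 is just a placeholder address:

```yaml
# group_vars/all.yml (hypothetical location; any group_vars file that applies to your hosts works)
# Placeholder address; replace it with the IP or FQDN of your zabbix-server.
agent_server: 192.0.2.10
# Only needed when you make use of active items.
agent_serveractive: 192.0.2.10
```

Group variables take precedence over the defaults of the role, so with this in place the playbook only has to list the role itself.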

OS?

This role works on several operating systems/families:

  • RedHat
  • Debian
  • Ubuntu
  • OpenSuse

If your operating system/family isn’t in the list above, you can create an issue on the GitHub page with your request. I can’t guarantee that support will come, but I can try. Or, if you have some Ansible skills, please create a pull request and I would be happy to accept it. 🙂

Playbook

So, what does the playbook look like? Like this:

- hosts: all
  sudo: yes
  roles:
   - role: dj-wasabi.zabbix-agent
     agent_server: <IP_-_FQDN_OF_ZABBIX_SERVER>
     agent_serveractive: <IP_-_FQDN_OF_ZABBIX_SERVER>

As you can see, it is very basic and does the job very well. This only installs the agent on the specific server and sets up its configuration file. But we really want to automate everything, right?

Cove

A few weeks ago I found a pull request for the “ansible-modules-extra” repository. This pull request contained a few Ansible modules that let you use the Zabbix API to create or update host configuration. There were something like 5 modules in the pull request, but this Ansible role only uses 3 of them. With this role, you can create the following:

  • Host groups
  • The host itself
  • Macros for the host

For now, when the host is created, it will only create the “zabbix interface”. Maybe with the next release I’ll make sure you can also create SNMP, JMX and IPMI interfaces.

How do we configure it? Something like this; you will have to adapt it to your environment.

- hosts: wdserver00
  roles:
     - role: zabbix-agent
       agent_server: 192.168.1.1
       agent_serveractive: 192.168.1.1
       zabbix_url: http://zabbix.example.com
       zabbix_api_use: true
       zabbix_api_user: Admin
       zabbix_api_pass: Zabbix
       zabbix_create_host: present
       zabbix_host_groups:
         - Linux servers
       zabbix_link_templates:
         - Template OS Linux
       zabbix_macros:
         - macro_key: apache_type
           macro_value: reverse_proxy

I’ll skip the first 2 parameters, as these are described earlier on this page.

zabbix_url: The url on which the Zabbix web interface is available.
zabbix_api_user: The username which will connect to the API.
zabbix_api_pass: The password for the “zabbix_api_user” user.
zabbix_create_host: present if we want to create the host, absent if we want to delete it.
zabbix_host_groups: List of host groups this host belongs to.
zabbix_link_templates: List of templates which will be linked to the host.
zabbix_macros: List of macros for the host, each given as a macro_key/macro_value pair.

When we run Ansible, we will see at the end of the run:

.. <skip> ..
TASK: [zabbix-agent | Create hostgroups] **************************************
ok: [wdserver00 -> 127.0.0.1]

TASK: [zabbix-agent | Create a new host or update an existing host's info] ****
changed: [wdserver00 -> 127.0.0.1]

TASK: [zabbix-agent | Updating host configuration with macros] ****************
changed: [wdserver00 -> 127.0.0.1] => (item={'macro_key': 'apache_type', 'macro_value': 'reverse_proxy'})

Nice! If you check the Web interface, you’ll see that the host is created with the correct host groups and templates. If not, you’ll see some error messages in the Ansible output which will say what went wrong.
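
Removing the host again should be a matter of flipping zabbix_create_host, as described in the parameter list above. This is only a sketch that reuses the example values from this page: which parameters the role actually requires for a delete is an assumption here, so check the README before relying on it.

```yaml
- hosts: wdserver00
  roles:
     - role: zabbix-agent
       agent_server: 192.168.1.1
       agent_serveractive: 192.168.1.1
       zabbix_url: http://zabbix.example.com
       zabbix_api_use: true
       zabbix_api_user: Admin
       zabbix_api_pass: Zabbix
       # absent asks the role to delete the host from Zabbix instead of creating it
       zabbix_create_host: absent
```

Note that the rest of the role still runs, so the agent itself stays installed and configured on the server.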

This role isn’t perfect, so if you encounter a bug or have an enhancement, please create a pull request on GitHub and I’ll accept it. We can all make this role better. 🙂

Side note:

There are more parameters that can be overridden; please check the “defaults/main.yml” file or the README.


Ansible executing puppet agent


I manage my own environment with Ansible, which is really great! The YAML format for describing what you want to do is easy to read, understand and maintain. Whether you want to automate a specific action or simply execute commands one by one, you can do it with Ansible.

So in my own home environment, I have to execute the puppet agent command a few times. My CI for the wdijkerman-zabbix environment consists of a few steps. One of those steps is executing the puppet agent command on a specific host. (Maybe I will describe my CI process in a blog item later.. 🙂 )

When you try to combine them, you’ll notice that Ansible marks the puppet agent command as failed whenever puppet changed something. (No worries, I was there before .. 🙂 ) A puppet agent run can end with different exit codes. Normally, when a script, program or command ends successfully, it has an exit code of 0. Ansible uses this to determine whether an action is ok, changed or failed. But puppet uses exit codes slightly differently.

According to the puppet agent man page, for the --detailed-exitcodes option:

Provide transaction information via exit codes. If this is enabled, an exit code of ‘2’ means there were changes, an exit code of ‘4’ means there were failures during the transaction, and an exit code of ‘6’ means there were both changes and failures.

With this in mind, we now have the following 2 tasks in Ansible:

  - name: "Start puppet agent"
    shell: /usr/bin/puppet agent --test --verbose --detailed-exitcodes
    register: puppet_agent
    changed_when: puppet_agent.rc == 2
    failed_when: puppet_agent.rc != 2 and puppet_agent.rc != 0

  - name: "puppet output"
    debug: var=puppet_agent.stdout_lines
    when: puppet_agent|failed

The first task is the most important one. We register a variable, which is used in this same task for checking the exit code. We let Ansible know that if the exit code of the puppet agent command is 2, the task is “changed”. If it is anything other than 0 or 2, the task is failed. That’s all!

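The 2nd task only shows us some information when the first task has failed: I only want to see the output when the puppet agent run fails for some reason. You don’t have to use this task, as it only prints some information. Keep in mind that Ansible by default stops running tasks on a host as soon as one fails, so this task is only reached when the first task is allowed to continue after a failure (for example with ignore_errors: yes); the failed task itself already prints the puppet output, as the failing run further down shows.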
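
As a variation, the same exit code handling can be wrapped in a block with a rescue section, so that the debug task really runs after a failure. This is a sketch under the assumption that you run Ansible 2.0 or newer (where block/rescue is available); the third task name and its message are made up for the illustration. The outputs shown below still come from the original two tasks.

```yaml
  - block:
      - name: "Start puppet agent"
        command: /usr/bin/puppet agent --test --verbose --detailed-exitcodes
        register: puppet_agent
        # exit code 2 means puppet applied changes
        changed_when: puppet_agent.rc == 2
        # anything other than 0 (no changes) or 2 (changes) counts as a failure
        failed_when: puppet_agent.rc not in [0, 2]
    rescue:
      # runs only when a task inside the block has failed
      - name: "puppet output"
        debug:
          var: puppet_agent.stdout_lines

      # hypothetical extra task: fail again so the play still ends as failed
      - name: "Mark the run as failed"
        fail:
          msg: "puppet agent exited with code {{ puppet_agent.rc }}"
```

The registered variable is filled in even when the task fails, which is why the rescue section can still read puppet_agent.rc and puppet_agent.stdout_lines.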

Output of the Ansible playbook when everything is ok:

[puppet-zabbix-nightly-provision] $ /bin/sh -xe /tmp/hudson5840383976762038524.sh
+ cd /opt/jenkins/environment-ansible
+ ansible-playbook -i hosts -l vserver-142 playbook/puppet-run.yml

PLAY [vserver-142] ************************************************************ 

GATHERING FACTS ***************************************************************
ok: [vserver-142]

TASK: [Start puppet agent] ****************************************************
changed: [vserver-142]

TASK: [puppet output] *********************************************************
skipping: [vserver-142]

PLAY RECAP ********************************************************************
vserver-142                : ok=2    changed=1    unreachable=0    failed=0   

[puppet-zabbix-nightly-provision] $

Everything looks good, as I expected. Now an example of when something goes wrong:

[puppet-zabbix-nightly-provision] $ /bin/sh -xe /tmp/hudson1324121987798922302.sh
+ cd /opt/jenkins/environment-ansible
+ ansible-playbook -i hosts -l vserver-142 playbook/puppet-run.yml

PLAY [vserver-142] ************************************************************ 

GATHERING FACTS ***************************************************************
ok: [vserver-142]

TASK: [Start puppet agent] ****************************************************
failed: [vserver-142] => {"changed": false, "cmd": "/usr/bin/puppet agent --test --verbose --detailed-exitcodes", "delta": "0:00:04.745918", "end": "2015-01-31 15:08:06.708110", "failed": true, "failed_when_result": true, "rc": 1, "start": "2015-01-31 15:08:01.962192", "stdout_lines": ["\u001b[0;32mInfo: Retrieving pluginfacts\u001b[0m", "\u001b[0;32mInfo: Retrieving plugin\u001b[0m", "\u001b[0;32mInfo: Loading facts\u001b[0m"], "warnings": []}
stderr: Error: Could not retrieve catalog from remote server: Error 400 on SERVER: unrecognized database type for server. at /etc/puppet/environments/master/modules/zabbix/manifests/web.pp:161 on node vserver-142.dj-wasabi.local
Warning: Not using cache on failed catalog
Error: Could not retrieve catalog; skipping run
stdout: Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Loading facts

FATAL: all hosts have already failed -- aborting

PLAY RECAP ********************************************************************
           to retry, use: --limit @/var/lib/jenkins/puppet-run.retry

vserver-142                : ok=1    changed=0    unreachable=0    failed=1

Ah, I made an error in my manifest.

Nice, isn’t it? 🙂