Zabbix nodata trigger, really a lifesaver

Zabbix offers a lot of trigger functions you can use in your environment. But for the most important checks, I now extend the existing trigger with an “or” condition that uses the “nodata” function.

It happened to me a few times that Zabbix didn’t notify me about checks that should have gone into PROBLEM because the service/program wasn’t responding. In all of these cases, I had the trigger configured with the “last” function. All of them reported OK, but the latest update was at least 30 minutes old (some even older!). I slept well, but the start of my day wasn’t really good. 🙂

I found that most of these checks were items backed by scripts written by myself or by my colleagues. These were simple scripts in bash or python that had no proper way of exiting on a timeout.

So the first and easiest fix is to extend the trigger with the “nodata” function. I use a “default” of 5 minutes: if no data has arrived in that time, the trigger may fire. A good example is the trigger for the zabbix-agent:

{Template App Zabbix Agent:agent.ping.nodata(5m)}=1

When no data has been received in the last 5 minutes, the trigger fires. So I’ve updated the trigger for my Apache template like this:

{Template Apache:apache.run.status[{HOST.NAME},'localhost'].last()}=0 or {Template Apache:apache.run.status[{HOST.NAME},'localhost'].nodata(5m)}=1

If the last value of the item apache.run.status is 0, or no data has been received in the last 5 minutes, we can assume Apache is down.
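To make the logic of that combined expression concrete, here is a small Python model of it. This is only an illustration of the “last or nodata” idea, not how Zabbix evaluates triggers internally; the function name, its parameters and the exact boundary handling are my own assumptions.

```python
def trigger_fires(last_value, seconds_since_last_update, nodata_window=300):
    """Rough model of: last()=0 or nodata(5m)=1 (hypothetical helper).

    last_value: most recent item value, or None if nothing was ever received.
    seconds_since_last_update: age of that value in seconds, or None if unknown.
    nodata_window: the nodata period in seconds (300 = 5 minutes).
    """
    # nodata part: no value at all, or the newest value is older than the window
    no_data = (
        seconds_since_last_update is None
        or seconds_since_last_update > nodata_window
    )
    # last part: the check itself reported "down"
    return last_value == 0 or no_data
```

With only the “last” condition, a value of 1 that is 30 minutes old still looks healthy. In this model the stale value makes the trigger fire, which is exactly the case that bit me.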

But these scripts need to be able to exit within the Timeout parameter in the configuration file. So each script should have a timeout of its own, and when that timeout occurs, it should print something Zabbix can handle. (Not a python stacktrace, for example. 🙂 )
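A minimal sketch of such a script in Python could look like the one below. It wraps a check command, gives it its own timeout, and always reports a plain 1 or 0 instead of hanging or dumping a stacktrace. The function name, the 3 second default and the example command in the comment are assumptions for illustration; pick a timeout that is shorter than the Timeout in your own agent configuration.

```python
import subprocess
import sys


def check_status(command, timeout_seconds=3.0):
    """Return 1 if the command exits with code 0 in time, otherwise 0."""
    try:
        result = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,  # keep the check's own output away from Zabbix
            stderr=subprocess.DEVNULL,
            timeout=timeout_seconds,    # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        # The check hung: report "down" instead of blocking the agent
        return 0
    except OSError:
        # The command could not be started (missing binary, permissions, ...)
        return 0
    return 1 if result.returncode == 0 else 0


if __name__ == "__main__":
    # Hypothetical usage: check_wrapper.py pgrep -x apache2
    if len(sys.argv) > 1:
        print(check_status(sys.argv[1:]))
```

Whether a hung check should count as 0 (“down”) or be reported some other way is a choice. I report 0 here so that the “last” part of the trigger catches it right away, while “nodata” stays as the safety net for cases where the script never gets to run at all.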

I know the “nodata” function has a few drawbacks. It does add some extra load on the server for these checks. I believe the “nodata” triggers are evaluated every 30 seconds, so I try to use nodata only for the critical (or Disaster in Zabbix terms 🙂 ) triggers.

Another drawback, which (hopefully) doesn’t happen that often: when the Zabbix-server is in some kind of maintenance or has had problems (or when you have a Zabbix proxy which doesn’t send its data to the server), a lot of triggers fire once the nodata period has passed. So say you just updated the Zabbix-server with the latest (linux/kernel) patches and it is up and running again after 10 minutes: all of those nodata triggers fire. 🙂

But anyway, it is a lifesaver! 🙂
