Nagios Passive Checks

By 默北 · Category: Nagios

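By default, Nagios polls: the server actively checks each monitored item on the client. This article covers Nagios passive checks, in which the client itself submits its check results directly to the Nagios server.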

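In some environments, passive checks work better than active ones. Monitoring whether a data backup succeeded is one example. In a previous job of mine, the backup wrote its result to a file, and the Nagios client inspected that file to decide whether the backup had succeeded. The problem was that once Nagios detected a failed backup, it kept sending alert notifications for the rest of the backup cycle, which was a real nuisance. Passive checks combined with freshness checking handle this case well.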

Here is how to configure passive monitoring:

1. Enable passive checks

# vim /usr/local/nagios/etc/nagios.cfg
accept_passive_service_checks=1
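
A quick way to confirm the directive is really active (present and not commented out) is to grep the main config file. The helper name `setting_enabled` is made up for this sketch. It also makes sense to check `check_external_commands`, a separate nagios.cfg directive not covered in this article that controls whether Nagios reads the external command file at all.

```shell
# setting_enabled CFG NAME
# Succeeds only if CFG contains an uncommented "NAME=1" line.
# (Hypothetical helper, not part of Nagios.)
setting_enabled() {
    grep -Eq "^$2[[:space:]]*=[[:space:]]*1[[:space:]]*\$" "$1"
}

# Example (paths as used in this article):
#   cfg=/usr/local/nagios/etc/nagios.cfg
#   setting_enabled "$cfg" accept_passive_service_checks || echo "passive service checks are off"
#   setting_enabled "$cfg" check_external_commands       || echo "external commands are off"
```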

2. Define the passive check command

# vim /usr/local/nagios/etc/objects/commands.cfg
define command {
    command_name check_dummy
    command_line $USER1$/check_dummy $ARG1$ $ARG2$
}

3. Define the service to monitor passively

# vim /usr/local/nagios/etc/ttlsa.com.cfg
define service {
    use generic-service
    host_name www.ttlsa.com
    service_description BACKUP
    active_checks_enabled 0 
    passive_checks_enabled 1
    check_freshness 1
    freshness_threshold 86400
    check_command check_dummy!1!"Backup failed: no backups have run for 24 hours"
}

The check_dummy command does not actually check anything. It takes two arguments, a state and an output string, and always returns exactly those two values.

# /usr/local/nagios/libexec/check_dummy 0 successful
OK: successful
# /usr/local/nagios/libexec/check_dummy 1 failed
WARNING: failed
# /usr/local/nagios/libexec/check_dummy 2 failed
CRITICAL: failed
# /usr/local/nagios/libexec/check_dummy 3 failed
UNKNOWN: failed
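
The behavior shown above can be imitated in a few lines of shell. This is only an illustrative stand-in, not the real plugin's source; the function name `dummy_check` and the wording of the error message for an invalid state are assumptions.

```shell
# dummy_check STATE [TEXT]
# Prints "<LABEL>: TEXT" and returns STATE as the exit code,
# which is how Nagios reads the result of any plugin.
dummy_check() {
    local state="$1"
    local text="$2"
    local label
    case "$state" in
        0) label=OK ;;
        1) label=WARNING ;;
        2) label=CRITICAL ;;
        3) label=UNKNOWN ;;
        *) echo "UNKNOWN: unsupported state $state"
           return 3 ;;
    esac
    echo "$label: $text"
    return "$state"
}
```

The key point is that the exit code, not the text, carries the state: Nagios maps 0, 1, 2 and 3 to OK, WARNING, CRITICAL and UNKNOWN.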

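If no passive check result has been submitted within freshness_threshold seconds, Nagios runs check_command, even though active checks are disabled. With the definition above, the BACKUP service therefore goes to WARNING if no result arrives for 24 hours (86400 seconds).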
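
The freshness rule boils down to comparing the age of the last result with the threshold. The sketch below is a simplification under that assumption (Nagios itself applies some extra latency allowances), and `is_stale` is a name invented for the example.

```shell
# is_stale LAST_CHECK THRESHOLD [NOW]
# Succeeds when the last result is older than THRESHOLD seconds.
# All times are UNIX timestamps; NOW defaults to the current time.
is_stale() {
    local last_check="$1"
    local threshold="$2"
    local now="${3:-$(date +%s)}"
    [ $(( now - last_check )) -gt "$threshold" ]
}

# Example: a result from 25 hours ago is stale with a 24-hour threshold.
#   is_stale "$(( $(date +%s) - 90000 ))" 86400 && echo "stale: check_command would run"
```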

4. How external applications submit check results

An external application can submit a host check result to Nagios by writing a PROCESS_HOST_CHECK_RESULT external command to the external command file. The command format is:

[<timestamp>] PROCESS_HOST_CHECK_RESULT;<host_name>;<host_status>;<plugin_output>

timestamp: the UNIX timestamp at which the check was performed

host_name: the name of the monitored host, as given in its host definition

host_status: the state of the host (0 = UP, 1 = DOWN, 2 = UNREACHABLE)

plugin_output: the text output of the host check

Service check results use the PROCESS_SERVICE_CHECK_RESULT command, which adds the service description and takes a service return code (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN):

[<timestamp>] PROCESS_SERVICE_CHECK_RESULT;<host_name>;<svc_description>;<return_code>;<plugin_output>

For example, for the BACKUP service:

# CHECK="[$(date +%s)] PROCESS_SERVICE_CHECK_RESULT;10.0.0.166;BACKUP;0;Nightly backups were successful"
# echo "$CHECK" >> /usr/local/nagios/var/rw/nagios.cmd
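
In a backup script, the two commands above are convenient to wrap in a function. The following is a minimal sketch; the function names and the `NAGIOS_CMD_FILE` variable are made up for illustration, and the default path is the one used in this article.

```shell
# Path of the external command file (override via the environment).
NAGIOS_CMD_FILE="${NAGIOS_CMD_FILE:-/usr/local/nagios/var/rw/nagios.cmd}"

# format_service_result HOST SERVICE CODE OUTPUT [TIMESTAMP]
# Prints one PROCESS_SERVICE_CHECK_RESULT line.
format_service_result() {
    local host="$1"
    local service="$2"
    local code="$3"
    local output="$4"
    local ts="${5:-$(date +%s)}"
    printf '[%s] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%s;%s\n' \
        "$ts" "$host" "$service" "$code" "$output"
}

# submit_service_result HOST SERVICE CODE OUTPUT [TIMESTAMP]
# Appends the formatted line to the command file.
submit_service_result() {
    format_service_result "$@" >> "$NAGIOS_CMD_FILE"
}

# Example use at the end of a backup job (run_backup is a placeholder):
#   if run_backup; then
#       submit_service_result 10.0.0.166 BACKUP 0 "Nightly backups were successful"
#   else
#       submit_service_result 10.0.0.166 BACKUP 2 "Nightly backup failed"
#   fi
```

Using printf with quoted arguments avoids the shell treating `[1402765227]` as a glob pattern, which can happen with an unquoted `echo $CHECK`.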

5. Passive checks from remote clients

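On the Nagios server itself, passive check results can be submitted directly with the external command shown above. An application on a remote host cannot write to that file, so to let remote hosts send passive check results to Nagios, use the NSCA add-on. It consists of a daemon that runs on the Nagios server and a client program that runs on the remote hosts. The daemon listens for connections from remote clients, performs some basic validation on the submitted results, and then writes the check results directly to the external command file.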

NSCA must be installed on both the Nagios server and the client.

Server:

# wget http://downloads.sourceforge.net/project/nagios/nsca-2.x/nsca-2.7.2/nsca-2.7.2.tar.gz
# tar -xzf nsca-2.7.2.tar.gz
# cd nsca-2.7.2
# ./configure
# make
# cp src/nsca /usr/local/nagios/bin
# cp sample-config/nsca.cfg /usr/local/nagios/etc
# vi /usr/local/nagios/etc/nsca.cfg
password=www.ttlsa.com
# /usr/local/nagios/bin/nsca -c  /usr/local/nagios/etc/nsca.cfg --single

Client:

# wget http://downloads.sourceforge.net/project/nagios/nsca-2.x/nsca-2.7.2/nsca-2.7.2.tar.gz
# tar -xzf nsca-2.7.2.tar.gz
# cd nsca-2.7.2
# ./configure
# make
# mkdir -p /usr/local/bin /usr/local/etc
# cp src/send_nsca /usr/local/bin
# cp sample-config/send_nsca.cfg /usr/local/etc
# vi /usr/local/etc/send_nsca.cfg
password=www.ttlsa.com
# CHECK="10.0.0.166\tBACKUP\t0\tBackup was successful, this check submitted by NSCA\n"
# echo -en "$CHECK" | send_nsca -c /usr/local/etc/send_nsca.cfg -H 10.0.100.125
1 data packet(s) sent to host successfully.

10.0.0.166 is the monitored server; 10.0.100.125 is the monitoring server.
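
For use from scripts, the send_nsca call can be wrapped as below. This is a sketch with invented function names (`nsca_line`, `send_result`); it relies only on the `-H` and `-c` options already used above and on the config path from this article.

```shell
# nsca_line HOST SERVICE CODE OUTPUT
# Prints one tab-delimited, newline-terminated record for send_nsca.
nsca_line() {
    printf '%s\t%s\t%s\t%s\n' "$1" "$2" "$3" "$4"
}

# send_result SERVER HOST SERVICE CODE OUTPUT
# Sends one result to the NSCA daemon on SERVER.
send_result() {
    local server="$1"
    shift
    nsca_line "$@" | send_nsca -H "$server" -c /usr/local/etc/send_nsca.cfg
}

# Example:
#   send_result 10.0.100.125 10.0.0.166 BACKUP 0 "Backup was successful"
```

printf with `%s` is used instead of `echo -en` so that backslashes in the output text are passed through unchanged and the behavior does not depend on which shell provides echo.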

The NSCA daemon on the monitoring server listens for service check results submitted by the send_nsca client, and verifies that the password is correct and that the data is in the expected format. The format is:

<host_name>\t<service_description>\t<check_result>\t<check_output>\n

check_result: the check state (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN)

Once the submitted data is received, it is translated and written to the external command file /usr/local/nagios/var/rw/nagios.cmd, where Nagios processes it like a locally submitted passive check.
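
Conceptually, the translation maps each tab-delimited record to one external command line. The sketch below only illustrates that mapping; it is not how the NSCA daemon is implemented, and the function name is hypothetical.

```shell
# nsca_to_external_command [TIMESTAMP]
# Reads tab-delimited records on stdin and prints the equivalent
# PROCESS_SERVICE_CHECK_RESULT lines. Records with an invalid
# check_result are skipped with a message on stderr.
nsca_to_external_command() {
    local ts="${1:-$(date +%s)}"
    local tab host service code output
    tab="$(printf '\t')"
    while IFS="$tab" read -r host service code output; do
        case "$code" in
            0|1|2|3) ;;
            *) echo "skipping record with invalid check_result: $code" >&2
               continue ;;
        esac
        printf '[%s] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%s;%s\n' \
            "$ts" "$host" "$service" "$code" "$output"
    done
}
```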

  • Published by 默北 on 15/06/2014 01:00:27
  • When reposting, please keep the link to this article: https://www.ttlsa.com/nagios/nagios-passive-detection/