In Monitoring the VMware ESXi with Nagios (1), I installed VMware-vSphere-SDK-for-Perl-4.0.0-161974.i386.tar.gz and check_vmware_api.pl, but the Nagios::Plugin perl module was not installed, so this time I install it.
How do I use the Nagios::Plugin perl module?
http://nagiosplugins.org/faq/development/nagios-plugin-perl
There are two ways to install it:
■from the nagiosplug tarball with an extra configure option
■from CPAN
I don't really understand CPAN (^^; so I install from the tarball.
Official Nagios Plugins
http://www.nagios.org/download/plugins
- Extract nagios-plugins-1.4.15.tar.gz
- configure
- make all
- make install
- Run check_vmware_api.pl
- Modify check_vmware_api.pl
- Run check_vmware_api.pl
- Check VMware ESXi
[root@host1 src]# tar xvfz nagios-plugins-1.4.15.tar.gz
[root@host1 src]# cd nagios-plugins-1.4.15
Run configure with the --enable-perl-modules option.
[root@host1 nagios-plugins-1.4.15]# ./configure --enable-perl-modules
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... /bin/mkdir -p
checking for gawk... gawk
(snip)
--with-trusted-path: /bin:/sbin:/usr/bin:/usr/sbin
--enable-libtap: no
[root@host1 nagios-plugins-1.4.15]# make all
make all-recursive
make[1]: Entering directory `/usr/local/src/nagios-plugins-1.4.15'
Making all in gl
make[2]: Entering directory `/usr/local/src/nagios-plugins-1.4.15/gl'
(snip)
make[2]: Leaving directory `/usr/local/src/nagios-plugins-1.4.15'
make[1]: Leaving directory `/usr/local/src/nagios-plugins-1.4.15'
The Nagios::Plugin perl module gets installed under /usr/local/nagios/perl.
[root@host1 nagios-plugins-1.4.15]# make install
Making install in gl
make[1]: Entering directory `/usr/local/src/nagios-plugins-1.4.15/gl'
make install-recursive
make[2]: Entering directory `/usr/local/src/nagios-plugins-1.4.15/gl'
(snip)
make[2]: Leaving directory `/usr/local/src/nagios-plugins-1.4.15'
make[1]: Leaving directory `/usr/local/src/nagios-plugins-1.4.15'
Now that Nagios::Plugin is installed, I ran check_vmware_api.pl again, but it still fails with the same "Can't locate Nagios/Plugin/Functions.pm" error.
[root@host1 ~]# /usr/local/src/libexec/check_vmware_api.pl
Can't locate Nagios/Plugin/Functions.pm in @INC (@INC contains: /usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.8 /usr/lib/perl5/site_perl /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.8 /usr/lib/perl5/vendor_perl /usr/lib/perl5/5.8.8/i386-linux-thread-multi /usr/lib/perl5/5.8.8 .) at libexec/check_vmware_api.pl line 31.
BEGIN failed--compilation aborted at libexec/check_vmware_api.pl line 31.
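The message says the module is not in any of the @INC directories. A minimal sketch for checking this without running the whole plugin follows. load_check is a hypothetical helper name, and the directory is the install location mentioned earlier; it relies only on perl's standard -I and -M switches.

```shell
#!/bin/sh
# load_check DIR MODULE: hypothetical helper. Succeeds only if perl can load
# MODULE once DIR has been added to the module search path with -I.
load_check() {
    perl -I"$1" -M"$2" -e 1 2>/dev/null
}

# Directory assumed from the "make install" step in this article.
if load_check /usr/local/nagios/perl/lib Nagios::Plugin::Functions; then
    echo "Nagios::Plugin::Functions: found"
else
    echo "Nagios::Plugin::Functions: not found"
fi
```

If this reports "found" while the plugin still fails, the module itself is fine and only the search path used by the plugin needs adjusting.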
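After some digging, it turned out that the module directory is simply not on perl's search path. Adding it with export PERL5LIB=XXXXX should work, but as a quick fix I modified check_vmware_api.pl itself.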
Basic – Perl5lib
http://www.perlmonks.org/?node_id=867860
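For reference, the PERL5LIB route might look like the sketch below. The two directories are the ones that appear in this article; the variable names are only illustrative, and this is not what was actually done here (the edit to the script follows).

```shell
#!/bin/sh
# Directories where "make install" placed the perl modules in this article.
NAGIOS_PERL_LIB=/usr/local/nagios/perl/lib
NAGIOS_PERL_ARCH="$NAGIOS_PERL_LIB/i386-linux-thread-multi"

# Prepend both directories and keep any PERL5LIB value that is already set.
PERL5LIB="$NAGIOS_PERL_LIB:$NAGIOS_PERL_ARCH${PERL5LIB:+:$PERL5LIB}"
export PERL5LIB

echo "PERL5LIB=$PERL5LIB"
```

An exported variable only reaches processes started from that shell, so a Nagios daemon that is already running would not see it. Editing the plugin avoids that issue, at the cost of carrying a local patch.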
[root@host1 nagios]# diff -c ~root/op5plugins/check_vmware_api.pl /usr/local/nagios/libexec/check_vmware_api.pl
*** /root/op5plugins/check_vmware_api.pl 2013-08-18 14:52:22.000000000 +0900
--- /usr/local/nagios/libexec/check_vmware_api.pl 2013-08-20 13:14:32.000000000 +0900
***************
*** 25,30 ****
--- 25,40 ----
  # along with this program. If not, see <http://www.gnu.org/licenses/>.
  #
+ BEGIN {
+ $my_module_dir = '/usr/local/nagios/perl/lib';
+ push(@INC, ('.',"$my_module_dir"));
+ }
+
+ BEGIN {
+ $my_module_dir = '/usr/local/nagios/perl/lib/i386-linux-thread-multi';
+ push(@INC, ('.',"$my_module_dir"));
+ }
+
  use strict;
  use warnings;
  use vars qw($PROGNAME $VERSION $output $values $result $defperfargs);
The error is gone and the usage message is displayed (^_^V
[root@host1 ~]# /usr/local/nagios/libexec/check_vmware_api.pl
Usage: check_vmware_api.pl -D <data_center> | -H <host_name> [ -C <cluster_name> ] [ -N <vm_name> ]
    -u <user> -p <pass> | -f <authfile>
    -l <command> [ -s <subcommand> ] [ -T <timeshift> ] [ -i <interval> ]
    [ -x <black_list> ] [ -o <additional_options> ]
    [ -t <timeout> ] [ -w <warn_range> ] [ -c <crit_range> ]
    [ -V ] [ -h ]
I run check_vmware_api.pl against the VMware ESXi 5.1U1 host in my office.
I was able to get the cpu and memory information, so next time I will actually integrate it into Nagios.
cpu information
[root@host1 ~]# /usr/local/nagios/libexec/check_vmware_api.pl -H 192.168.1.10 -u root -p password -l cpu
CHECK_VMWARE_API.PL OK - cpu usage=798.00 MHz (7.50%) | cpu_usagemhz=798.00Mhz;; cpu_usage=7.50%;;
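The part after the "|" is Nagios performance data, where each item has the form label=value[unit];warn;crit;min;max. As a rough sketch of how those values can be picked out in a script, the snippet below parses the sample line from the run above. perf_value is a hypothetical helper, not part of the plugin.

```shell
#!/bin/sh
# Sample plugin output copied from the cpu run above.
line='CHECK_VMWARE_API.PL OK - cpu usage=798.00 MHz (7.50%) | cpu_usagemhz=798.00Mhz;; cpu_usage=7.50%;;'

# Everything after the first "|" is performance data.
perfdata=${line#*"|"}

# perf_value LABEL: hypothetical helper that prints the value (with unit)
# of the performance data item named LABEL.
perf_value() {
    for item in $perfdata; do
        case "$item" in
            "$1"=*)
                value=${item#*=}        # drop "label="
                echo "${value%%;*}"     # drop the ";warn;crit..." tail
                ;;
        esac
    done
}

perf_value cpu_usage       # prints 7.50%
perf_value cpu_usagemhz    # prints 798.00Mhz
```

The warn and crit fields are empty here because the command was run without -w and -c.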
Memory information
[root@host1 ~]# /usr/local/nagios/libexec/check_vmware_api.pl -H 192.168.1.10 -u root -p password -l mem
CHECK_VMWARE_API.PL OK - mem usage=6352.55 MB (78.79%), overhead=658.84 MB, swapped=0.00 MB, memctl=0.00 MB | mem_usagemb=6352.55MB;; mem_usage=78.79%;; mem_overhead=658.84MB;; mem_swap=0.00MB;; mem_memctl=0.00MB;;
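Both runs pass the password with -p, which leaves it visible in the process list and shell history. The usage message also lists -f <authfile>, and the plugin help describes the file as containing username=<login> and password=<password>. The sketch below assumes those are two separate lines; write_authfile and the file path are hypothetical and only for illustration.

```shell
#!/bin/sh
# write_authfile PATH USER PASS: hypothetical helper that writes the
# username=/password= file described in the plugin help for -f and
# makes it readable by the owner only.
write_authfile() {
    ( umask 077
      printf 'username=%s\npassword=%s\n' "$2" "$3" > "$1" )
    chmod 600 "$1"
}

# Example path; an assumption, choose a location that suits your setup.
AUTHFILE=${TMPDIR:-/tmp}/esxi_auth.example
write_authfile "$AUTHFILE" root password

# With the file in place, the plugin could be called without -u / -p:
# /usr/local/nagios/libexec/check_vmware_api.pl -H 192.168.1.10 -f "$AUTHFILE" -l mem
```

The plugin call is left as a comment because it needs a reachable ESXi host.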
Looking at the help, the plugin is remarkably feature-rich.
[root@host1 ~]# /usr/local/nagios/libexec/check_vmware_api.pl -h check_vmware_api.pl 0.7.0 This nagios plugin is free software, and comes with ABSOLUTELY NO WARRANTY. It may be used, redistributed and/or modified under the terms of the GNU General Public Licence (see http://www.fsf.org/licensing/licenses/gpl.txt). VMWare Infrastructure plugin Usage: check_vmware_api.pl -D <data_center> | -H <host_name> [ -C <cluster_name> ] [ -N <vm_name> ] -u <user> -p <pass> | -f <authfile> -l <command> [ -s <subcommand> ] [ -T <timeshift> ] [ -i <interval> ] [ -x <black_list> ] [ -o <additional_options> ] [ -t <timeout> ] [ -w <warn_range> ] [ -c <crit_range> ] [ -V ] [ -h ] -?, --usage Print usage information -h, --help Print detailed help screen -V, --version Print version information --extra-opts=[section][@file] Read options from an ini file. See http://nagiosplugins.org/extra-opts for usage -H, --host=<hostname> ESX or ESXi hostname. -C, --cluster=<clustername> ESX or ESXi clustername. -D, --datacenter=<DCname> Datacenter hostname. -N, --name=<vmname> Virtual machine name. -u, --username=<username> Username to connect with. -p, --password=<password> Password to use with the username. -f, --authfile=<path> Authentication file with login and password. File syntax : username=<login> password=<password> -w, --warning=THRESHOLD Warning threshold. See http://nagiosplug.sourceforge.net/developer-guidelines.html#THRESHOLDFORMAT for the threshold format. -c, --critical=THRESHOLD Critical threshold. See http://nagiosplug.sourceforge.net/developer-guidelines.html#THRESHOLDFORMAT for the threshold format. -l, --command=COMMAND Specify command type (CPU, MEM, NET, IO, VMFS, RUNTIME, ...) -s, --subcommand=SUBCOMMAND Specify subcommand -S, --sessionfile=SESSIONFILE Specify a filename to store sessions for faster authentication -x, --exclude=<black_list> Specify black list -o, --options=<additional_options> Specify additional command options (quickstats, ...) 
-T, --timestamp=<timeshift> Timeshift in seconds that could fix issues with "Unknown error". Use values like 5, 10, 20, etc -i, --interval=<sampling period> Sampling Period in seconds. Basic historic intervals: 300, 1800, 7200 or 86400. See config for any changes. Supports literval values to autonegotiate interval value: r - realtime interval, h<number> - historical interval specified by position. Default value is 20 (realtime). Since cluster does not have realtime stats interval other than 20(default realtime) is mandatory. -M, --maxsamples=<max sample count> Maximum number of samples to retrieve. Max sample number is ignored for historic intervals. Default value is 1 (latest available sample). --trace=<level> Set verbosity level of vSphere API request/respond trace -t, --timeout=INTEGER Seconds before plugin times out (default: 30) -v, --verbose Show details for command-line debugging (can repeat up to 3 times) Supported commands(^ - blank or not specified parameter, o - options, T - timeshift value, x - blacklist) : VM specific : * cpu - shows cpu info + usage - CPU usage in percentage + usagemhz - CPU usage in MHz + wait - CPU wait time in ms + ready - CPU ready time in ms ^ all cpu info(no thresholds) * mem - shows mem info + usage - mem usage in percentage + usagemb - mem usage in MB + swap - swap mem usage in MB + swapin - swapin mem usage in MB + swapout - swapout mem usage in MB + overhead - additional mem used by VM Server in MB + overall - overall mem used by VM Server in MB + active - active mem usage in MB + memctl - mem used by VM memory control driver(vmmemctl) that controls ballooning ^ all mem info(except overall and no thresholds) * net - shows net info + usage - overall network usage in KBps(Kilobytes per Second) + receive - receive in KBps(Kilobytes per Second) + send - send in KBps(Kilobytes per Second) ^ all net info(except usage and no thresholds) * io - shows disk I/O info + usage - overall disk usage in MB/s + read - read latency in ms 
(totalReadLatency.average) + write - write latency in ms (totalWriteLatency.average) ^ all disk io info(no thresholds) * runtime - shows runtime info + con - connection state + cpu - allocated CPU in MHz + mem - allocated mem in MB + state - virtual machine state (UP, DOWN, SUSPENDED) + status - overall object status (gray/green/red/yellow) + consoleconnections - console connections to VM + guest - guest OS status, needs VMware Tools + tools - VMWare Tools status + issues - all issues for the host ^ all runtime info(except con and no thresholds) Host specific : * cpu - shows cpu info + usage - CPU usage in percentage o quickstats - switch for query either PerfCounter values or Runtime info + usagemhz - CPU usage in MHz o quickstats - switch for query either PerfCounter values or Runtime info ^ all cpu info o quickstats - switch for query either PerfCounter values or Runtime info * mem - shows mem info + usage - mem usage in percentage o quickstats - switch for query either PerfCounter values or Runtime info + usagemb - mem usage in MB o quickstats - switch for query either PerfCounter values or Runtime info + swap - swap mem usage in MB o listvm - turn on/off output list of swapping VM's + overhead - additional mem used by VM Server in MB + overall - overall mem used by VM Server in MB + memctl - mem used by VM memory control driver(vmmemctl) that controls ballooning o listvm - turn on/off output list of ballooning VM's ^ all mem info(except overall and no thresholds) * net - shows net info + usage - overall network usage in KBps(Kilobytes per Second) + receive - receive in KBps(Kilobytes per Second) + send - send in KBps(Kilobytes per Second) + nic - makes sure all active NICs are plugged in ^ all net info(except usage and no thresholds) * io - shows disk io info + aborted - aborted commands count + resets - bus resets count + read - read latency in ms (totalReadLatency.average) + write - write latency in ms (totalWriteLatency.average) + kernel - kernel latency in 
ms + device - device latency in ms + queue - queue latency in ms ^ all disk io info * vmfs - shows Datastore info + (name) - free space info for datastore with name (name) o used - output used space instead of free o breif - list only alerting volumes o regexp - whether to treat name as regexp o blacklistregexp - whether to treat blacklist as regexp x - blacklist VMFS's T (value) - timeshift to detemine if we need to refresh ^ all datastore info o used - output used space instead of free o breif - list only alerting volumes o blacklistregexp - whether to treat blacklist as regexp x - blacklist VMFS's T (value) - timeshift to detemine if we need to refresh * runtime - shows runtime info + con - connection state + health - checks cpu/storage/memory/sensor status o listitems - list all available sensors(use for listing purpose only) o blackregexpflag - whether to treat blacklist as regexp x - blacklist status objects + storagehealth - storage status check o blackregexpflag - whether to treat blacklist as regexp x - blacklist status objects + temperature - temperature sensors o blackregexpflag - whether to treat blacklist as regexp x - blacklist status objects + sensor - threshold specified sensor + maintenance - shows whether host is in maintenance mode + list(vm) - list of VMWare machines and their statuses + status - overall object status (gray/green/red/yellow) + issues - all issues for the host x - blacklist issues ^ all runtime info(health, storagehealth, temperature and sensor are represented as one value and no thresholds) * service - shows Host service info + (names) - check the state of one or several services specified by (names), syntax for (names):<service1>,<service2>,...,<serviceN> ^ show all services * storage - shows Host storage info + adapter - list bus adapters x - blacklist adapters + lun - list SCSI logical units x - blacklist LUN's + path - list logical unit paths x - blacklist paths ^ show all storage info * uptime - shows Host uptime o 
quickstats - switch for query either PerfCounter values or Runtime info * device - shows Host specific device info + cd/dvd - list vm's with attached cd/dvd drives o listall - list all available devices(use for listing purpose only) DC specific : * cpu - shows cpu info + usage - CPU usage in percentage o quickstats - switch for query either PerfCounter values or Runtime info + usagemhz - CPU usage in MHz o quickstats - switch for query either PerfCounter values or Runtime info ^ all cpu info o quickstats - switch for query either PerfCounter values or Runtime info * mem - shows mem info + usage - mem usage in percentage o quickstats - switch for query either PerfCounter values or Runtime info + usagemb - mem usage in MB o quickstats - switch for query either PerfCounter values or Runtime info + swap - swap mem usage in MB + overhead - additional mem used by VM Server in MB + overall - overall mem used by VM Server in MB + memctl - mem used by VM memory control driver(vmmemctl) that controls ballooning ^ all mem info(except overall and no thresholds) * net - shows net info + usage - overall network usage in KBps(Kilobytes per Second) + receive - receive in KBps(Kilobytes per Second) + send - send in KBps(Kilobytes per Second) ^ all net info(except usage and no thresholds) * io - shows disk io info + aborted - aborted commands count + resets - bus resets count + read - read latency in ms (totalReadLatency.average) + write - write latency in ms (totalWriteLatency.average) + kernel - kernel latency in ms + device - device latency in ms + queue - queue latency in ms ^ all disk io info * vmfs - shows Datastore info + (name) - free space info for datastore with name (name) o used - output used space instead of free o breif - list only alerting volumes o regexp - whether to treat name as regexp o blacklistregexp - whether to treat blacklist as regexp x - blacklist VMFS's T (value) - timeshift to detemine if we need to refresh ^ all datastore info o used - output used space 
instead of free o breif - list only alerting volumes o blacklistregexp - whether to treat blacklist as regexp x - blacklist VMFS's T (value) - timeshift to detemine if we need to refresh * runtime - shows runtime info + list(vm) - list of VMWare machines and their statuses + listhost - list of VMWare esx host servers and their statuses + listcluster - list of VMWare clusters and their statuses + tools - VMWare Tools status x - blacklist VM's + status - overall object status (gray/green/red/yellow) + issues - all issues for the host x - blacklist issues ^ all runtime info(except cluster and tools and no thresholds) * recommendations - shows recommendations for cluster + (name) - recommendations for cluster with name (name) ^ all clusters recommendations Cluster specific : * cpu - shows cpu info + usage - CPU usage in percentage + usagemhz - CPU usage in MHz ^ all cpu info * mem - shows mem info + usage - mem usage in percentage + usagemb - mem usage in MB + swap - swap mem usage in MB o listvm - turn on/off output list of swapping VM's + memctl - mem used by VM memory control driver(vmmemctl) that controls ballooning o listvm - turn on/off output list of ballooning VM's ^ all mem info(plus overhead and no thresholds) * cluster - shows cluster services info + effectivecpu - total available cpu resources of all hosts within cluster + effectivemem - total amount of machine memory of all hosts in the cluster + failover - VMWare HA number of failures that can be tolerated + cpufainess - fairness of distributed cpu resource allocation + memfainess - fairness of distributed mem resource allocation ^ only effectivecpu and effectivemem values for cluster services * runtime - shows runtime info + list(vm) - list of VMWare machines in cluster and their statuses + listhost - list of VMWare esx host servers in cluster and their statuses + status - overall cluster status (gray/green/red/yellow) + issues - all issues for the cluster x - blacklist issues ^ all cluster runtime info 
* vmfs - shows Datastore info + (name) - free space info for datastore with name (name) o used - output used space instead of free o breif - list only alerting volumes o regexp - whether to treat name as regexp o blacklistregexp - whether to treat blacklist as regexp x - blacklist VMFS's T (value) - timeshift to detemine if we need to refresh ^ all datastore info o used - output used space instead of free o breif - list only alerting volumes o blacklistregexp - whether to treat blacklist as regexp x - blacklist VMFS's T (value) - timeshift to detemine if we need to refresh Copyright (c) 2008-2013 op5 AB |