Last updated: 2018-11-13
Introduction
Munin consists of two parts:
- Client: munin-node <-- a daemon that provides the data
- Server: munin <-- a cron job that collects the data and builds an HTML report
Install Munin
# Centos 6
# Installing munin with httpd24u present causes a dependency problem
yum install httpd24u
yum install munin
yum deplist munin
...
dependency: munin-web-support
provider: munin-apache.noarch 2.0.40-4.el6
provider: munin-nginx.noarch 2.0.40-4.el6
...
# Skip munin-apache and install munin-nginx instead
yum install -x munin-apache munin
Remark
# munin-nginx does not affect the existing system (httpd24u)
rpm -ql munin-nginx
/var/lib/munin/cgi-tmp
/var/log/munin/munin-cgi-graph.log
/var/log/munin/munin-cgi-html.log
/var/log/munin/munin-graph.log
Client configuration file
Edit /etc/munin/munin-node.conf
host_name myserver
port 4949 <-- default
allow ^127\.0\.0\.1$
Server configuration
# test is the group, fw is the host
[test;fw]
address YOUR_NODE_IP
use_node_name yes
// The hostname used in "list <hostname>" is taken from the node's banner,
// e.g. the "# munin node at myserver" banner seen after telnetting to port 4949.
// With use_node_name no, the host part of the section name is used instead: for [abc;123], 123 is sent as list <hostname>.
// If the hostname is detected incorrectly, set it with host_name <host name>.
// SNMP requires use_node_name no.
port 4949
update yes // data-fetching for this "host"
Example of list:
# munin node at SQL-DELL-00
list SQL-DELL-00
df memory processes network cpu hdd
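The name selection described above can be sketched in a few lines. This is only an illustration of the rule, not Munin's own code; the function names parse_banner and list_hostname are hypothetical.

```python
import re

# Matches the banner a munin-node prints on connect, e.g. "# munin node at myserver".
BANNER_RE = re.compile(r"^# munin node at (\S+)$")


def parse_banner(line):
    """Return the node name announced in the banner, or None if it does not match."""
    match = BANNER_RE.match(line.strip())
    return match.group(1) if match else None


def list_hostname(section, banner, use_node_name):
    """Pick the name that would be sent as "list <hostname>".

    section is the bracketed name from the server config, e.g. "test;fw".
    """
    if use_node_name:
        # use_node_name yes: trust the name from the node's banner.
        return parse_banner(banner)
    # use_node_name no: fall back to the host part of [group;host].
    return section.split(";")[-1]
```

For example, list_hostname("abc;123", "# munin node at myserver", False) picks 123, while passing True picks myserver.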
Create the cron job:
Run:
crontab -u munin -e
# Every 5 minutes the server collects data from the clients
*/5 * * * * munin-cron
# On CentOS 6 there is no need to create the cron job; one is already provided,
because /etc/cron.d/munin contains
*/5 * * * * munin test -x /usr/bin/munin-cron && /usr/bin/munin-cron
munin-cron runs the following:
1. munin-update
responsible for contacting all the agents (munin-nodes) and collecting their data.
2. munin-html
responsible for generating static HTML pages
3. munin-graph
creates graphs from all RRD files in the munin database directory
4. munin-limits
If the limits are breached, for instance, if a value moves from “ok” to “warning”, or from “critical” to “ok”,
it sends an event to any configured contacts.
Apache configuration
# Centos 6
/etc/httpd/conf.d/munin.conf
Alias /munin /var/www/html/munin
<directory /var/www/html/munin>
AuthUserFile /etc/munin/munin-htpasswd
AuthName "Munin"
AuthType Basic
require valid-user
ExpiresActive On
ExpiresDefault M310
</directory>
# ScriptAlias /munin-cgi/munin-cgi-graph /var/www/cgi-bin/munin-cgi-graph
When opening the Munin pages in Firefox,
the following often happens:
Even after clicking "Allow", the warning pops up again once the page reloads.
The cause is this line in the HTML code:
<meta http-equiv="refresh" content="300" />
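After some annoyance, I found that this is a Firefox (3.6) feature and that it can be turned off.
Just uncheck the following option:
Tools --> Options --> Advanced --> "Warn me when websites try to redirect or reload the page"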
munin-node-configure
View and configure /etc/munin/plugins
munin-node-configure --help
munin-node-configure --version
version 1.4.6
- --config <file> Default: /etc/munin/munin-node.conf
- --servicedir <dir> Default: /etc/munin/plugins/
- --libdir <dir> Default: /usr/share/munin/plugins/
- --shell Show shell commands instead of a table
- --suggest Show suggestions instead of status
/usr/share/munin/plugins# ./if_ suggest
Status
munin-node-configure
Plugin | Used | Extra information
------ | ---- | -----------------
acpi | no |
amavis | no |
........
Telnet Testing
Check the node's version
version
munins node on fw version: 1.0.4 (munin-lite)
See which plugins are running
list
cpu if_eth1 if_eth0 if_br_lan if_err_eth1 if_err_eth0 if_err_br_lan load memory processes uptime interrupts irqstats
fetch cpu
user.value 18787
nice.value 0
system.value 8068
idle.value 64711389
iowait.value 0
irq.value 0
softirq.value 100029
.
config cpu
graph_title CPU usage
graph_order system user nice idle iowait irq softirq
graph_args --base 1000 -r --lower-limit 0 --upper-limit 100
graph_vlabel %
graph_scale no
graph_info This graph shows how CPU time is spent.
graph_category system
graph_period second
system.label system
system.draw AREA
system.max 5000
system.min 0
system.type DERIVE
system.warning 30
system.critical 50
system.info CPU time spent by the kernel in system activities
user.label user
user.draw STACK
user.min 0
user.max 5000
user.warning 80
user.type DERIVE
user.info CPU time spent by normal programs and daemons
nice.label nice
nice.draw STACK
nice.min 0
nice.max 5000
nice.type DERIVE
nice.info CPU time spent by nice(1)d programs
idle.label idle
idle.draw STACK
idle.min 0
idle.max 5000
idle.type DERIVE
idle.info Idle CPU time
iowait.label iowait
iowait.draw STACK
iowait.min 0
iowait.max 5000
irq.type DERIVE
irq.info CPU time spent handling interrupts
softirq.label softirq
softirq.draw STACK
softirq.min 0
softirq.max 5000
softirq.type DERIVE
softirq.info CPU time spent handling batched interrupts
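The telnet session above can also be scripted. The following is a minimal sketch, assuming a munin-node listening on 127.0.0.1:4949 that ends multi-line replies (fetch, config) with a lone "." and answers list and version with a single line. The helper names parse_fetch and query_node are hypothetical.

```python
import socket


def parse_fetch(text):
    """Turn "field.value N" lines from a fetch reply into {field: number}."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line == ".":
            continue  # skip blank lines and the "." terminator
        key, _, raw = line.partition(" ")
        if key.endswith(".value"):
            field = key[: -len(".value")]
            # A node may report "U" for an unknown value; keep that as None.
            values[field] = None if raw.strip() == "U" else float(raw)
    return values


def query_node(command, host="127.0.0.1", port=4949, timeout=5.0):
    """Send one command to a munin-node and return the reply as text."""
    single_line = command.split()[0] in ("list", "version")
    with socket.create_connection((host, port), timeout=timeout) as sock:
        reader = sock.makefile("r", encoding="utf-8")
        reader.readline()  # discard the "# munin node at ..." banner
        sock.sendall((command + "\n").encode("utf-8"))
        lines = []
        for line in reader:
            line = line.rstrip("\r\n")
            if line == ".":
                break  # end of a multi-line reply
            lines.append(line)
            if single_line:
                break  # list and version answer with one line
        return "\n".join(lines)
```

With a node running locally, parse_fetch(query_node("fetch cpu")) would return a dictionary keyed by user, nice, system and so on.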
Plugin test
Debugging Munin plugins
On the machine running munin-node, run "munin-run df"
i.e.
munin-run df
# Warning: Root privileges are required to change user/group. The plugin may not behave as expected.
_dev_sda1.value 54.7673487506975
_dev_shm.value 0
munin-run df config
# Warning: Root privileges are required to change user/group. The plugin may not behave as expected.
graph_title Disk usage in percent
graph_args --upper-limit 100 -l 0
graph_vlabel %
graph_scale no
graph_category disk
_dev_sda1.label /
_dev_sda1.warning 92
_dev_sda1.critical 98
_dev_shm.label /dev/shm
_dev_shm.warning 92
_dev_shm.critical 98
Testing the cron job
Run
su - munin --shell=/bin/bash -c /usr/bin/munin-cron
Output
2012/11/07 10:45:38 [ERROR] Hostname 'myserver-vps17' contains illegal characters (http://en.wikipedia.org/wiki/Hostname#Restrictions_on_valid_host_names). Please fix this by replacing illegal characters with '-'. Remember to do it on both in the master configuration and on the munin-node.
2012/11/07 10:45:38 [ERROR] config error under [vps;myserver-vps17] for 'address IP' : [ERROR] Hostname 'myserver-vps17' contains illegal characters (http://en.wikipedia.org/wiki/Hostname#Restrictions_on_valid_host_names). Please fix this by replacing illegal characters with '-'. Remember to do it on both in the master configuration and on the munin-node. at /usr/share/munin/munin-update line 63
Debug
su - munin -c "/usr/lib/munin/munin-update --debug --nofork --stdout --host foo.example.com --service df"
quit
munin-check
fix permissions of Munin directories
-f | --fix-permissions
Example:
munin-check
# /var/www/html/munin/cgi : Wrong owner (root != munin)
# /var/www/html/munin/cgi/munin-cgi-graph : Wrong owner (root != munin)
# /var/www/html/munin/cgi/munin-cgi-html : Wrong owner (root != munin)
check /var/lib/munin/datafile
check /var/lib/munin/datafile.storable
check /var/lib/munin/graphs
check /var/lib/munin/htmlconf.storable
check /var/lib/munin/limits
check /var/lib/munin/localdomain
check /var/lib/munin/munin-graph.stats
check /var/lib/munin/munin-update.stats
......................................................................
Plugin - diskstats
- Disk IOs per device
- Disk latency per device
- Throughput per device
- Utilization per device
This plugin needs data from the following paths:
- /proc/diskstats
- /sys/block/*/stat
Monitor specific devices
/etc/munin/plugin-conf.d/diskstats
[diskstats]
env.include_only sda,sdb,cciss/c0d0
OR
[diskstats]
env.exclude sdc,VGroot/LVswap
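As a rough illustration of what include_only and exclude do, the sketch below reads device names from /proc/diskstats-style text (the name is the third column) and applies a comma-separated filter. It is not the plugin's own code and does not perform the plugin's LVM name translation (e.g. VGroot/LVswap); device_names and select_devices are hypothetical names.

```python
def device_names(diskstats_text):
    """Extract device names (third column) from /proc/diskstats content."""
    names = []
    for line in diskstats_text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            names.append(fields[2])
    return names


def select_devices(names, include_only=None, exclude=None):
    """Apply a comma-separated include_only or exclude list to device names."""
    if include_only:
        wanted = {name.strip() for name in include_only.split(",")}
        return [name for name in names if name in wanted]
    if exclude:
        skipped = {name.strip() for name in exclude.split(",")}
        return [name for name in names if name not in skipped]
    return list(names)
```

On a Linux host, device_names(open("/proc/diskstats").read()) gives the list to filter.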
Test diskstats:
munin-run diskstats
munin-limits
The warning and critical settings are handled by munin-limits.
Change the warning and critical settings as follows:
Format:
[plugin_name].[fieldname].(warning|critical) [value]
value: min:max, min: or :max
fieldnames: as "Internal name" below the graphs
Example:
[centos6;192.168.88.194]
address 192.168.88.194
df._dev_mapper_VolGroup_lv_root.warning 97
df._dev_mapper_VolGroup_lv_root.critical 99
use_node_name yes
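The range format can be made concrete with a small sketch. It assumes that a bare number such as 97 means an upper bound only (the same as :97), which is how the df example reads; this is an illustration, not the logic of munin-limits itself, and the function names are hypothetical.

```python
def parse_range(spec):
    """Parse "min:max", "min:" or ":max" into a (low, high) tuple.

    Assumption: a bare number such as "97" is read as ":97".
    """
    if ":" not in spec:
        return (None, float(spec))
    low, _, high = spec.partition(":")
    return (float(low) if low else None, float(high) if high else None)


def breaches(value, spec):
    """Return True when value falls outside the range described by spec."""
    low, high = parse_range(spec)
    if low is not None and value < low:
        return True
    if high is not None and value > high:
        return True
    return False


def state(value, warning=None, critical=None):
    """Classify a value as "ok", "warning" or "critical"."""
    if critical and breaches(value, critical):
        return "critical"
    if warning and breaches(value, warning):
        return "warning"
    return "ok"
```

With the settings above (warning 97, critical 99), a root filesystem at 98 percent would be classified as warning.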
File
-rw-r--r-- 1 munin munin 50608 Sep 5 10:55 mydomain-df-_dev_sda1-g.rrd
-rw-r--r-- 1 munin munin 50608 Sep 5 10:55 mydomain-df-_dev_sda2-g.rrd
-rw-r--r-- 1 munin munin 50608 Sep 5 10:55 mydomain-df-tmpfs_dev_shm-g.rrd
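The RRD file names above appear to follow a host-plugin-field-type pattern. The sketch below splits a name on that assumption. Both the pattern and the meaning of the trailing letter (g taken to be GAUGE, and so on) are assumptions here, and the split only works when the plugin and field names contain no hyphen. split_rrd_name is a hypothetical helper.

```python
# Assumed mapping from the trailing letter to the data source type.
TYPE_CODES = {"g": "GAUGE", "d": "DERIVE", "c": "COUNTER", "a": "ABSOLUTE"}


def split_rrd_name(filename):
    """Split "host-plugin-field-t.rrd" into its parts, reading from the right."""
    stem = filename[: -len(".rrd")] if filename.endswith(".rrd") else filename
    # Split from the right so that hyphens inside the host name are preserved.
    host, plugin, field, code = stem.rsplit("-", 3)
    return {
        "host": host,
        "plugin": plugin,
        "field": field,
        "type": TYPE_CODES.get(code, code),
    }
```

For mydomain-df-tmpfs_dev_shm-g.rrd this yields the host mydomain, the plugin df and the field tmpfs_dev_shm.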
Send E-Mail Alert
contact.email.command mail -s "Munin-notification for ${var:group} :: ${var:host}" [email protected]
Munin client on Windows
DL: https://github.com/munin-monitoring/munin-node-win32/tree/1.6.0.0
munin-node.ini
[Plugins]
; Plugin Section, 1 enables plugin, 0 disables
Disk=1
Memory=1
Processes=1
Network=1
MbmTemp=1
MbmVoltage=1
MbmFan=1
MbmMhz=1
SMART=0
HD=1
Cpu=1
SpeedFan=1
External=1
ExternalTimeout=5

[DiskPlugin]
; Default Warning and Critical values for % space used
Warning=92
Critical=98

[ExternalPlugin]
; For External Plugins just add an entry with the path to the program to run
; It doesn't matter what the name of the name=value pair is
Plugin01=C:\Users\Jory\Documents\Visual Studio Projects\munin-node\src\plugins\python\disk_free.py
Plugin02=c:\1\runaway.py

[PerfCounterPlugin_disktime]
DropTotal=1
Object=LogicalDisk
Counter=% Disk Time
CounterFormat=double
CounterMultiply=1.000000
GraphTitle=Disk Time
GraphCategory=system
GraphArgs=--base 1000 -l 0
GraphDraw=LINE

[PerfCounterPlugin_processor]
DropTotal=1
Object=Processor
Counter=% Processor Time
CounterFormat=double
CounterMultiply=1.000000
GraphTitle=Processor Time
GraphCategory=system
GraphArgs=--base 1000 -l 0
GraphDraw=LINE

[PerfCounterPlugin_uptime]
; This is a section for the Performance Counter plugin
; The Object and Counter settings are used to access the Performance Counter
; For uptime this would result in \System\System Up Time
; The Graph settings are reported to munin
; The DropTotal setting will drop the last instance from the list, which is often _Total
; Has no effect on single instance counters (Uptime)
; The CounterFormat setting controls what format the counter value is read in as a double, int, or large (int64).
; The plugin always outputs doubles, so this shouldn't have that much effect
; The CounterMultiply setting sets a value the counter value is multiplied by, use it to adjust the scale
; 1.1574074074074073e-005 is the result of(1 / 86400.0), the uptime counter reports seconds and we want to report days.
; So we want to divide the counter value by the number of seconds in a day, 86400.
Object=System
Counter=System Up Time
GraphTitle=Uptime
GraphCategory=system
GraphDraw=AREA
GraphArgs=--base 1000 -l 0
DropTotal=0
CounterFormat=large
CounterMultiply=1.1574074074074073e-005

[SpeedFanPlugin]
BroadcastIP=192.168.0.255
UID=FF671100
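The CounterMultiply value for uptime is just the seconds-to-days conversion described in the comments. A short worked check, with hypothetical names:

```python
SECONDS_PER_DAY = 86400.0

# The multiplier from the uptime section: 1 / 86400.0
COUNTER_MULTIPLY = 1 / SECONDS_PER_DAY


def uptime_days(uptime_seconds):
    """Scale an uptime counter in seconds to days, as CounterMultiply does."""
    return uptime_seconds * COUNTER_MULTIPLY
```

An uptime counter of 172800 seconds is therefore reported as roughly 2 days.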
useful options:
- -install Install as a system service.
- -uninstall Removes the installed service.
- -quiet Close the console window, running in the background.
- -run Run as a normal program, rather than a service.
Plugin - Apache
- apache_accesses - Plugin to monitor the number of accesses to Apache servers
- apache_processes - Munin plugin to monitor the number of apache-processes running on the machine. It counts "busy" and "idle" servers separately.
- apache_volume - Munin plugin to monitor the volume of data sent from Apache servers.
The plugin needs access to http://localhost/server-status?auto (or modify the URL for another host).
Apache needs ExtendedStatus enabled for this plugin to work.
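To see what the plugins work from, the machine-readable status page can be parsed by hand. The sketch below assumes the usual "Key: value" layout of server-status?auto, with BusyWorkers, IdleWorkers and a Scoreboard line in which "." marks an open slot. It is not the plugins' own code, and the function names are hypothetical.

```python
def parse_server_status(text):
    """Parse "Key: value" lines from server-status?auto into a dict."""
    status = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if sep:
            status[key.strip()] = value.strip()
    return status


def worker_counts(text):
    """Return busy, idle and free slot counts from the status text."""
    status = parse_server_status(text)
    return {
        "busy": int(status["BusyWorkers"]),
        "idle": int(status["IdleWorkers"]),
        # Assumption: "." in the scoreboard is an open slot with no process.
        "free": status.get("Scoreboard", "").count("."),
    }
```

The text itself can be fetched with urllib.request.urlopen("http://localhost/server-status?auto").read().decode().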
Unit
Milli (symbol m) is a prefix in the metric system denoting a factor of one thousandth
Munin (by default) reads a counter every 5 minutes, and calculates the average between the two last reads.
66m means 0.066 accesses per second, i.e. 0.066 * 300 = 19.8 accesses within a five-minute period
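The conversion can be checked with a few lines of code. This is a minimal sketch: only a handful of prefixes are handled, and si_to_float and per_interval are hypothetical names.

```python
# A few metric prefixes as they appear on graph labels.
SI_PREFIXES = {"m": 1e-3, "k": 1e3, "M": 1e6, "G": 1e9}


def si_to_float(text):
    """Convert a label such as "66m" to a plain per-second number."""
    text = text.strip()
    if text and text[-1] in SI_PREFIXES:
        return float(text[:-1]) * SI_PREFIXES[text[-1]]
    return float(text)


def per_interval(rate_per_second, interval_seconds=300):
    """Scale a per-second rate to one polling interval (default 5 minutes)."""
    return rate_per_second * interval_seconds
```

per_interval(si_to_float("66m")) gives about 19.8, matching the figure above.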
Enable plugin
ln -s /usr/share/munin/plugins/apache_processes /etc/munin/plugins/apache_processes
ln -s /usr/share/munin/plugins/apache_accesses /etc/munin/plugins/apache_accesses
ln -s /usr/share/munin/plugins/apache_volume /etc/munin/plugins/apache_volume
Configuration
/etc/munin/plugin-conf.d/apache
# The %d in the url will be replaced with the port
# 'showfree' enables the display of the "free slots" graph
[apache_*]
env.url http://127.0.0.1:%d/server-status?auto
env.ports 8089
env.showfree 1
Remark
If server-status requires authenticated access:
[apache_volume]
env.url http://user:pass@localhost/server-status?auto
Enable Service
service munin-node restart
Test
telnet
telnet localhost 4949
list
apache_accesses apache_processes apache_volume ...
munin-run
munin-run apache_processes
busy8089.value 1
idle8089.value 16
free8089.value 783
Apache Configure
https://datahunter.org/apache_server-info
Plugin - meminfo
Once enabled, a memory tab appears, containing the following graphs:
- Application memory usage
- External fragmentation: Buddyinfo
- External fragmentation: Page type info
- Physical memory usage
- Slab objects size
- Swap usage
- Virtual memory usage
Enable plugin
ln -s /usr/share/munin/plugins/meminfo /etc/munin/plugins/meminfo
/etc/munin/plugin-conf.d/meminfo
[meminfo]
user root
group root
env.applications httpd
env.application_wait 86400
# Selects which graphs are drawn
# enabled_graphs is a regexp
# Draw only appinfo & swapinfo graphs
env.enabled_graphs (appinfo|swapinfo)
Checking
munin-run meminfo
multigraph appinfo
summ_httpd.value 1478574080
multigraph appinfo.app_httpd
app_httpd_VmData.value 1478574080
app_httpd_VmExe.value 11440128
app_httpd_VmHWM.value 560578560
app_httpd_VmLck.value 0
app_httpd_VmLib.value 793325568
app_httpd_VmPTE.value 16019456
app_httpd_VmPeak.value 20646313984
app_httpd_VmRSS.value 474218496
app_httpd_VmSize.value 20533698560
app_httpd_VmStk.value 1892352
app_httpd_VmSwap.value 0
multigraph appinfo.processes
pr_httpd.value 21
....
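Multigraph output groups values under each "multigraph <name>" line. A minimal sketch of splitting such output by graph follows; values are kept as strings, and parse_multigraph is a hypothetical helper, not part of Munin.

```python
def parse_multigraph(text):
    """Group "field.value N" lines under the preceding "multigraph <name>" line."""
    graphs = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line or line == ".":
            continue
        if line.startswith("multigraph "):
            current = line.split(None, 1)[1]
            graphs.setdefault(current, {})
            continue
        key, _, raw = line.partition(" ")
        if current is not None and key.endswith(".value"):
            graphs[current][key[: -len(".value")]] = raw
    return graphs
```

Applied to the output above, the appinfo.processes graph would hold pr_httpd with the value 21.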
Plugin - Mysql
# By default the MySQL monitoring plugins are not enabled
munin-node-configure | grep mysql
mysql_ | no |
mysql_bytes | no |
mysql_innodb | no |
mysql_isam_space_ | no |
mysql_queries | no |
mysql_slowqueries | no |
mysql_threads | no |
Install
# Install the required package
yum install -y perl-Cache-Cache
# Create the MySQL user
mysql> CREATE USER [email protected] IDENTIFIED BY '????????';
mysql> GRANT PROCESS ON *.* TO [email protected];
mysql> GRANT SELECT ON mysql.* TO [email protected];
mysql> FLUSH PRIVILEGES;
checking:
/usr/bin/mysqladmin -umunin -p???? ping
These permissions are used by the plugins: mysql_bytes, mysql_queries, mysql_slowqueries, mysql_threads
P.S.
Mysql "PROCESS" permission
The PROCESS privilege pertains to display of information about the threads executing within the server (that is, information about the statements being executed by sessions). The privilege enables use of SHOW PROCESSLIST or mysqladmin processlist to see threads belonging to other accounts; you can always see your own threads. The PROCESS privilege also enables use of SHOW ENGINE.
* information_schema
# SHOW PROCESSLIST
# ID User Host Database Command Time Status SQL query
# Munin configuration
# Save this to /etc/munin/plugin-conf.d/mysql_
[mysql_*]
# mysql_
env.mysqlconnection DBI:mysql:mysql;host=127.0.0.1;port=3306
env.mysqluser munin
env.cachenamespace munin_mysql_pri
env.mysqlpassword ????
# mysql_bytes
env.mysqlopts -umunin -p????
env.mysqladmin /usr/bin/mysqladmin
# chmod 640 munin-node.conf
# chgrp munin munin-node.conf
# Enable Plugin
# To get a list of symlinks that can be created
/usr/share/munin/plugins/mysql_ suggest
# for user 'munin'@'localhost' to database 'mysql'
ln -s /usr/share/munin/plugins/mysql_slowqueries /etc/munin/plugins/mysql_slowqueries
ln -s /usr/share/munin/plugins/mysql_threads /etc/munin/plugins/mysql_threads
ln -s /usr/share/munin/plugins/mysql_queries /etc/munin/plugins/mysql_queries
ln -s /usr/share/munin/plugins/mysql_bytes /etc/munin/plugins/mysql_bytes
ln -s /usr/share/munin/plugins/mysql_innodb /etc/munin/plugins/mysql_innodb
service munin-node restart
Other plugin:
ln -s /usr/share/munin/plugins/mysql_bytes mysql_bytes
ln -s /usr/share/munin/plugins/mysql_queries mysql_queries
ln -s /usr/share/munin/plugins/mysql_slowqueries mysql_slowqueries
ln -s /usr/share/munin/plugins/mysql_threads mysql_threads
The other two:
- mysql_innodb: monitors free space in a pre-allocated InnoDB tablespace
- mysql_isam_space_: monitors the percentage of table space used on ISAM and MyISAM tables
Check again:
munin-node-configure | grep mysql
mysql_ | no | network_slow network_traffic commands connections
mysql_bytes | yes |
mysql_innodb | no |
mysql_isam_space_ | no |
mysql_queries | yes |
mysql_slowqueries | yes |
mysql_threads | yes |
# Testing:
<1>
munin-run mysql_connections
munin-run mysql_bytes
<2>
telnet localhost 4949
fetch mysql_connections
Troubleshoot
Centos 6
A) Munin Dynamically zoomable graph not working
Dynamically zoomable graph:
Link: http://?/munin/static/dynazoom.html
Cause 1: Apache's Perl module (mod_perl) is not installed
yum install mod_perl
Cause 2: the permissions on /var/log/munin/munin-cgi-graph.log are wrong
touch /var/log/munin/munin-cgi-graph.log
chown apache. /var/log/munin/munin-cgi-graph.log
Other Plugins
- df_abs # Plugin to monitor absolute disk usage
- ip_ # Wildcard-plugin to monitor IP addresses (IPv4 or IPv6) through iptables
- mbmon_ # Fans RPM, Temperatures, Voltages