I’ve been using Cacti, RRDTool, SNMP and custom scripts running on Linux for several years to collect and display historical data regarding Network, System, and FlexLM license resource usage. I recently began tracking the Host resources used by ESX. I was interested in tracking the following:
- CPU utilization
- Memory Free
- Total Memory
- IO activity
- Network traffic
I found that the IO activity and Network traffic could be obtained via standard SNMP. The network traffic can be obtained by summing the results from an snmpwalk of VMWARE-RESOURCES-MIB::netHCKbRx and VMWARE-RESOURCES-MIB::netHCKbTx. Similarly, the IO can be obtained by summing the results from an snmpwalk of VMWARE-RESOURCES-MIB::kbRead and VMWARE-RESOURCES-MIB::kbWritten.
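The summing step can be sketched as a small shell function. This is only a minimal sketch, not the script used later in this post: `sum_snmp_values` is a hypothetical helper, the sample values are made up, and it assumes snmpwalk prints one `OID = Counter64: value` line per instance, which may differ with your net-snmp output options.

```shell
#!/bin/bash
# Hypothetical helper: sum the last whitespace-separated field of each
# snmpwalk output line read from stdin. Lines whose last field is not a
# plain non-negative integer are ignored.
sum_snmp_values() {
    awk '$NF ~ /^[0-9]+$/ { total += $NF } END { printf "%d\n", total }'
}

# Assumed sample output (made-up values), standing in for:
#   snmpwalk -m ALL -c "$COMMUNITY" -v 2c "$MHOST" VMWARE-RESOURCES-MIB::netHCKbRx
sample_rx='VMWARE-RESOURCES-MIB::netHCKbRx.0 = Counter64: 1200
VMWARE-RESOURCES-MIB::netHCKbRx.1 = Counter64: 345'

printf '%s\n' "$sample_rx" | sum_snmp_values   # prints 1545
```

The same function could be fed the netHCKbTx, kbRead and kbWritten walks, and the per-direction sums then added together.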
I did not find items in the VMware ESX MIB that expose CPU utilization or free memory via SNMP. These values are, however, available by running esxtop or resxtop in batch mode. Running esxtop in batch mode on the ESX service console outputs comma-separated values (CSV). For example, running the command
- esxtop -b -n 2 -d 10 > esxtop-output.csv
will run esxtop in batch mode, take 2 samples 10 seconds apart, and write a 3-row CSV file named “esxtop-output.csv”. The first line of the CSV file contains the column headers. The ESX host CPU utilization will be in a column named:
“\\[your-esx-hostname]\Physical Cpu(_Total)\% Processor Time”
The ESX Host free memory will be in a column named:
“\\[your-esx-hostname]\Memory\Free MBytes”.
The ESX Host total memory will be in a column named:
“\\[your-esx-hostname]\Memory\Machine MBytes”.
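Pulling one of these columns out of the CSV file can be sketched with a single awk call. This is a minimal sketch under stated assumptions, not the helper scripts shown later: `esxtop_last_value` is a hypothetical name, the host name `esx01` and all values are made up, and it assumes no CSV field contains an embedded comma.

```shell
#!/bin/bash
# Hypothetical helper: print the value from the last row of a CSV file for
# the first column whose header contains the given text.
# Assumes no field contains an embedded comma.
esxtop_last_value() {
    awk -F',' -v name="$1" '
        NR == 1 {
            # Locate the first header field containing the search text.
            for (i = 1; i <= NF; i++) {
                if (index($i, name) > 0) { col = i; break }
            }
            next
        }
        col > 0 && NF >= col { value = $col }   # remember the latest sample
        END {
            gsub(/"/, "", value)                 # strip surrounding quotes
            print value
        }
    ' "$2"
}

# Made-up sample in the shape of esxtop batch output: header plus 2 samples.
csv=$(mktemp)
cat > "$csv" <<'EOF'
"(PDH-CSV 4.0) (UTC)(0)","\\esx01\Memory\Machine MBytes","\\esx01\Memory\Free MBytes"
"12/14/2008 10:00:00","32767","20480"
"12/14/2008 10:00:10","32767","20470"
EOF

esxtop_last_value "Free MBytes" "$csv"      # prints 20470
esxtop_last_value "Machine MBytes" "$csv"   # prints 32767
rm -f "$csv"
```

Matching on a substring of the header avoids having to hard-code the host name that esxtop embeds in every column name.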
Today, I run 2 cron jobs: one on my ESX service console to run esxtop in batch mode and save the data to a .csv file, and a second on my Cacti monitoring server. On the Cacti monitoring server, I run a “driver” script to collect data items into an RRDTool database every 5 minutes. My Cacti monitoring server uses identity-based ssh/scp to get the .csv file that is produced by the cron job on the ESX service console. The RRD database is mapped as a data source and graphed via Cacti. I’m currently looking at using the VMware remote CLI version of esxtop, resxtop. Using resxtop would eliminate the need for a cron job on the ESX service console and would allow my utilities to run against both ESX and ESXi.
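Because one cron job writes the .csv file while another copies it, the copy could in principle catch a half-written file. One way to guard against that, sketched here as an assumption rather than something my scripts do, is to write to a temporary file and rename it into place. `write_atomically`, the schedule and the script path are all hypothetical.

```shell
#!/bin/bash
# Hypothetical wrapper: run a command, capture its stdout in a temporary
# file next to the target, then rename it into place. A rename within one
# filesystem is atomic, so a reader never sees a half-written file.
write_atomically() {
    local target="$1"; shift
    local tmp="${target}.tmp.$$"
    if "$@" > "$tmp"; then
        mv -f "$tmp" "$target"
    else
        rm -f "$tmp"      # keep the previous good file on failure
        return 1
    fi
}

# Intended use from cron on the service console (assumed schedule and path):
#   */5 * * * * /path/to/this-script
# with the script ending in:
#   write_atomically "$HOME/mrtg-esxtop.csv" esxtop -b -n 2 -d 10
```

If the wrapped command fails, the previous .csv file is left untouched, so the monitoring server keeps reading the last good sample.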
Scripts
ESX RRDTool data update script. Call it with the arguments DBName, Hostname and community. This driver script calls the data capture script to obtain the individual items.
#!/bin/bash
. $HOME/setrrdtool_vars

PERF_BASE=$HOME/perfmon-esx
PATH=${PERF_BASE}/bin:${PATH}

#
# arg 1 - DBNAME
# arg 2 - monitor host
# arg 3 - community
#
if [ "${1}" = "" ]; then
    echo "must supply rrd dbname"
    call_error="yes"
fi
if [ "${2}" = "" ]; then
    echo "must supply host"
    call_error="yes"
fi
if [ "${3}" = "" ]; then
    echo "must supply community"
    call_error="yes"
fi

if [ "${call_error}" = "yes" ]; then
    exit 1
fi

DBNAME=${1}
MHOST=${2}
COMMUNITY=${3}

RRD_DB=${PERF_BASE}/db/${DBNAME}
RRD_LOG=${PERF_BASE}/log/${DBNAME}.log

TOD=`date`

# ${T} is expected to hold the name:value pairs printed by the data capture script
for tline in ${T}; do
    itemname=`echo ${tline}|awk -F":" '{print $1}'`
    itemval=`echo ${tline}|awk -F":" '{print $2}'`

    case "${itemname}" in
        "total_cpu")
            CPU=${itemval}
            ;;
        "available_memory")
            MEMAVAIL=${itemval}
            ;;
        "total_memory")
            TOTMEM=${itemval}
            ;;
        "iokb_total")
            IOKB=${itemval}
            ;;
        "netkb_total")
            NETKB=${itemval}
            ;;
        *)
            ;;
    esac
done

rrdtool update ${RRD_DB} --template \
    CPU_utilization:Memory_available:IO_KBytes:Net_KBytes \
    N:${CPU}:${MEMAVAIL}:${IOKB}:${NETKB}

exit
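The update script expects an RRD database whose data source names match its template list. A minimal sketch of creating one follows. The function prints the command instead of running it, so it works without rrdtool installed. Everything beyond the four data source names is an assumption: the 300-second step (to match the 5-minute collection), the 600-second heartbeat, GAUGE versus DERIVE, the RRA layout, and the placeholder database name `esx01.rrd`.

```shell
#!/bin/bash
# Print (rather than run) an rrdtool create command whose data source names
# match the --template list used by the update script.
# Assumptions, not taken from the post: 300 s step, 600 s heartbeat, GAUGE
# for CPU and memory, DERIVE for the cumulative IO and network KB counters,
# and the two AVERAGE archives.
print_rrd_create() {
    local db="$1"
    printf '%s \\\n' "rrdtool create ${db} --step 300"
    printf '    %s \\\n' \
        "DS:CPU_utilization:GAUGE:600:0:U" \
        "DS:Memory_available:GAUGE:600:0:U" \
        "DS:IO_KBytes:DERIVE:600:0:U" \
        "DS:Net_KBytes:DERIVE:600:0:U" \
        "RRA:AVERAGE:0.5:1:600"
    printf '    %s\n' "RRA:AVERAGE:0.5:12:744"
}

print_rrd_create "esx01.rrd"   # esx01.rrd is a placeholder database name
```

Whether the IO and network sources should be DERIVE, COUNTER or GAUGE depends on how the values are collected, so treat the types above as a starting point to check against your own data.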
ESX Data capture Script. Call with Arguments Hostname and community. This uses SSH/SCP, SNMP and some additional helper scripts to parse the csv data returned by esxtop
#!/bin/bash
. $HOME/setrrdtool_vars
export MIBS=ALL

# arg 1 - monitor host
# arg 2 - community

if [ "${1}" = "" ]; then
    echo "must supply host to monitor"
    call_error="yes"
else
    MHOST=${1}
fi
if [ "${2}" = "" ]; then
    echo "must supply community"
    call_error="yes"
else
    COMMUNITY=${2}
fi

if [ "${call_error}" = "yes" ]; then
    exit 1
fi

PERFDIR=$HOME/perfmon-esx
PATH=${PATH}:${PERFDIR}/bin:${PERFDIR}/bin/esxstats
WORKDIR=${PERFDIR}/tmp
ESXTOPCSV=mrtg-esxtop.csv
CSVfile=${WORKDIR}/${MHOST}-${ESXTOPCSV}
export PATH

if [ ! -d ${WORKDIR} ]; then
    mkdir ${WORKDIR}
fi

# use identity based ssh to copy latest csv file from ${MHOST}
# could replace with resxtop and write its output to ${CSVfile}
scp -Bq ${MHOST}:${ESXTOPCSV} ${CSVfile}

# Begin CPU
get_cpustats ${CSVfile} ${PERFDIR}/bin/esxstats >${WORKDIR}/${MHOST}-cpustat.tmp
# drop the header row, which contains the host name
CPUTOT=`cat ${WORKDIR}/${MHOST}-cpustat.tmp | grep -v ${MHOST}`

linenum=1
for line in ${CPUTOT}; do
    # skip the first sample and any empty values
    if [ "${linenum}" = "1" -o "${line}" = "" -o "${line}" = " " ]; then
        skipping=1
        let "linenum = linenum + 1"
    else
        # strip the surrounding double quotes from the CSV value
        cpupct_total=`echo ${line} | awk -F'"' '{print $2}'`
        let "linenum = linenum + 1"
    fi
done
#End CPU

# Host Machine (Total) Memory
get_machine_mem ${CSVfile} ${PERFDIR}/bin/esxstats >${WORKDIR}/${MHOST}-machinemem.tmp
MACHINEMEM=`cat ${WORKDIR}/${MHOST}-machinemem.tmp | grep -v ${MHOST}`

linenum=1
for line in ${MACHINEMEM}; do
    if [ "${linenum}" = "1" -o "${line}" = "" -o "${line}" = " " ]; then
        skipping=1
        let "linenum = linenum + 1"
    else
        tMachineMem=`echo ${line} | awk -F'"' '{print $2}'`
        # esxtop reports MBytes; multiply by 1024 to get KBytes
        let "MachineMem = tMachineMem * 1024"
        let "linenum = linenum + 1"
    fi
done
let "TotalMem = MachineMem"

# Begin Used Memory

# Kernel Used Memory
linecnt=0
get_kern_used_mem ${CSVfile} ${PERFDIR}/bin/esxstats >${WORKDIR}/${MHOST}-kernused.tmp
KERNMEM=`cat ${WORKDIR}/${MHOST}-kernused.tmp | grep -v ${MHOST}`

linenum=1
for line in ${KERNMEM}; do
    if [ "${linenum}" = "1" -o "${line}" = "" -o "${line}" = " " ]; then
        skipping=1
        let "linenum = linenum + 1"
    else
        tKernMem=`echo ${line} | awk -F'"' '{print $2}'`
        # esxtop reports MBytes; multiply by 1024 to get KBytes
        let "KernMem = tKernMem * 1024"
        let "linenum = linenum + 1"
    fi
done

# Non-Kernel Used Memory
get_nonkern_used_mem ${CSVfile} ${PERFDIR}/bin/esxstats >${WORKDIR}/${MHOST}-nonkernused.tmp
NONKERNMEM=`cat ${WORKDIR}/${MHOST}-nonkernused.tmp | grep -v ${MHOST}`

linenum=1
for line in ${NONKERNMEM}; do
    if [ "${linenum}" = "1" -o "${line}" = "" -o "${line}" = " " ]; then
        skipping=1
        let "linenum = linenum + 1"
    else
        tNonKernMem=`echo ${line} | awk -F'"' '{print $2}'`
        # esxtop reports MBytes; multiply by 1024 to get KBytes
        let "NonKernMem = tNonKernMem * 1024"
        let "linenum = linenum + 1"
    fi
done

# End Used Memory

# Begin Free Memory
get_free_mem ${CSVfile} ${PERFDIR}/bin/esxstats >${WORKDIR}/${MHOST}-freemem.tmp
FREET=`cat ${WORKDIR}/${MHOST}-freemem.tmp | grep -v ${MHOST}`

linenum=1
for line in ${FREET}; do
    if [ "${linenum}" = "1" -o "${line}" = "" -o "${line}" = " " ]; then
        skipping=1
        let "linenum = linenum + 1"
    else
        tAvailMem=`echo ${line} | awk -F'"' '{print $2}'`
        # esxtop reports MBytes; multiply by 1024 to get KBytes
        let "AvailMem = tAvailMem * 1024"
        let "linenum = linenum + 1"
    fi
done
# End Free Memory

#IO

# Get IO read and write KB for each VM
ALLREAD=`snmpwalk -m ALL -c ${COMMUNITY} -v 2c ${MHOST} VMWARE-RESOURCES-MIB::kbRead |awk -F":" '{print $4}'|awk -F" " '{print $1}'`
ALLWRITE=`snmpwalk -m ALL -c ${COMMUNITY} -v 2c ${MHOST} VMWARE-RESOURCES-MIB::kbWritten|awk -F":" '{print $4}'|awk -F" " '{print $1}'`
#
IORead=0
for iobr in ${ALLREAD}; do
    if [ "${iobr}" = "" -o "${iobr}" = " " ]; then
        skipping=1
    else
        let "IORead = IORead + iobr"
    fi
done
IOWritten=0
for iobw in ${ALLWRITE}; do
    if [ "${iobw}" = "" -o "${iobw}" = " " ]; then
        skipping=1
    else
        let "IOWritten = IOWritten + iobw"
    fi
done

let "IOTotal=IORead + IOWritten"

# Get the KB received and transmitted on each interface and calculate the total
ALLNET_RX=`snmpwalk -m ALL -c ${COMMUNITY} -v 2c ${MHOST} VMWARE-RESOURCES-MIB::netHCKbRx |awk -F":" '{print $4}'|awk -F" " '{print $1}'`
ALLNET_TX=`snmpwalk -m ALL -c ${COMMUNITY} -v 2c ${MHOST} VMWARE-RESOURCES-MIB::netHCKbTx |awk -F":" '{print $4}'|awk -F" " '{print $1}'`
#
NetRX=0
for netr in ${ALLNET_RX}; do
    if [ "${netr}" = "" -o "${netr}" = " " ]; then
        skipping=1
    else
        let "NetRX = NetRX + netr"
    fi
done
#
NetTX=0
for nett in ${ALLNET_TX}; do
    if [ "${nett}" = "" -o "${nett}" = " " ]; then
        skipping=1
    else
        let "NetTX = NetTX + nett"
    fi
done
let "NetTotal=NetRX + NetTX"

echo total_cpu:${cpupct_total} total_memory:${TotalMem} available_memory:${AvailMem} iokb_total:${IOTotal} netkb_total:${NetTotal}

exit
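The capture script prints its results as space-separated name:value pairs on one line. A consumer can pick a single value out of that line with shell parameter expansion instead of awk. This is a minimal sketch: `pair_value` is a hypothetical helper, and the sample numbers are made up.

```shell
#!/bin/bash
# Hypothetical helper: look up one value in a line of space-separated
# name:value pairs, using parameter expansion instead of awk.
pair_value() {
    local want="$1" pairs="$2" pair
    for pair in $pairs; do            # unquoted on purpose: split on spaces
        if [ "${pair%%:*}" = "$want" ]; then
            printf '%s\n' "${pair#*:}"
            return 0
        fi
    done
    return 1                          # name not present
}

# Made-up sample in the format printed by the capture script.
stats='total_cpu:12.50 total_memory:33553408 available_memory:20961280 iokb_total:4096 netkb_total:1545'

pair_value total_cpu "$stats"     # prints 12.50
pair_value netkb_total "$stats"   # prints 1545
```

The non-zero return code for a missing name lets a caller skip the RRD update rather than store an empty value.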
ESX data parse script(s). Call each one with the esxtop CSV file and the directory that contains the awk script get_index_awk. Each script is a separate file.
# get_cpustats
CSVfile=$1
NDX=`cat ${CSVfile} | awk -v str="Physical Cpu(_Total)" -f ${2}/get_index_awk`
cat ${CSVfile} | awk -v i=$NDX -F"," '{print $i}'

# get_free_mem
CSVfile=$1
NDX=`cat ${CSVfile} | awk -v str="Free MBytes" -f ${2}/get_index_awk`
cat ${CSVfile} | awk -v i=$NDX -F"," '{print $i}'

# get_machine_mem
CSVfile=$1
NDX=`cat ${CSVfile} | awk -v str="Machine MBytes" -f ${2}/get_index_awk`
cat ${CSVfile} | awk -v i=$NDX -F"," '{print $i}'

# get_kern_used_mem
CSVfile=$1
NDX=`cat ${CSVfile} | awk -v str="Kernel MBytes" -f ${2}/get_index_awk`
cat ${CSVfile} | awk -v i=$NDX -F"," '{print $i}'

# get_managed_mem
CSVfile=$1
NDX=`cat ${CSVfile} | awk -v str="Kernel Managed MBytes" -f ${2}/get_index_awk`
cat ${CSVfile} | awk -v i=$NDX -F"," '{print $i}'

# get_nonkern_used_mem
CSVfile=$1
NDX=`cat ${CSVfile} | awk -v str="NonKernel MBytes" -f ${2}/get_index_awk`
cat ${CSVfile} | awk -v i=$NDX -F"," '{print $i}'
awk script used by get_cpustats, etc
BEGIN {FS=","}
{
    colndx=0;
    for (i=1; i<=NF; i++) {
        if (index($i,str) > 0) {
            colndx=i;
            print colndx;
            i=NF;
            break;
        }
    }
}
Hello where can I download your scripts?
Like the prior comment, it would be great if you made your script available for download. I have a 3 host cluster I am trying to monitor.
Mike and Pen, I’ve updated the post to include my scripts with minimal documentation.
David
http://www.unnoc.org/ utilizes the VI Perl Toolkit (viperformance.pl) to graph performance stats for ESX(i) hosts.
You don’t need snmp
It would be really neat with a direct implementation to Cacti though.