I think vCenter Operations Manager (vCops) is a great tool for operations management. vCops learns the normal behavior of your environment using ‘dynamic threshold’ technology. As long as an infrastructure component (VM, datastore, cluster, network adapter, etc.) performs as expected (within its normal bandwidth), no alarms are generated and the component is reported healthy. Based on historical results, a component’s bandwidth/dynamic threshold can vary during the day, e.g. when a backup runs at night; vCops treats this as normal behavior too. A low(er) health score, for example lower than 75, might point to a problem and calls for some additional investigation.
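To make the idea of a time-dependent ‘normal bandwidth’ a bit more concrete, here is a minimal Python sketch. It is not the algorithm vCops uses (that one is proprietary and far more sophisticated). It only assumes that a band is learned per hour of the day as mean plus or minus a few standard deviations. The names `learn_bands` and `is_anomalous`, the factor `k` and the sample numbers are all made up for illustration.

```python
from collections import defaultdict
from statistics import mean, stdev


def learn_bands(samples, k=2.0):
    """Learn a (low, high) band per hour of day from (hour, value) samples.

    Illustrative only: band = mean +/- k * standard deviation.
    """
    by_hour = defaultdict(list)
    for hour, value in samples:
        by_hour[hour].append(value)

    bands = {}
    for hour, values in by_hour.items():
        if len(values) < 2:
            continue  # not enough history to estimate a spread
        m, s = mean(values), stdev(values)
        bands[hour] = (m - k * s, m + k * s)
    return bands


def is_anomalous(bands, hour, value):
    """Return True when value falls outside the learned band for that hour."""
    if hour not in bands:
        return False  # nothing learned yet, so nothing to compare against
    low, high = bands[hour]
    return value < low or value > high


# Hypothetical history: a metric that is around 500 at 14:00
# and around 900 at 02:00 (when the nightly backup runs).
history = [(14, 500), (14, 502), (14, 498), (14, 500),
           (2, 900), (2, 910), (2, 890), (2, 900)]
bands = learn_bands(history)

print(is_anomalous(bands, 2, 900))   # False: high, but normal at night
print(is_anomalous(bands, 14, 900))  # True: the same value is abnormal at 14:00
print(is_anomalous(bands, 14, 450))  # True: a drop also leaves the band
```

Note that the band is two-sided in this sketch: a value that is lower than usual counts as a deviation just like a value that is higher than usual.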
After optimizing a couple of virtual machines (by decreasing the amount of memory configured for each VM) using the vCops “looking for oversized VMs” report, I noticed the health of one of the datastores had decreased and was showing a yellow value. Although not really a big problem, I wondered what had happened. By lowering the memory configuration for a couple of virtual machines, the datastore usage parameter decreased, because the size of each virtual machine’s VSWP file decreased. Because the datastore usage parameter moved outside its normal bandwidth, the health score decreased as well. That is a bit weird, because the available free space on my datastore increased (which is a positive thing). In the end there is nothing to do here: vCops will learn that the lower datastore usage is normal behavior, and the health score will increase again after a while.
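The effect on the datastore is easy to estimate, because the VSWP file of a VM is sized as its configured memory minus its memory reservation. A small Python sketch of that arithmetic follows. The function names and the numbers (three VMs going from 16 GB to 8 GB, no reservation) are hypothetical, and the sketch assumes the swap files live on the datastore in question.

```python
def vswp_size_gb(configured_gb, reservation_gb=0.0):
    """Size of a VM's .vswp file: configured memory minus memory reservation."""
    return max(configured_gb - reservation_gb, 0.0)


def datastore_used_delta_gb(changes):
    """Change in datastore usage caused by resizing VM memory.

    changes: list of (old_memory_gb, new_memory_gb, reservation_gb) per VM.
    A negative result means the datastore usage went down.
    """
    return sum(
        vswp_size_gb(new, res) - vswp_size_gb(old, res)
        for old, new, res in changes
    )


# Hypothetical example: three VMs reduced from 16 GB to 8 GB, no reservation.
changes = [(16, 8, 0.0), (16, 8, 0.0), (16, 8, 0.0)]
print(datastore_used_delta_gb(changes))  # -24.0
```

A drop like this is good news for free space, but it is still a sudden move away from the usage vCops had learned for the datastore.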
Another interesting thing to look at in this case was the ‘all metrics’ option for the selected datastore object in vCops. This option lists a lot (really, a lot!) of metrics for the selected object, in this case a datastore.
Did you ever notice that the parameters appear blue or yellow in the list? Blue means there’s nothing wrong; yellow indicates the parameter might need some attention:
In this example you see that batch, devices and disk space require attention. Looking at the disk space object, you see that object 113 (which is actually a virtual machine) needs some attention. In this case Virtual Machine Used GB decreased significantly (because of the smaller VSWP file), which is why this parameter is yellow.
Now about object ‘113’: how do you know this is a virtual machine, and even better, which virtual machine? Well, just paste the ID in the vCops URL and you see the object:
It would be better if this name were shown in the original screen, but it’s pretty simple to trace it back.