One of the powerful features of vRealize Operations is that you can build your own custom dashboards. You can drag and drop one of the 44 different widgets on an interactive canvas and display the information that is relevant for you. You can build interactive dashboards that are used for troubleshooting purposes, or you can create dashboards that are displayed on a “Big Screen” in your operations center.
In this article I will demonstrate how to build a dashboard that shows the status of a stretched cluster environment. In this example the dashboard is linked to a lab with 4 hosts (2 per site) and 14 VMs, but the dashboard can of course also be used in larger environments.
Let’s have a look at the dashboard:
In the dashboard the primary datacenter is displayed on the left, the secondary datacenter is displayed on the right.
The dashboard includes:
- A heatmap widget that provides information on the availability of the VMs in each datacenter leveraging a SuperMetric that comes from the Operationalize your World program. More details on this SuperMetric are included in this article (just read on…);
- A heatmap that provides information on the ESXi hosts: are the hosts up and running and connected to the vCenter Server;
- An object list widget that shows the ESXi hosts that are in maintenance mode;
- A scoreboard widget that provides information on the CPU & Memory workload per site.
The dashboard in action
So, let’s see how the dashboard updates when the status of some of the objects in your datacenter changes. Let’s say a virtual machine is shut down or powered off, so the VM is not available. The availability of that VM will change, and we will notice this in the dashboard:
As you can see, vm06 and vm10 are not available.
ESXi host in maintenance mode
Let’s put an ESXi host in maintenance mode. In this case the ESXi host is still running but cannot be used. The hostname of the ESXi host will automatically pop up in the “Hosts in Maintenance Mode” object list:
ESXi host not available
If a host is not available (disconnected/powered-off) it will turn red in the Host Availability heatmap:
In this particular situation the host is unavailable, and its last known status was maintenance mode.
Per site CPU and Memory Workload
Because it’s a stretched cluster architecture, you don’t want the workload to be higher than 50% per site: each site must be able to take over the workload of the other site during a failover. That’s why all the counters turn red here. Custom thresholds are set at 40, 43 and 45% of workload.
Note: Because the dashboard runs in a nested lab environment I’ve got some high Workload values. In a real-life scenario these counters should be lower, preferably lower than 40-45%.
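The reasoning behind these thresholds can be sketched in a few lines of Python. This is only an illustration and not part of vROps: the function names are hypothetical, the assignment of 40/43/45% to yellow/orange/red is an assumption about how the three thresholds map to colors, and the failover estimate assumes both sites have equal capacity.

```python
# Hypothetical helpers for illustration; none of these names exist in vROps.

# Assumed mapping of the three custom thresholds to scoreboard colors,
# ordered from the highest threshold to the lowest.
THRESHOLDS = [(45.0, "red"), (43.0, "orange"), (40.0, "yellow")]


def workload_color(workload_pct: float) -> str:
    """Return the color for a per-site workload percentage."""
    for limit, color in THRESHOLDS:
        if workload_pct >= limit:
            return color
    return "green"


def workload_after_failover(primary_pct: float, secondary_pct: float) -> float:
    """Estimate the surviving site's workload if the other site fails.

    Assumes both sites have equal capacity, so the percentages add up.
    """
    return primary_pct + secondary_pct


for pct in (35.0, 41.0, 44.0, 48.0):
    print(f"{pct:.0f}% workload -> {workload_color(pct)}")

# Two sites at 48% each would need 96% of a single site after a failover.
print(workload_after_failover(48.0, 48.0))
```

Under these assumptions, any site above 50% would leave the surviving site overcommitted after a failover, which is why the colors change well before that point.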
Behind the scenes
An important construct that is used in the dashboard is a Custom Datacenter. The Custom Datacenter construct is used to create the sites in the stretched cluster architecture:
- The Primary DC Custom Datacenter contains host 1 & 2;
- The Secondary DC Custom Datacenter contains host 3 & 4.
Other objects/components that are linked to these Custom Datacenters are automatically included as “descendants”. It’s important to construct these Custom Datacenters first. This is a static configuration, but only for the ESXi hosts included per site. VMs and other components (the descendants) are included dynamically. The Custom Datacenter option is available under Environment–>Custom Datacenter.
Configure the widgets
The widgets used in this dashboard are two heatmaps, an object list and a scoreboard. All the widgets are “self-provider” widgets; there’s no interaction between the widgets. So enable the self-provider option on all the widgets and also enable the refresh content option.
For most of the widgets I didn’t use the Input Data option. With Input Data you can pre-select which objects you want to include in your widget. I prefer to use a more dynamic way to include objects through the Output Filter option. With Output Filter you can create a query that defines which objects should be included in your widget:
This query, for example, selects all the virtual machines that are descendants of the Custom Datacenter Primary DC (created earlier). Through the Custom Datacenter construct, this includes all VMs that are running in the primary DC of my stretched cluster architecture. To determine if a VM is up or down, I use the SuperMetric “Ops. VM Uptime”. This is a SuperMetric that comes from the Operationalize your World program and can determine if a VM is up and running based on its power state and its memory, network and disk activity. Read more about it in an article I published back in 2017. The definition for this SuperMetric is:
The availability of the ESXi hosts is provided through a heatmap where the System|Powered ON metric is used. To determine if a host is in Maintenance Mode and display this in the object list, the Runtime|Maintenance State property is used. This property is part of an Output Filter query:
To display the performance of the cluster per site, I’m using the Scoreboard widget with the CPU|Workload and Memory|Workload metrics. These metrics have a custom Color Method configured so the Scoreboard already turns yellow/orange/red at low values (around 40%). This is because you don’t want a load higher than 50% per site in a stretched cluster architecture.
Again the Custom Datacenter construct is used to filter hosts per site.
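To make the selection logic of these Output Filters more concrete, here is a small Python model of it. This is a sketch, not vROps query syntax and not the vROps API: the Resource class, the function names and the sample inventory (esx01, vm01, and so on) are all hypothetical, and the property value "inMaintenance" is an assumption about how the maintenance state is reported.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """Hypothetical stand-in for an object in the vROps inventory."""
    name: str
    kind: str  # e.g. "VirtualMachine" or "HostSystem"
    ancestors: set = field(default_factory=set)  # names of objects above it
    properties: dict = field(default_factory=dict)


def descendants_of(resources, custom_dc, kind):
    """Select objects of one kind that sit below the given Custom Datacenter."""
    return [r for r in resources if r.kind == kind and custom_dc in r.ancestors]


def hosts_in_maintenance(resources, custom_dc):
    """Select the hosts of one site whose maintenance property is set."""
    return [
        h for h in descendants_of(resources, custom_dc, "HostSystem")
        # "inMaintenance" is an assumed property value.
        if h.properties.get("Runtime|Maintenance State") == "inMaintenance"
    ]


# Hypothetical sample inventory with one host and one VM per site.
inventory = [
    Resource("esx01", "HostSystem", {"Primary DC"},
             {"Runtime|Maintenance State": "notInMaintenance"}),
    Resource("esx03", "HostSystem", {"Secondary DC"},
             {"Runtime|Maintenance State": "inMaintenance"}),
    Resource("vm01", "VirtualMachine", {"Primary DC", "esx01"}),
    Resource("vm02", "VirtualMachine", {"Secondary DC", "esx03"}),
]

print([r.name for r in descendants_of(inventory, "Primary DC", "VirtualMachine")])
print([h.name for h in hosts_in_maintenance(inventory, "Secondary DC")])
```

The point of the model is that only the hosts are assigned to a site by hand. A VM that moves to a host in the other site picks up a different ancestor and therefore shows up in the other site's widgets without any change to the filter.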
Import and use the dashboard yourself
To get a full understanding of this dashboard, feel free to download it here. Don’t forget to import the SuperMetric as well, available for download here. Note that the SuperMetric has to be included in your active vROps policies. Also configure two Custom Datacenters called Primary DC and Secondary DC, and add the ESXi hosts of each site to the corresponding Custom Datacenter. It will take around 10 minutes before the dashboard displays useful information.
I hope this was helpful. Feel free to leave a comment if you have any questions or remarks. Happy dashboarding!
Notice: This dashboard was built for educational purposes. Although it will work in production datacenter environments, use it at your own risk. I cannot take any responsibility for any problems that might occur.