vROps: Building an overview dashboard for a stretched cluster datacenter

Posted by: viktorious on April 25, 2019, in vRealize Operations

One of the powerful features of vRealize Operations is that you can build your own custom dashboards. You can drag and drop any of the 44 different widgets onto an interactive canvas and display the information that is relevant for you. You can build interactive dashboards that are used for troubleshooting purposes, or you can create dashboards that are displayed on a “Big Screen” in your operations center.

In this article I will demonstrate how to build a dashboard that shows the status of a stretched cluster environment. In this example the dashboard is linked to a lab that includes 4 hosts (2 per site) and 14 VMs, but of course the dashboard can also be used in bigger environments.

Let’s have a look at the dashboard:

In the dashboard, the primary datacenter is displayed on the left and the secondary datacenter on the right.

The dashboard includes:

A heatmap widget that provides information on the availability of the VMs in each datacenter leveraging a SuperMetric that comes from the Operationalize your World program. More details on this SuperMetric are included in this article (just read on…);
A heatmap that provides information on the ESXi hosts: are the hosts up and running and connected to the vCenter Server;
An object list widget that shows the ESXi hosts that are in maintenance mode;
A scoreboard widget that provides information on the CPU & Memory workload on a site level (per site).

The dashboard in action

VM Down

So, let’s see how the dashboard updates if the status of some of the objects in your datacenter changes. Let’s say a virtual machine is shut down or powered off, so the VM is not available. Availability for this VM (or these VMs) will change and we will notice this in the dashboard:

As you can see, vm06 and vm10 are not available.

ESXi host in maintenance mode

Let’s put an ESXi host in maintenance mode. In this case the ESXi host is still running but cannot be used. The hostname of the ESXi host will automatically pop up in the “Hosts in Maintenance Mode” object list:

ESXi host not available

If a host is not available (disconnected/powered-off) it will turn red in the Host Availability heatmap:

In this particular situation the host is unavailable, and its last known status was that it was in maintenance mode.

Per site CPU and Memory Workload

The last example is around the scoreboard widget. This scoreboard widget provides information on the CPU and Memory workload:

Because this is a stretched cluster architecture, you don’t want the workload per site to be higher than 50%, because each site should be able to take over the other site’s workload in a failover. That’s the reason why all the counters turn red here: custom thresholds are set at 40, 43 and 45% workload.

Note: Because the dashboard runs in a nested lab environment I’ve got some high Workload values. In a real-life scenario these counters should be lower, and preferably lower than 40-45%.
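The 50% rule can be illustrated with a little arithmetic. The sketch below is a simplification and not part of the dashboard: it assumes both sites have equal capacity, so the surviving site's workload after a failover is simply the sum of the two per-site percentages. The function names are hypothetical.

```python
def failover_workload(surviving_site_pct: float, failed_site_pct: float) -> float:
    """Workload (%) on the surviving site after it absorbs the failed site's load.

    Assumption: both sites have equal capacity, so the percentages can be added.
    """
    return surviving_site_pct + failed_site_pct


def can_survive_site_failure(primary_pct: float, secondary_pct: float) -> bool:
    """True if one site could carry the combined load without exceeding 100%."""
    return failover_workload(primary_pct, secondary_pct) <= 100.0


print(can_survive_site_failure(40.0, 45.0))  # True: 85% on the surviving site
print(can_survive_site_failure(55.0, 60.0))  # False: 115% on the surviving site
```

With both sites below 50%, the combined load always stays under 100%, which is why the thresholds on the scoreboard sit a little below that mark.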

Behind the scenes

Let’s have a look at how this dashboard was built. Notice that I used vROps 7.5, but it will probably work in earlier versions of vROps as well.

Custom Datacenter

An important construct that is used in the dashboard is a Custom Datacenter. The Custom Datacenter construct is used to create the sites in the stretched cluster architecture:

The Primary DC Custom Datacenter contains hosts 1 & 2;
The Secondary DC Custom Datacenter contains hosts 3 & 4.

Other objects/components that are linked to these Custom Datacenters are automatically included as “descendants”. It’s important to construct these Custom Datacenters first. This is a static configuration, but only for the ESXi hosts included per site; VMs and other components (the descendants) are included dynamically. The Custom Datacenter option is available under Environment–>Custom Datacenter.
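To make the static/dynamic distinction concrete, here is a minimal conceptual model. It is only a sketch of the idea, not how vROps stores Custom Datacenters; the class names, host names and VM names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Host:
    name: str
    vms: List[str] = field(default_factory=list)  # VMs currently running on this host


@dataclass
class CustomDatacenter:
    name: str
    hosts: List[Host]  # static membership: hosts are assigned by hand

    def descendant_vms(self) -> List[str]:
        """VMs are not assigned directly; they are derived from the member hosts."""
        return sorted(vm for host in self.hosts for vm in host.vms)


esx01 = Host("esx01", ["vm01", "vm02"])
esx02 = Host("esx02", ["vm03"])
primary = CustomDatacenter("Primary DC", [esx01, esx02])
print(primary.descendant_vms())  # ['vm01', 'vm02', 'vm03']

# A VM that lands on esx02 shows up as a descendant without touching the datacenter.
esx02.vms.append("vm04")
print(primary.descendant_vms())  # ['vm01', 'vm02', 'vm03', 'vm04']
```

Only the host list is maintained by hand; everything below the hosts follows automatically.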

Configure the widgets

The widgets used in this dashboard are two heatmaps, an object list and a scoreboard. All the widgets are “self-provider” widgets; there’s no interaction between them. So enable the self-provider option on all the widgets and also enable the refresh content option.

For most of the widgets I didn’t use the Input Data option. With Input Data you can pre-select which objects you want to include in your widget. I prefer to use a more dynamic way to include objects through the Output Filter option. With Output Filter you can create a query that defines which objects should be included in your widget:

This query, for example, selects all the virtual machines that are descendants of the Custom Datacenter Primary DC (created earlier). Through the Custom Datacenter construct, this includes all VMs that are running in the primary DC of my stretched cluster architecture. To determine if a VM is up or down, I use the SuperMetric “Ops. VM Uptime”. This SuperMetric comes from the Operationalize your World program and determines if a VM is up and running based on its power state and its memory, network and disk activity. Read more about it in an article I published back in 2017. The definition for this SuperMetric is:

Click here to download this SuperMetric, so you can directly import it into vROps. Don’t forget to link the SuperMetric to your active policy/policies; read more about this here.

The availability of the ESXi hosts is provided through a heatmap where the System|Powered ON metric is used. To determine if a host is in Maintenance Mode and display this in the object list, the Runtime|Maintenance State property is used. This property is part of an Output Filter query:
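The two host checks can be sketched as plain filters over sample data. This is an illustration only: the field names, the 1/0 values for the powered-on metric and the `inMaintenance`/`notInMaintenance` property values are assumptions, not an export of real vROps data.

```python
# Sample host records; all field names and values are illustrative assumptions.
HOSTS = [
    {"name": "esx01", "site": "Primary DC", "powered_on": 1, "maintenance_state": "notInMaintenance"},
    {"name": "esx02", "site": "Primary DC", "powered_on": 1, "maintenance_state": "inMaintenance"},
    {"name": "esx03", "site": "Secondary DC", "powered_on": 0, "maintenance_state": "inMaintenance"},
    {"name": "esx04", "site": "Secondary DC", "powered_on": 1, "maintenance_state": "notInMaintenance"},
]


def hosts_in_maintenance(hosts, site):
    """Names of the hosts in a site whose last known state is maintenance mode."""
    return [h["name"] for h in hosts
            if h["site"] == site and h["maintenance_state"] == "inMaintenance"]


def unavailable_hosts(hosts, site):
    """Names of the hosts in a site that are not reporting as powered on."""
    return [h["name"] for h in hosts
            if h["site"] == site and h["powered_on"] != 1]


print(hosts_in_maintenance(HOSTS, "Primary DC"))    # ['esx02']
print(unavailable_hosts(HOSTS, "Secondary DC"))     # ['esx03']
print(hosts_in_maintenance(HOSTS, "Secondary DC"))  # ['esx03']
```

The two checks are independent, so a host such as the hypothetical esx03 can appear both as unavailable and as in maintenance mode when maintenance was its last reported state.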

To display the performance of the cluster per site, I’m using the Scoreboard widget with the CPU|Workload and Memory|Workload metrics. These metrics have a custom Color Method configured so the Scoreboard already turns yellow/orange/red at low values (around 40%). This is because you don’t want a load higher than 50% per site in a stretched cluster architecture.

Again the Custom Datacenter construct is used to filter hosts per site.
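The custom Color Method can be thought of as a simple threshold lookup. The sketch below assumes the 40, 43 and 45% thresholds map to yellow, orange and red respectively, and that a value equal to a threshold already takes that color; both are assumptions for illustration, and the function name is hypothetical.

```python
# Thresholds checked from highest to lowest; the color mapping is an assumption.
THRESHOLDS = [(45.0, "red"), (43.0, "orange"), (40.0, "yellow")]


def workload_color(workload_pct: float) -> str:
    """Return the scoreboard color for a CPU or Memory workload percentage."""
    for limit, color in THRESHOLDS:
        if workload_pct >= limit:
            return color
    return "green"


for value in (35.0, 41.5, 44.0, 62.0):
    print(value, workload_color(value))
```

Keeping all three thresholds below 50% gives a warning before a site reaches the point where a failover would overload the surviving site.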

Import and use the dashboard yourself

To get a full understanding of this dashboard, feel free to download it here. Don’t forget to import the SuperMetric as well, available for download here. Again, the SuperMetric has to be included in your active vROps policies. Don’t forget to configure two Custom Datacenters called Primary DC and Secondary DC, and add the ESXi hosts that are located in each of these two datacenters. It will take around 10 minutes before the dashboard displays useful information.

I hope this was helpful. Feel free to leave a comment if you have any questions or remarks. Happy dashboarding!

Notice: This dashboard was built for educational purposes. Although it will work in production datacenter environments, use it at your own risk. I cannot take any responsibility for any problems that might occur.

Tags: supermetric vrops
