vRealize Operations with its four main pillars:
- Optimize Performance
- Optimize Capacity
- Troubleshoot
- Manage Configuration
provides a perfect solution to manage complex SDDC environments.
The “Optimize Performance” part of vRealize Operations provides a wide range of features, such as workload optimization to ensure consistent performance in your datacenters, and VM rightsizing to reduce bottlenecks and get the best possible performance out of your workloads.
The ability of vROps to identify over- and undersized VMs and to carry out the operations required to adjust their configuration is one of its well-known features, accessible directly from the UI.
But what if you would like to rightsize your ESXi clusters? What information and features does vRealize Operations provide in this area?
What-If Analysis for Clusters
The What-If Analysis feature in the “Optimize Capacity” area is a quick and simple way to check the impact that adding or removing workloads will have on the capacity of a traditional or HCI ESXi cluster.
You can also run infrastructure-centric scenarios, like adding hosts to clusters or removing hosts from them.
These are all great features supporting proper capacity and performance management.
But how can you determine whether your clusters are sized correctly with respect to available capacity? What if you have a significant number of clusters? You probably do not want to run the scenarios for each and every cluster over and over again to get updated information.
vRealize Operations provides all the information you need for a quick and up-to-date insight into your environment, allowing you to take the necessary actions to adjust the sizing of your ESXi clusters and optimize your SDDC.
Recommended CPU, Memory and Disk Space Metrics
vRealize Operations constantly calculates recommended values for CPU, Memory and Disk Space based on the configured capacity models: Demand and, if activated, Allocation. The recommended capacity calculation takes into account vROps buffers, allocation ratios and Admission Control settings, giving you a fairly reliable indication of how to size your clusters.
These metrics can be used to calculate the actual number of ESXi hosts that could be safely removed from the cluster, or how many hosts need to be added to cope with the projected demand.
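To make the idea concrete, here is a minimal Python sketch of how a host count could be derived from a recommended total capacity. It is not the Super Metric formula used by the dashboard. The function names, parameters and example numbers are all hypothetical, and it assumes a uniform cluster in which the recommended capacity already includes buffers and Admission Control reservations.

```python
import math


def hosts_delta(recommended_total, per_host_capacity, current_hosts):
    """Return hosts to add (positive) or hosts that could be removed (negative).

    All arguments are illustrative inputs, not vROps metric names.
    """
    if per_host_capacity <= 0:
        raise ValueError("per_host_capacity must be positive")
    # Round up: a fraction of a host still requires a whole host.
    required_hosts = math.ceil(recommended_total / per_host_capacity)
    return required_hosts - current_hosts


def cluster_hosts_delta(cpu_recommended_ghz, mem_recommended_gb,
                        host_cpu_ghz, host_mem_gb, current_hosts):
    """Combine CPU and memory: the more demanding resource decides."""
    cpu_delta = hosts_delta(cpu_recommended_ghz, host_cpu_ghz, current_hosts)
    mem_delta = hosts_delta(mem_recommended_gb, host_mem_gb, current_hosts)
    return max(cpu_delta, mem_delta)


# Hypothetical cluster: 6 hosts with 60 GHz and 512 GB each.
print(cluster_hosts_delta(250, 2300, 60, 512, 6))
```

Taking the maximum of the two deltas is the conservative choice: a host is only counted as removable if neither CPU nor memory needs it.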
Cluster Rightsizing Dashboard
The dashboard and all required components can be downloaded from the VMware Code page:
My simple dashboard will give you detailed insights into the utilization and capacity of your clusters.
It will also provide recommendations regarding the optimal size of the cluster, which will help improve the efficiency of your environment.
This first version of the dashboard is limited to traditional (non-HCI) clusters; HCI clusters such as vSAN clusters are not supported.
Even though it shows all clusters (a filter will be added in the next version), please do not shrink vSAN clusters based on the information provided by this dashboard.
Only CPU and Memory Demand metrics are processed to conduct the rightsizing.
Before removing ESXi hosts from a cluster, I highly recommend putting them into maintenance mode for some period of time and assessing the performance of the workloads. An additional What-If analysis based on the numbers provided by the dashboard helps build confidence in uncertain situations.
In addition to the metrics provided out of the box, we need a few Super Metrics to calculate the actual number of hosts to add or remove. It is important to note that the calculation only works properly for uniform clusters. That means all ESXi hosts within a cluster have the same sizing: the same CPU speed and number of cores, and the same memory configuration.
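The uniformity requirement can be checked before trusting any host count. The following Python sketch illustrates such a check on a hand-built inventory. The `HostSpec` type and its fields are hypothetical and are not part of vROps or the dashboard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostSpec:
    """Hypothetical description of one ESXi host's sizing."""
    cpu_cores: int
    cpu_mhz: int
    memory_gb: int


def is_uniform(hosts):
    """True if every host in the cluster has an identical sizing."""
    # Frozen dataclasses are hashable, so distinct specs collapse in a set.
    return len(set(hosts)) <= 1


uniform = [HostSpec(32, 2600, 512)] * 4
mixed = uniform + [HostSpec(24, 2400, 384)]
print(is_uniform(uniform), is_uniform(mixed))
```

For a mixed cluster, a per-host capacity figure is ambiguous, which is why a simple division of recommended capacity by host size cannot be relied on there.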
The list view used in the dashboard displays all clusters in the selected vSphere Datacenter.
In the last column you will see the number of ESXi hosts you should either add to the cluster to ensure sufficient capacity, or could potentially remove from it.
Before you start removing hosts from clusters, you can also run a What-If scenario to check the remaining capacity and the capacity projection.
In my example the dashboard indicates that I could remove one host from the wdcc02 cluster.
If we run the scenario, we see that, from the demand perspective, the cluster still provides sufficient capacity to run the current workloads.
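As a rough back-of-the-envelope companion to such a scenario, the Python sketch below estimates cluster utilization after hosts are removed. It is a simplification with made-up numbers, not a reimplementation of What-If Analysis: it ignores buffers, Admission Control and the capacity projection, and the function name and parameters are hypothetical.

```python
def utilization_after_removal(demand, per_host_capacity,
                              current_hosts, hosts_removed=1):
    """Fraction of remaining capacity consumed by the current demand."""
    remaining_hosts = current_hosts - hosts_removed
    if remaining_hosts <= 0:
        raise ValueError("at least one host must remain in the cluster")
    return demand / (remaining_hosts * per_host_capacity)


# Hypothetical numbers: 180 GHz demand on 6 hosts with 60 GHz each.
ratio = utilization_after_removal(180, 60, 6)
print(f"CPU utilization after removing one host: {ratio:.0%}")
```

A result well below 100% only suggests that demand would still fit; the What-If scenario in vROps remains the authoritative check.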
Happy rightsizing and stay safe.
Thomas – https://twitter.com/ThomasKopton