As you probably know, vRealize Operations provides several symptom definitions based on message events as part of the vCenter Solution content OOTB. You can see some of them in the next picture.
These events are used in alert definitions to raise vRealize Operations alarms any time one of those events is triggered in any of the managed vCenter instances.
If you take a look into vCenter events in the Monitoring tab or check the available events as presented by the VMware Event Broker Appliance (VEBA) integration, you will see that there are tons of other events you may want to use to raise alerts.
Unfortunately, this is not always as easy as creating a new message event symptom definition in vROps. Not every event is intercepted by vRealize Operations.
Now, you could of course use VEBA to run functions triggered by such events and let the functions raise alerts, create tickets, etc. This is definitely a great option and how to do that using VEBA functions and vROps is something I am planning to describe in an upcoming blog post. But there are also other ways to achieve that.
If you run vRealize Log Insight integrated with vRealize Operations in your environment (a highly recommended setup), you have another, very easy option to raise alerts on any available vCenter event, as long as that event is logged by the vCenter instance. That should be the case for all or at least the majority of the events.
In the next picture, you see all the various events I have received in my vRLI coming from vCenter in the last 48 hours. For better visibility, I have excluded all events generated by vim.event.eventex.
To search, filter, and display such events using their type I have created the following extracted field in vRLI:
This extracted field now makes it easy to create alert definitions in vRLI.
Let us assume our use case is: “I need an alert every time any vSphere cluster configuration has been changed.”
The corresponding event in vCenter is created by vim.event.ClusterReconfiguredEvent and sent to vRLI as a log message.
And this is the corresponding log message in vRLI after I changed the DRS configuration of one of my clusters.
To get such events as an alarm in vRealize Operations, we generally need two things:
- vRealize Operations integration in vRealize Log Insight. With that integration, vRLI can map the vCenter objects that are the sources of messages to their counterparts in vROps. This lets vRLI forward alarms to vROps and attach them to exactly the object that is the original source of the message received by vRLI.
- An alert definition in vRealize Log Insight that is triggered every time an event of type vim.event.ClusterReconfiguredEvent is received and that forwards the alarm to vROps. For this alert definition we will use the extracted field described in figure 4.
But there is still a little more work to do before the solution really fulfills our requirement: get an alert every time a cluster configuration change happens.
Let us assume the following situation. Someone is changing the configuration of one or several clusters very frequently. Our vRLI alert definition looks as shown in the next picture.
Figure 6: First alert definition in vRealize Log Insight.
And if we now run a query on this alert definition we will see that vRLI is properly triggering the alarms. In the picture, we see the three alarms raised because of three changes during 10 minutes.
Figure 7: Event messages triggering alarms in vRLI.
The problem with the vROps integration is that the first alarm is properly forwarded to vROps and raises an alarm on the vCenter instance, but any subsequent alarm is not reflected in vROps as long as the first alarm is still in the “open” state. We see the first alarm in vROps in the next figure.
This behavior is due to the notification event text being the same for every alarm. In that case, vROps assumes that the next occurrence reports the same issue, so there is no need to raise another, duplicate alarm. In our case, the notification event text is the name of the alarm as defined in the vRLI alert definition: tkopton-ClusterConfigChanged.
To change this behavior we need to include unique information for every alarm in the alarm name.
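To make the effect easier to see, here is a toy model in Python of the behavior described above. It is only a sketch under assumptions, not how vROps is actually implemented: the class `AlarmStore`, the helper `count_raised`, and the resource name `vcenter01` are hypothetical and exist only for this illustration.

```python
class AlarmStore:
    """Toy model (assumption): at most one open alarm per (resource, text)."""

    def __init__(self):
        self._open = set()

    def notify(self, resource, text):
        """Return True if a new alarm is raised, False if it is treated as a repeat."""
        key = (resource, text)
        if key in self._open:
            return False  # same text while the first alarm is still open
        self._open.add(key)
        return True


def count_raised(texts, resource="vcenter01"):
    """Feed the notification texts into a fresh store and count the raised alarms."""
    store = AlarmStore()
    return sum(store.notify(resource, text) for text in texts)


# Three notifications with the same text versus three with a unique suffix.
same_text = ["tkopton-ClusterConfigChanged"] * 3
unique_text = [f"tkopton-ClusterConfigChanged-{n}" for n in (1, 2, 3)]

print(count_raised(same_text))    # 1
print(count_raised(unique_text))  # 3
```

With identical texts only the first notification results in an alarm; as soon as each text carries something unique, every notification does.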
What we can do is customize the alert name by including a field or an extracted field in the format ${field-name}.
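The idea behind such a placeholder can be sketched with Python's `string.Template`, which happens to use the same `${...}` notation. This is only an illustration of the substitution, not vRLI's own mechanism. The field name `vc_event_id`, the name pattern, and all ID values except the first (which is taken from the sample message further down) are assumptions.

```python
from string import Template

# Hypothetical alert-name pattern; "vc_event_id" is an assumed extracted-field name.
ALERT_NAME = Template("tkopton-ClusterConfigChanged-${vc_event_id}")


def alert_name(fields):
    """Expand the placeholder with the field values of one log message."""
    return ALERT_NAME.substitute(fields)


# The second and third IDs are made up for illustration.
names = [alert_name({"vc_event_id": event_id})
         for event_id in ("9150192", "9150201", "9150230")]

for name in names:
    print(name)
```

Every message now yields a different alert name, which is exactly the property needed to avoid the duplicate suppression in vROps.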
The challenge is to find such unique information in the event log message. Let’s see what we have. This is a sample event message as received in vRLI:
2021-11-14T10:18:06.795866+00:00 vcenter01 vpxd 7888 - - Event [9150192] [1-1] [2021-11-14T10:18:06.795507Z] [vim.event.ClusterReconfiguredEvent] [info] [VSPHERE.LOCAL\Administrator] [Datacenter-01] [9150191] [Reconfigured cluster CL01 in datacenter Datacenter-01
Modified:
configurationEx.drsConfig.enableVmBehaviorOverrides: false -> true;
configurationEx.proactiveDrsConfig.enabled: false -> true;
Added:
Deleted:
]
It looks like every event has a unique event ID (the key property as described in the vSphere API documentation). I have created an extracted field for the event ID:
This extracted field can now be used as part of the name in the alert definition, which will make every occurrence unique in vROps. In the next picture, you can see the modified alert definition in vRLI.
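The field definition itself lives in the vRLI UI. Outside vRLI, the same extraction can be approximated with a regular expression, as in the following minimal Python sketch. It assumes that the first bracketed number after the word Event is the event ID and that the event type is the fourth bracketed value, which matches the sample message above; the pattern and the names `EVENT_RE` and `parse_event` are my own and not taken from vRLI.

```python
import re

# First line of the sample message shown above.
SAMPLE = (
    "2021-11-14T10:18:06.795866+00:00 vcenter01 vpxd 7888 - - "
    "Event [9150192] [1-1] [2021-11-14T10:18:06.795507Z] "
    "[vim.event.ClusterReconfiguredEvent] [info] "
    "[VSPHERE.LOCAL\\Administrator] [Datacenter-01] [9150191] "
    "[Reconfigured cluster CL01 in datacenter Datacenter-01"
)

# Assumed layout: Event [id] [sequence] [timestamp] [type] ...
EVENT_RE = re.compile(
    r"Event \[(?P<event_id>\d+)\] "
    r"\[[^\]]*\] "  # sequence marker such as 1-1
    r"\[[^\]]*\] "  # event timestamp
    r"\[(?P<event_type>vim\.event\.[A-Za-z0-9_.]+)\]"
)


def parse_event(line):
    """Return event_id and event_type as a dict, or None if the line does not match."""
    match = EVENT_RE.search(line)
    return match.groupdict() if match else None


print(parse_event(SAMPLE))
```

For the sample message this yields the event ID 9150192 and the type vim.event.ClusterReconfiguredEvent; lines that do not follow the assumed layout return None.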
Let’s do some vSphere cluster reconfigurations.
And this is what it looks like in vROps after vRLI forwarded these alarms to vRealize Operations. First, we check the symptoms; see the next picture.
Figure 12: Notification event symptoms in vRealize Operations.
And here we see the corresponding alarms in vROps.
With these alarms, you could now create vROps notifications, start webhook-triggered actions, parse the content, and automate the remediation. Yes, we still have some room for improvement, especially around the alert name in vRLI using the extracted field, but the approach described here is sufficient for many use cases I have worked with.
Have fun implementing your use cases.
Stay safe.
Thomas – https://twitter.com/ThomasKopton