
Troubleshooting Guide for Hardware Health for ESX host Server Polled Through vCenter

Overview

A VMware ESX host shows one or more issues with Hardware Health.

Environment

NPM 11.5 +

Steps

  1. This guide helps you troubleshoot problems when polling Hardware Health information on ESX host servers that are polled through a vCenter server (Management Server).
     

    Sometimes a hardware sensor (hard disk, fan, and so on) appears in the SolarWinds Web Console in a critical or warning state.
     

    However, even after the disk is replaced, the hardware health sensor can still display as critical or warning within SolarWinds. This is a common occurrence when ESX hosts are polled through vCenter. SAM polls this information via the VMware API, which can fail to update itself or can report a false positive.

    It appears that vCenter sometimes fails to update, or that there is a significant delay before the hardware sensors are updated and the warning messages are cleared from the VMware API.

    To verify whether the status shown by SolarWinds Hardware Health is consistent with the status reported by the VMware API, follow this guide.

     

  2. To see whether your ESX host is polled through vCenter or polled directly, check the Virtualization Settings page (Settings > Virtualization Settings).

     

    vCenterSettings.png



    To access the VMware API:

    Use this URL: https://your_vCenter_server/mob (replace 'your_vCenter_server' with the vCenter server's IP address or host name).
     

    This opens the Managed Object Browser page.
     

    vCenterTroubleshooting1.png
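
If you check several vCenter servers, the URL above can be generated rather than typed. The following is a minimal Python sketch, not part of SolarWinds or VMware tooling. The function name `mob_url` and the example server names are hypothetical and used only for illustration.

```python
def mob_url(server: str) -> str:
    """Return the Managed Object Browser URL for a vCenter host name or IP."""
    server = server.strip()
    # Accept input with or without a scheme, e.g. "https://vc01/" or "vc01".
    if "://" in server:
        server = server.split("://", 1)[1]
    # Drop any trailing slash or path so only the host part remains.
    server = server.split("/", 1)[0]
    if not server:
        raise ValueError("server must be a host name or IP address")
    return f"https://{server}/mob"


if __name__ == "__main__":
    # Hypothetical server names; replace with your own vCenter server.
    print(mob_url("vcenter01.example.local"))
    print(mob_url("https://192.0.2.10/"))
```

Open the printed URL in a browser and log in with vCenter credentials, as described above.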

     

    Now we have to find the ESX host object. It is slightly complicated, but you can reach it by selecting the following properties:

    Content -> rootFolder -> childEntity (choose datacenter) -> hostFolder -> childEntity (choose domain) -> host (choose ESX host)
     

    You should see a page similar to this:

    vCenterTroubleshooting2.png

     

    Now let's find the ESX host's Hardware Health information:

    runtime -> healthSystemRuntime -> systemHealthInfo -> numericSensorInfo
     

    You should see a page similar to this:

    vCenterTroubleshooting3.png
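
When the sensor list is long, it helps to look only at entries whose health state is not green. The sketch below shows that filtering step in Python on hand-written sample data. The field names (`name`, `sensorType`, `healthState` with a `key`) are modeled on what the numericSensorInfo page typically displays, and the sensor names and values are hypothetical, not taken from a real host.

```python
# Hypothetical sample shaped like entries under numericSensorInfo in the MOB.
SAMPLE_SENSORS = [
    {"name": "Fan 1", "sensorType": "fan",
     "healthState": {"key": "green", "label": "Green"}},
    {"name": "Power Supply 2", "sensorType": "power",
     "healthState": {"key": "red", "label": "Red"}},
    {"name": "Disk Bay 3", "sensorType": "other",
     "healthState": {"key": "yellow", "label": "Yellow"}},
]


def unhealthy(sensors):
    """Return (name, state key) for every sensor whose state is not green."""
    return [
        (s["name"], s["healthState"]["key"])
        for s in sensors
        if s["healthState"]["key"].lower() != "green"
    ]


if __name__ == "__main__":
    for name, state in unhealthy(SAMPLE_SENSORS):
        print(f"{name}: {state}")
```

Any sensor this kind of check flags in the API is one that SolarWinds will also report, even if the hardware itself has already been repaired.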

     

    Besides the numeric sensors, we monitor a number of other sensors in other locations.

    The following alternative paths can be used in place of the numeric sensor path:

    runtime -> healthSystemRuntime -> hardwareStatusInfo -> cpuStatusInfo

    runtime -> healthSystemRuntime -> hardwareStatusInfo -> memoryStatusInfo

    runtime -> healthSystemRuntime -> hardwareStatusInfo -> storageStatusInfo

    hardware -> systemInfo -> otherIdentifyInfo

    config -> storageDevice -> scsiLun
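
The property paths listed above can also be turned into direct browser links instead of clicking through each level. The Python sketch below assumes the Managed Object Browser accepts `moid` and `doPath` query parameters, which is how its own links are usually formed; verify this against your vCenter version. The server name and the managed object ID `host-123` are hypothetical; use the ID shown on your ESX host's page in the MOB.

```python
from urllib.parse import urlencode

# Property paths from this guide, written the same way as in the text.
HEALTH_PATHS = [
    "runtime -> healthSystemRuntime -> systemHealthInfo -> numericSensorInfo",
    "runtime -> healthSystemRuntime -> hardwareStatusInfo -> cpuStatusInfo",
    "runtime -> healthSystemRuntime -> hardwareStatusInfo -> memoryStatusInfo",
    "runtime -> healthSystemRuntime -> hardwareStatusInfo -> storageStatusInfo",
    "hardware -> systemInfo -> otherIdentifyInfo",
    "config -> storageDevice -> scsiLun",
]


def to_dotted(path: str) -> str:
    """Convert 'a -> b -> c' into the dotted form 'a.b.c'."""
    return ".".join(part.strip() for part in path.split("->"))


def mob_property_url(server: str, moid: str, path: str) -> str:
    """Build a MOB link for one property path (moid/doPath are assumed)."""
    query = urlencode({"moid": moid, "doPath": to_dotted(path)})
    return f"https://{server}/mob/?{query}"


if __name__ == "__main__":
    # Hypothetical server and host ID, for illustration only.
    for p in HEALTH_PATHS:
        print(mob_property_url("vcenter01.example.local", "host-123", p))
```

If a generated link does not open the expected page, fall back to navigating the properties manually as described above.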

 

Example 1:
ESX host showing a "Memory Critical" issue.

 

Please note: There are more properties than numericSensorInfo. Check the one that is relevant to the Warning / Critical message you are seeing for your ESX host.
 

For this example, follow these properties:

 runtime -> healthSystemRuntime -> hardwareStatusInfo -> memoryStatusInfo

Error that appears in the SAM Web Console:

hmc.PNG

 

But when we check the host itself, there appears to be no issue.

vmactual.PNG

 

But when we dig down into the VMware API, we can see that it is reporting a 'Red' status, which is why the warning message is displayed in the SAM Web Console.

 

vmpagediag.PNG

 

 

Example 2:
Power supply replaced, but the status still shows RED.


When checking the host via vCenter, there appears to be no issue.

repl.PNG

 

SAM is reporting an issue with one of the power supply units.

psupactul.PNG


Again, once we check the VMware API, which is where SAM actually polls this information from, we can see a 'Red' warning message being flagged in the API.

powsuphtml.PNG

 

Resolution:

Option 1: 

Contact VMware directly to see whether there is a way to clear the incorrect statuses from the API. They can assist you from their side; unfortunately, we are unable to provide a workaround at this time for the incorrect states reported by the API.

 

Option 2:

Try polling the ESX / ESXi host directly using the CIM protocol. Steps are available in the article "Change a host to poll through vCenter vs Poll ESX server directly".

 

 

 

 

 

Last modified
01:59, 16 Jun 2017
