Maintaining your vSphere environment is key to keeping your users happy. Keeping the environment secure, stable, and performing at its best is essential. One tool that stands out for this task is the VMware Skyline Health Diagnostics Appliance. This tool assesses and makes recommendations for vCenter, ESXi, and vSAN to ensure the best performance.
Installing and Maintaining the Skyline Health Diagnostics Appliance
The Skyline Health Diagnostics OVA image is available for vSphere 6.5 and above. Once downloaded, you can deploy it to a vCenter and then connect with a browser at https://vmware-shd_ip_address_or_fqdn. The tool’s version and compatibility database are both updated frequently. Before collecting and analyzing log bundles, I always update both so I have the most current information:
- Settings > Tool Update > Check Tool Updates
- Settings > VCG Update > Update VCG Database (this process takes some time)
Collect and Analyze Log Bundles
The tool can detect issues in both vSphere and vSAN environments, and it points to the KB articles that resolve any issues it finds. It also compares your installed driver and firmware versions against the VMware Compatibility Guide (VCG) database. The first step is either to run 'Collect Logs & Analyze' or to upload existing log bundles for analysis.
- You can choose which plugins to run: Diagnostics, VMware Security Advisory, or vSAN Health. I typically choose all 3.
- With the 'Collect Logs & Analyze' function, you can choose to include vCenter and pick the hosts to analyze.
- When the analysis is complete, you can view the report. A new tab will open with the report. You can also choose to Save the report to an HTML file. New to version 2.0.5 is an option to delete old reports.
Here is an example of a detected issue:
DIAGNOSTICS.Storage.KB67667:
Memory allocation failure for "smartpqi" driver can result in host not responding state. KB Number: 67667. Resolution: This issue is fixed with the version 1.0.3 of driver "smartpqi". Drivers prior to version 1.0.3 can work with memory allocated within 4GB Range. If there is not free memory to allocate available below 4GB range, driver related operations will fail. Please read KB: https://kb.vmware.com/s/article/67667 for more details/resolution. Fix Available In: smartpqi - 1.0.3
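If you save findings like this one as text, a few lines of Python can pull out the KB numbers for follow-up. This is only a rough sketch: the function name `kb_references` is made up for illustration, and the patterns are based solely on the wording of the finding above, not on any documented report format.

```python
import re

# Text copied from the example finding above (shortened).
FINDING = (
    'Memory allocation failure for "smartpqi" driver can result in host not '
    "responding state. KB Number: 67667. Please read KB: "
    "https://kb.vmware.com/s/article/67667 for more details/resolution."
)


def kb_references(text: str) -> list[str]:
    """Return the unique KB numbers mentioned in a finding, sorted numerically."""
    # Assumption: KBs appear as "KB Number: N" or as kb.vmware.com article URLs.
    numbers = set(re.findall(r"KB Number:\s*(\d+)", text))
    numbers.update(re.findall(r"kb\.vmware\.com/s/article/(\d+)", text))
    return sorted(numbers, key=int)


print(kb_references(FINDING))
```

Both patterns point at the same article here, so the set collapses them into a single entry.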
Here are a couple of examples of driver version issues:
[WARNING] Current Driver i40en-1.8.1.9-2vmw.670.3.73.14320388 is part of the supported list. But not a recent one. Recent as per VCG: 1.10.9.0
[WARNING] Driver Version smartpqi-1.0.1.553-28vmw.670.3.73.14320388 is lower than recommended by VCG. Minimum: 1.0.2.1028
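To see why both warnings fire, it helps to compare the versions as numbers rather than as text. The sketch below is my own illustration, not how the appliance does it internally. It assumes driver strings follow the name-version-build pattern shown in the warnings, and the helper names (`parse_version`, `split_driver`, `is_older`) are hypothetical.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Turn a dotted version such as '1.10.9.0' into a tuple of integers."""
    return tuple(int(part) for part in version.split("."))


def split_driver(vib: str) -> tuple[str, tuple[int, ...]]:
    """Split 'name-version-build' into the driver name and its numeric version."""
    # Assumption: the build suffix contains no hyphen, so splitting from the
    # right keeps driver names that themselves contain a hyphen intact.
    name, version, _build = vib.rsplit("-", 2)
    return name, parse_version(version)


def is_older(vib: str, reference: str) -> bool:
    """True when the installed driver version sorts below the reference version."""
    _name, installed = split_driver(vib)
    return installed < parse_version(reference)


# Values taken from the two warnings above.
print(is_older("i40en-1.8.1.9-2vmw.670.3.73.14320388", "1.10.9.0"))          # True
print(is_older("smartpqi-1.0.1.553-28vmw.670.3.73.14320388", "1.0.2.1028"))  # True
```

Comparing integer tuples matters for the i40en case: as plain strings, "1.8.1.9" would sort after "1.10.9.0", which is the wrong answer.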
Full documentation is available on the VMware website.