vCenter Integration with KloudMate
In today’s dynamic IT environments, effective monitoring and management of virtualized resources are crucial for maintaining performance and reliability. This document outlines the integration of vCenter with OpenTelemetry (OTel), facilitating the collection and transmission of metrics from VMware vSphere environments, including both vCenter and ESXi hosts.
By leveraging the vSphere APIs, our integration allows seamless fetching of vital metrics, which are then forwarded to KloudMate. This ensures that teams have access to real-time insights and analytics, enabling proactive management of virtual infrastructure. Through this integration, organizations can enhance visibility into their virtualized resources, optimize performance, and streamline troubleshooting processes, ultimately driving improved operational efficiency.
Prerequisites:
- vCenter must be running version 7.0 or 8.
- User credentials: the receiver needs a “Read Only” vSphere user with permissions on the vCenter server, the cluster, and all underlying resources being monitored. Without these permissions it cannot retrieve information about them.
- Endpoint: the URL of the vCenter Server or ESXi host, which must have the SDK path enabled.
- Install the OpenTelemetry Collector on a server that has network access to vCenter. See Installing and Configuring OpenTelemetry Collector.
Step 1: Configure the receiver to scrape metrics.
To start monitoring VMware with the OTel Collector, configure the vcenter receiver.
- Linux Users: Open the file located at /etc/otelcol-contrib/config.yaml using your preferred text editor.
- Windows Users: Create a new file called config.yaml in the C:\Program Files\OpenTelemetry Collector folder. You can use Notepad or any text editor to do this.
In this configuration file, add the vcenter receiver and set it up to collect and send metrics according to your specific requirements.
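A minimal sketch of the receiver section, assuming the OpenTelemetry Collector Contrib distribution, which includes the vcenter receiver. The hostname, the username, and the VCENTER_PASSWORD environment variable are placeholders rather than values from this guide, and the collection interval is only an example.

```yaml
receivers:
  vcenter:
    # Placeholder: URL of your vCenter Server or ESXi host (scheme and hostname only)
    endpoint: https://vcenter.example.com
    # Placeholder: the "Read Only" user described in the prerequisites
    username: otel-readonly
    # Read the password from an environment variable instead of storing it in the file
    password: ${env:VCENTER_PASSWORD}
    # How often the receiver scrapes metrics (example value)
    collection_interval: 2m
    tls:
      # Keep certificate verification on; set to true only for testing
      # against a vCenter that uses an untrusted self-signed certificate
      insecure_skip_verify: false
```

Keeping the password in an environment variable means the configuration file can be shared or version-controlled without exposing credentials.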
Step 2: Set up the processor component to identify resource information and either append or replace the resource values in the telemetry data with this information.
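The behavior described in this step matches the resourcedetection processor from the Collector Contrib distribution, so the sketch below assumes that processor. The choice of the system detector and the timeout value are illustrative assumptions, not requirements of this guide.

```yaml
processors:
  resourcedetection:
    # Assumption: detect attributes such as host.name from the machine running the collector
    detectors: [system]
    # Example value: give up on detection after 5 seconds
    timeout: 5s
    # false appends detected attributes and keeps existing values;
    # true replaces existing values with the detected ones
    override: false
```

Setting override to false is the conservative choice here, because it leaves any resource attributes already attached by the receiver untouched.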
Step 3: Configure the exporter and extensions, then save the configuration.
In the exporters section of the OpenTelemetry configuration file, set KloudMate as the backend, then configure the pipeline.
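A sketch of what the exporter, extension, and pipeline sections might look like, assuming KloudMate accepts OTLP over HTTP and authenticates with an API key sent in an Authorization header. The environment variable names KM_OTLP_ENDPOINT and KM_API_KEY are hypothetical; take the real endpoint and key from your KloudMate workspace. The pipeline assumes a receiver named vcenter (Step 1) and a processor named resourcedetection (Step 2).

```yaml
extensions:
  # Optional: exposes a health endpoint for the collector itself
  health_check:

exporters:
  otlphttp:
    # Hypothetical variable names: supply your KloudMate endpoint and API key
    endpoint: ${env:KM_OTLP_ENDPOINT}
    headers:
      Authorization: ${env:KM_API_KEY}

service:
  extensions: [health_check]
  pipelines:
    metrics:
      # These names must match the components defined in Steps 1 and 2
      receivers: [vcenter]
      processors: [resourcedetection]
      exporters: [otlphttp]
```

A component that is defined but not listed in a pipeline is never used, so check that every name under service matches a section defined earlier in the file.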
Step 4: Restart the OpenTelemetry (OTel) Collector and verify its status.
For Linux:
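- Restart the collector service and then check its status. Assuming the collector runs as the otelcol-contrib systemd service (as the configuration path in Step 1 suggests), run sudo systemctl restart otelcol-contrib followed by sudo systemctl status otelcol-contrib.
These commands restart the OTel Collector and display its current status.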
For Windows:
- Open the Services window:
  - Press Win + R, type services.msc, and click OK.
  - Alternatively, search for "Services" in the Windows Start menu.
- In the Services window, locate the "OpenTelemetry Collector" service.
- Right-click the service and select "Restart."
Once the collector is running, monitor the metrics on the KloudMate dashboard and set up alarms to be notified when key metrics for a specific application rise above expected levels.
Metrics | Description |
---|---|
vcenter_datacenter_cluster_count | The number of clusters in the datacenter. |
vcenter_datacenter_host_count | The number of hosts in the datacenter. |
vcenter_datacenter_vm_count | The number of VMs in the datacenter. |
vcenter_datacenter_datastore_count | The number of datastores in the datacenter. |
vcenter_datacenter_disk_space | The amount of available and used disk space in the datacenter. |
vcenter_datacenter_cpu_limit | The total amount of CPU available to the datacenter. |
vcenter_datacenter_memory_limit | The total amount of memory available to the datacenter. |
vcenter_cluster_cpu_limit | The amount of CPU available to the cluster. |
vcenter_cluster_cpu_effective | The effective CPU available to the cluster. This value excludes CPU from hosts that are in maintenance mode or unresponsive. |
vcenter_cluster_memory_limit | The available memory of the cluster. |
vcenter_cluster_memory_effective | The effective available memory of the cluster. |
vcenter_cluster_vm_count | The number of virtual machines in the cluster. |
vcenter_cluster_vm_template_count | The number of virtual machine templates in the cluster. |
vcenter_cluster_host_count | The number of hosts in the cluster. |
vcenter_cluster_vsan_throughput | The vSAN throughput of a cluster. |
vcenter_cluster_vsan_operations | The vSAN IOPS of a cluster. |
vcenter_cluster_vsan_latency_avg | The overall cluster latency while accessing vSAN storage. |
vcenter_cluster_vsan_congestions | The congestion of IOs generated by all vSAN clients in the cluster. |
vcenter_datastore_disk_usage | The amount of space in the datastore. |
vcenter_datastore_disk_utilization | The utilization of the datastore. |
vcenter_host_cpu_utilization | The CPU utilization of the host system. |
vcenter_host_cpu_usage | The amount of CPU used by the host. |
vcenter_host_cpu_capacity | Total CPU capacity of the host system. |
vcenter_host_cpu_reserved | The CPU of the host reserved for use by virtual machines. |
vcenter_host_disk_throughput | Average number of kilobytes read from or written to the disk each second. |
vcenter_host_disk_latency_avg | The latency of operations to the host system's disk. |
vcenter_host_disk_latency_max | Highest latency value across all disks used by the host. |
vcenter_host_memory_utilization | The percentage of the host system's memory capacity that is being utilized. |
vcenter_host_memory_usage | The amount of memory the host system is using. |
vcenter_host_network_throughput | The amount of data that was transmitted or received over the network by the host. |
vcenter_host_network_usage | The sum of the data transmitted and received for all the NIC instances of the host. |
vcenter_host_network_packet_error_rate | The rate of packet errors transmitted or received on the host network. |
vcenter_host_network_packet_rate | The rate of packets transmitted or received across each physical NIC (network interface controller) instance on the host. |
vcenter_host_network_packet_drop_rate | The rate of packets dropped across each physical NIC (network interface controller) instance on the host. |
vcenter_host_vsan_throughput | The vSAN throughput of a host. |
vcenter_host_vsan_operations | The vSAN IOPS of a host. |
vcenter_host_vsan_latency_avg | The host latency while accessing vSAN storage. |
vcenter_host_vsan_congestions | The congestion of IOs generated by all vSAN clients in the host. |
vcenter_host_vsan_cache_hit_rate | The host's read IOs which could be satisfied by the local client cache. |
vcenter_resource_pool_memory_usage | The usage of the memory by the resource pool. |
vcenter_resource_pool_memory_shares | The amount of shares of memory in the resource pool. |
vcenter_resource_pool_memory_swapped | The amount of memory that is granted to VMs in the resource pool from the host's swap space. |
vcenter_resource_pool_memory_ballooned | The amount of memory in a resource pool that is ballooned due to virtualization. |
vcenter_resource_pool_memory_granted | The amount of memory that is granted to VMs in the resource pool from shared and non-shared host memory. |
vcenter_resource_pool_cpu_usage | The CPU usage of the resource pool. |
vcenter_resource_pool_cpu_shares | The amount of shares of CPU in the resource pool. |
vcenter_vm_memory_ballooned | The amount of memory that is ballooned due to virtualization. |
vcenter_vm_memory_usage | The amount of memory that is used by the virtual machine. |
vcenter_vm_memory_swapped | The portion of memory that is granted to this VM from the host's swap space. |
vcenter_vm_memory_swapped_ssd | The amount of memory swapped to a fast disk device such as an SSD. |
vcenter_vm_disk_usage | The amount of storage space used by the virtual machine. |
vcenter_vm_disk_utilization | The utilization of storage on the virtual machine. |
vcenter_vm_disk_latency_avg | The latency of operations to the virtual machine's disk. |
vcenter_vm_disk_latency_max | The highest reported total latency (device and kernel times) over an interval of 20 seconds. |
vcenter_vm_disk_throughput | Average number of kilobytes read from or written to the virtual disk each second. |
vcenter_vm_network_throughput | The amount of data that was transmitted or received over the network of the virtual machine. |
vcenter_vm_network_packet_rate | The rate of packets transmitted or received by each vNIC (virtual network interface controller) on the virtual machine. |
vcenter_vm_network_packet_drop_rate | The rate of transmitted or received packets dropped by each vNIC (virtual network interface controller) on the virtual machine. |
vcenter_vm_network_usage | The combined transmit and receive network rates during an interval. |
vcenter_vm_cpu_utilization | The CPU utilization of the VM. |
vcenter_vm_cpu_usage | The amount of CPU used by the VM. |
vcenter_vm_cpu_readiness | Percentage of time that the virtual machine was ready, but could not get scheduled to run on the physical CPU. |
vcenter_vm_memory_utilization | The memory utilization of the VM. |
vcenter_vm_vsan_throughput | The vSAN throughput of a virtual machine. |
vcenter_vm_vsan_operations | The vSAN IOPS of a virtual machine. |
vcenter_vm_vsan_latency_avg | The virtual machine latency while accessing vSAN storage. |
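
The table lists metric names with underscores. In the collector configuration, the vcenter receiver's metrics section uses the dotted form of the same names (for example, vcenter.vm.memory.swapped_ssd). The fragment below is a sketch of how an individual metric could be switched off; the metric chosen is only an illustration, and the fragment is meant to be merged into the existing vcenter receiver entry rather than used on its own.

```yaml
receivers:
  vcenter:
    metrics:
      # Example: stop collecting a metric you do not need
      vcenter.vm.memory.swapped_ssd:
        enabled: false
```

Disabling metrics you do not chart or alert on reduces the volume of data sent to the backend.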