Kubernetes Monitoring
Kubernetes Monitoring in KloudMate gives you visibility into cluster health, workload availability, resource usage, pod behavior, and related telemetry collected by the KloudMate Agent.
KloudMate organizes Kubernetes data into views for clusters, nodes, namespaces, pods, and workloads so you can move from fleet-level health to pod-level troubleshooting quickly.
What This Covers
- cluster, node, namespace, pod, and workload metrics
- container logs and Kubernetes events
- optional application telemetry through Kubernetes auto-instrumentation
- prebuilt dashboards and alerting workflows built on top of the collected data
Install the Kubernetes Agent
Use the Kubernetes Agent installation guide to deploy the KloudMate Agent into your cluster.
The same installation flow supports self-managed Kubernetes and managed services such as AKS, EKS, and GKE Standard. For GKE Autopilot, see the Monitoring GKE Autopilot Clusters guide.
What You Can Do After Installation
- detect performance issues early
- track CPU and memory utilization across nodes, pods, and workloads
- identify unhealthy or unstable workloads
- investigate incidents with logs, events, and metrics together
- build dashboards and alerts around cluster health
Once data is flowing, use:
- Explore for ad hoc queries
- Log Explorer to investigate pod logs
- Dashboards for curated visualizations
- Alerts to notify on resource or availability issues
Unified Monitoring for Apps and Databases
When applications and databases run inside Kubernetes, KloudMate can correlate their telemetry with infrastructure metrics from the same cluster.
Use these guides when you need deeper service-level visibility:
What Kubernetes Monitoring Covers
KloudMate provides visibility into all major Kubernetes components:
- Clusters: overall health, capacity, and scale
- Nodes: infrastructure-level resource usage and node health
- Namespaces: resource distribution across teams, applications, or environments
- Pods: pod performance, restarts, and resource consumption
- Workloads: health and availability of deployments, daemonsets, and statefulsets
Validation Checklist
After the agent is installed:
- open the Kubernetes module in KloudMate
- confirm that clusters and nodes appear within a few minutes
- verify CPU and memory metrics for nodes, pods, and workloads
- validate logs and events in the related views if those features are enabled
KloudMate also provides prebuilt dashboard templates for common Kubernetes views. See Create a Dashboard for template usage.
Common Kubernetes Metrics
When Kubernetes Monitoring is enabled, KloudMate automatically collects a set of common Kubernetes metrics. No additional configuration is required.
Cluster and Workload Metrics
- k8s_container_cpu_limit
- k8s_container_cpu_request
- k8s_container_memory_limit
- k8s_container_memory_request
- k8s_container_ready
- k8s_container_restarts
- k8s_deployment_available
- k8s_deployment_desired
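As an illustration of how two of these metrics combine, the sketch below derives an availability ratio from k8s_deployment_available and k8s_deployment_desired and lists deployments that fall short. It is plain Python arithmetic over values you might read from Explore, not KloudMate's query API. The `DeploymentSample` type, the function names, and the sample values are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class DeploymentSample:
    """One hypothetical reading of the two deployment metrics."""
    name: str
    available: int  # value of k8s_deployment_available
    desired: int    # value of k8s_deployment_desired


def availability_ratio(sample: DeploymentSample) -> float:
    """Fraction of desired pods that are available (1.0 means fully available)."""
    if sample.desired == 0:
        # A deployment scaled to zero has nothing that can be unavailable.
        return 1.0
    return sample.available / sample.desired


def degraded(samples: list[DeploymentSample], threshold: float = 1.0) -> list[str]:
    """Names of deployments whose availability ratio is below the threshold."""
    return [s.name for s in samples if availability_ratio(s) < threshold]


# Made-up sample values, for illustration only.
samples = [
    DeploymentSample("checkout", available=3, desired=3),
    DeploymentSample("search", available=1, desired=4),
    DeploymentSample("batch", available=0, desired=0),
]

print(degraded(samples))  # ['search']
```

Lowering `threshold` (for example to 0.5) would tolerate partial rollouts and flag only deployments that have lost more than half of their desired pods.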
Node and Container Metrics
- container_cpu_time
- container_cpu_usage
- container_memory_usage
- container_memory_working_set
- k8s_node_cpu_usage
- k8s_node_memory_usage
- k8s_node_network_io
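A common use of the usage metrics is to compare them with the limit metrics from the previous list. The sketch below computes CPU and memory utilization as a fraction of the configured limit. It assumes CPU values are expressed in cores and memory values in bytes, and that each pair of values belongs to the same container; confirm the units in your own data before relying on the result. The `utilization` function and the sample numbers are hypothetical.

```python
from typing import Optional


def utilization(usage: float, limit: Optional[float]) -> Optional[float]:
    """Return usage as a fraction of limit, or None when no limit is set."""
    if limit is None or limit <= 0:
        return None
    return usage / limit


# Made-up readings for one container (assumed units: cores and bytes).
cpu_usage = 0.35                     # container_cpu_usage
cpu_limit = 0.5                      # k8s_container_cpu_limit
memory_working_set = 384 * 1024**2   # container_memory_working_set
memory_limit = 512 * 1024**2         # k8s_container_memory_limit

cpu_fraction = utilization(cpu_usage, cpu_limit)
memory_fraction = utilization(memory_working_set, memory_limit)

print(f"CPU: {cpu_fraction:.0%} of limit")        # CPU: 70% of limit
print(f"Memory: {memory_fraction:.0%} of limit")  # Memory: 75% of limit
```

Returning `None` for a missing limit keeps containers without limits out of percentage-based alerts instead of reporting a misleading value.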
These metrics are sourced from the upstream OpenTelemetry receivers: