Use this guide when you want to collect GKE Autopilot metrics with the OpenTelemetry Collector. Because GKE Autopilot automatically manages nodes and restricts certain privileged workloads, the standard KloudMate Agent installation cannot be used.
This guide demonstrates how to use a dedicated OpenTelemetry Collector deployment to pull Autopilot telemetry directly from Google Cloud Monitoring into KloudMate.
Before you begin, make sure you have:
- A running GKE Autopilot cluster.
- A GCP service account with the Monitoring Viewer role (roles/monitoring.viewer).
- The following CLIs installed and configured: kubectl and gcloud.
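A quick way to confirm the tooling is in place is to check that each CLI resolves on your PATH. The sketch below assumes the CLIs in question are kubectl and gcloud, since those are the two this guide invokes; `require_cli` is a hypothetical helper name, not part of either tool.

```shell
# require_cli: succeed if the named command is on PATH,
# otherwise print a hint to stderr and return non-zero.
# (Hypothetical helper, for illustration only.)
require_cli() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: not found on PATH" >&2
    return 1
  fi
}

# Check both CLIs and remember whether any is missing.
missing=0
for cli in kubectl gcloud; do
  require_cli "$cli" || missing=1
done
```

This only checks that the binaries exist. It does not verify that kubectl points at the right cluster or that gcloud is authenticated against the right project.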
Create a file named service-account.yaml:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: otel-collector
  namespace: default
Apply to your cluster:
kubectl apply -f service-account.yaml
Annotate the Kubernetes service account to link it with your GCP service account. Replace <gcp-service-account> and <project> with your values:
kubectl annotate serviceaccount otel-collector \
  --namespace default \
  iam.gke.io/gcp-service-account=<gcp-service-account>@<project>.iam.gserviceaccount.com \
  --overwrite
Then allow the Kubernetes service account to impersonate the GCP service account through Workload Identity:
gcloud iam service-accounts add-iam-policy-binding \
  <gcp-service-account>@<project>.iam.gserviceaccount.com \
  --member="serviceAccount:<project>.svc.id.goog[default/otel-collector]" \
  --role="roles/iam.workloadIdentityUser" \
  --project=<project>
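To confirm the link took effect, you can read the annotation back from the Kubernetes service account and compare it with the address you expect. This is a minimal sketch: `check_wi_annotation` is a hypothetical helper name, and the service account name and namespace are the ones used in this guide.

```shell
# check_wi_annotation: compare the iam.gke.io/gcp-service-account annotation
# on the otel-collector service account (namespace default) with the
# expected GCP service account email passed as $1.
# (Hypothetical helper, for illustration only.)
check_wi_annotation() {
  expected="$1"
  # Dots inside the annotation key must be escaped in kubectl's jsonpath.
  actual="$(kubectl get serviceaccount otel-collector \
    --namespace default \
    -o jsonpath='{.metadata.annotations.iam\.gke\.io/gcp-service-account}')"
  if [ "$actual" = "$expected" ]; then
    echo "annotation OK: $actual"
  else
    echo "annotation mismatch: expected '$expected', got '$actual'" >&2
    return 1
  fi
}
```

Call it with the same address you used in the annotate command, for example `check_wi_annotation <gcp-service-account>@<project>.iam.gserviceaccount.com`. The IAM side of the binding can be inspected separately with `gcloud iam service-accounts get-iam-policy`.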
Create a file named deployment.yaml with the following content, replacing the placeholders <PROJECT_ID>, <API_KEY>, and <CLUSTER_NAME> with your values:
apiVersion: v1
kind: ConfigMap
metadata:
  name: otel-collector
  namespace: default
  labels:
    app: opentelemetry
    component: otel-collector
data:
  config.yaml: |
    receivers:
      googlecloudmonitoring:
        collection_interval: 60s
        project_id: <PROJECT_ID>
        metrics_list:
          - metric_descriptor_filter: 'metric.type = starts_with("kubernetes.io/")'
          - metric_descriptor_filter: 'metric.type = starts_with("container.googleapis.com")'
          - metric_descriptor_filter: 'metric.type = starts_with("gke.io")'
          - metric_descriptor_filter: 'metric.type = starts_with("prometheus.googleapis.com")'
    processors:
      resourcedetection:
        detectors: [env, system]
        timeout: 5s
        override: false
      attributes/metrics:
        actions:
          - key: cluster
            value: "<CLUSTER_NAME>"
            action: insert
      batch:
        send_batch_size: 10000
        timeout: 60s
      memory_limiter:
        check_interval: 1s
        limit_mib: 400
        spike_limit_mib: 100
    exporters:
      debug:
        verbosity: detailed
      otlphttp:
        endpoint: "https://otel.kloudmate.com:4318"
        headers:
          Authorization: <API_KEY>
    service:
      pipelines:
        metrics:
          receivers: [googlecloudmonitoring]
          processors: [memory_limiter, batch, resourcedetection, attributes/metrics]
          exporters: [otlphttp, debug]
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: otel-collector
  namespace: default
spec:
  selector:
    matchLabels:
      app: otel-collector
  template:
    metadata:
      labels:
        app: otel-collector
    spec:
      serviceAccountName: otel-collector
      containers:
        - name: otel-collector
          image: otel/opentelemetry-collector-contrib:latest
          args: ["--config=/etc/otel-collector-config/config.yaml"]
          env:
            - name: KUBE_NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
          volumeMounts:
            - name: config-vol
              mountPath: /etc/otel-collector-config
            - name: varlogpods
              mountPath: /var/log/pods
              readOnly: true
            - name: varlogcontainers
              mountPath: /var/log/containers
              readOnly: true
      volumes:
        - name: config-vol
          configMap:
            name: otel-collector
        - name: varlogpods
          hostPath:
            path: /var/log/pods
        - name: varlogcontainers
          hostPath:
            path: /var/log/containers
      securityContext:
        runAsUser: 0
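The three placeholders can be filled in by hand, or substituted with a small script if you keep the manifest as a template. The following is a sketch using sed; `render_manifest` is a hypothetical helper name, and it assumes none of the values contain `|`, `&`, or a backslash, which sed would treat specially.

```shell
# render_manifest: print FILE with <PROJECT_ID>, <API_KEY>, and <CLUSTER_NAME>
# replaced by the given values. Output goes to stdout; FILE is not modified.
# Uses '|' as the sed delimiter, so values must not contain '|', '&', or '\'.
# (Hypothetical helper, for illustration only.)
render_manifest() {
  file="$1"; project_id="$2"; api_key="$3"; cluster_name="$4"
  sed \
    -e "s|<PROJECT_ID>|${project_id}|g" \
    -e "s|<API_KEY>|${api_key}|g" \
    -e "s|<CLUSTER_NAME>|${cluster_name}|g" \
    "$file"
}
```

For example, `render_manifest deployment.template.yaml my-project "$KM_API_KEY" my-cluster > deployment.yaml`, where the template filename, the `KM_API_KEY` variable, and the values are all placeholders of your choosing. Reading the API key from an environment variable keeps it out of your shell history.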
Apply the deployment to your cluster with this CLI command:
kubectl apply -f deployment.yaml
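Before checking KloudMate, it can help to confirm that the collector pod actually started. Because the configuration above enables the debug exporter with detailed verbosity, received metrics should also show up in the collector logs. A minimal sketch, where `wait_for_collector` is a hypothetical helper name and the 120-second timeout and 50-line tail are arbitrary choices:

```shell
# wait_for_collector: block until the otel-collector Deployment finishes
# rolling out (or the timeout expires), then print its most recent log lines.
# (Hypothetical helper, for illustration only.)
wait_for_collector() {
  kubectl rollout status deployment/otel-collector \
    --namespace default --timeout=120s || return 1
  kubectl logs deployment/otel-collector --namespace default --tail=50
}
```

Run `wait_for_collector` after applying the manifest. If the rollout does not complete, `kubectl describe pod` on the collector pod usually shows whether Autopilot rejected part of the pod spec.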
Log in to KloudMate and verify that cluster metrics begin appearing within a few minutes. Then use Explore or dashboards to validate CPU, memory, workload, and cluster-level telemetry.