As a DevOps engineer with years of experience optimizing Kubernetes clusters, I've come to appreciate the critical role that robust monitoring solutions play in maintaining healthy, efficient systems.
Among the myriad tools available, kube-state-metrics stands out as an indispensable component of any comprehensive Kubernetes observability stack. In this in-depth guide, I'll share my experiences and insights on leveraging this powerful exporter to enhance your Kubernetes monitoring capabilities.
Understanding kube-state-metrics
What is kube-state-metrics?
kube-state-metrics is an open-source add-on for Kubernetes that listens to the Kubernetes API server and generates metrics about the state of various Kubernetes objects. Unlike metrics-server, which provides resource usage data (CPU, memory), kube-state-metrics focuses on the health and status of Kubernetes resources.
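To make that concrete, here are a few illustrative lines of the kind of Prometheus exposition output kube-state-metrics serves on its /metrics endpoint (the metric names are real; the label values are invented for the example):
# HELP kube_pod_status_phase The pods current phase.
# TYPE kube_pod_status_phase gauge
kube_pod_status_phase{namespace="default",pod="web-7d9f8c6b5-x2k4q",phase="Running"} 1
kube_pod_status_phase{namespace="default",pod="web-7d9f8c6b5-x2k4q",phase="Pending"} 0
kube_deployment_status_replicas_available{namespace="default",deployment="web"} 3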
Key Features
- Object-focused metrics: Provides metrics for a wide range of Kubernetes objects, including Pods, Deployments, StatefulSets, and more.
- Read-only access: Only requires read-only access to the Kubernetes API, enhancing security.
- Prometheus compatibility: Exposes metrics in Prometheus format, making it easy to integrate with existing monitoring stacks.
- Custom resource support: Can be extended to monitor custom resources (CRDs).
Why kube-state-metrics Matters
In my years working with Kubernetes clusters, I've found that while tools like kubelet and metrics-server provide crucial CPU and memory usage data, they don't give the full picture.
kube-state-metrics fills this gap by offering insights into the state of deployments, pods, nodes, and other Kubernetes objects.
For example, while metrics-server might tell you that a node is using 80% of its CPU, kube-state-metrics can tell you how many pods are running on that node, how many are in a failed state, or whether there are any pending pods due to resource constraints.
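As a quick illustration, here are a couple of PromQL expressions along those lines (they assume Prometheus is already scraping kube-state-metrics, which we'll set up later):
# Pods stuck in Pending, per namespace
sum(kube_pod_status_phase{phase="Pending"}) by (namespace)
# How many pods are scheduled on each node
count(kube_pod_info) by (node)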
Setting Up kube-state-metrics
Let's dive into the process of setting up kube-state-metrics in a Kubernetes cluster:
Installation Options
There are several ways to install kube-state-metrics:
- Using Helm (recommended)
- Manual installation using YAML manifests
- Building from source
I prefer using Helm, as it simplifies the process and manages updates efficiently.
Installation using Helm
First, add the Prometheus community repo:
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
Then, install kube-state-metrics:
helm install kube-state-metrics prometheus-community/kube-state-metrics
For more customization options, you can create a values.yaml file:
replicas: 2
autosharding:
  enabled: true
podSecurityPolicy:
  enabled: true
Then install with:
helm install kube-state-metrics prometheus-community/kube-state-metrics -f values.yaml
Manual Installation
If you prefer manual installation or need more control over the deployment, you can find the YAML manifests in the official kube-state-metrics GitHub repo.
Clone the repository:
git clone https://github.com/kubernetes/kube-state-metrics.git
cd kube-state-metrics
Then apply the manifests:
kubectl apply -f examples/standard
Configuration and RBAC
After installation, kube-state-metrics will create a ServiceAccount, ClusterRole, and ClusterRoleBinding to ensure it has the necessary permissions to access the Kubernetes API server.
Here's a snippet of the ClusterRole YAML:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kube-state-metrics
rules:
  - apiGroups: [""]
    resources:
      - configmaps
      - secrets
      - nodes
      - pods
      - services
      - resourcequotas
      - replicationcontrollers
      - limitranges
      - persistentvolumeclaims
      - persistentvolumes
      - namespaces
      - endpoints
    verbs: ["list", "watch"]
  - apiGroups: ["apps"]
    resources:
      - statefulsets
      - daemonsets
      - deployments
      - replicasets
    verbs: ["list", "watch"]
  # ... more rules ...
This RBAC configuration ensures that kube-state-metrics has read-only access to the necessary Kubernetes resources.
To verify the installation, run:
kubectl get pods --all-namespaces | grep kube-state-metrics
You should see the kube-state-metrics pod(s) running.
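You can also confirm that metrics are actually being served. Assuming the chart created a Service named kube-state-metrics listening on port 8080 (adjust the name and namespace to your release):
# forward the service port locally, then fetch a sample of the metrics
kubectl port-forward svc/kube-state-metrics 8080:8080
curl -s http://localhost:8080/metrics | head -n 20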
Integrating with Prometheus
Now that we have kube-state-metrics running, let's integrate it with Prometheus to start collecting and visualizing our metrics.
Configuring Prometheus
Add the following job to your Prometheus config:
- job_name: 'kube-state-metrics'
  kubernetes_sd_configs:
    - role: endpoints
  relabel_configs:
    - source_labels: [__meta_kubernetes_service_name]
      regex: kube-state-metrics
      action: keep
    - action: labelmap
      regex: __meta_kubernetes_service_label_(.+)
    - source_labels: [__meta_kubernetes_namespace]
      action: replace
      target_label: kubernetes_namespace
    - source_labels: [__meta_kubernetes_service_name]
      action: replace
      target_label: kubernetes_name
Restart Prometheus to apply the changes.
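If you run the Prometheus Operator instead of a hand-maintained config, the equivalent is a ServiceMonitor. Here's a minimal sketch, assuming the Service carries the Helm chart's default app.kubernetes.io/name label and exposes a port named http (verify both against your installation):
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: kube-state-metrics
  labels:
    release: prometheus            # must match your Prometheus' serviceMonitorSelector
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: kube-state-metrics
  endpoints:
    - port: http                   # adjust if your Service names the port differently
      interval: 30s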
Querying kube-state-metrics in Prometheus
Once Prometheus is scraping kube-state-metrics, you can start querying the metrics. Here are some useful queries:
- Number of pods by namespace:
sum(kube_pod_info) by (namespace)
- Pods not in Running state:
sum(kube_pod_status_phase{phase!="Running"}) by (namespace, phase)
- Deployments not at the desired number of replicas:
kube_deployment_spec_replicas != kube_deployment_status_replicas_available
- Persistent volumes by phase:
sum(kube_persistentvolume_status_phase) by (phase)
Real-World Use Cases
Let's explore some practical use cases where I've found kube-state-metrics to be invaluable:
1. Monitoring Pod Health
One of the most common issues I've encountered is pods stuck in a crash loop. kube-state-metrics makes it easy to track pod restarts:
sum(kube_pod_container_status_restarts_total) by (pod)
This query helps identify problematic pods quickly, saving precious debugging time. I once used this to identify a memory leak in a Java application that was causing frequent restarts.
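Because the restart counter only ever grows, I usually look at its rate over a recent window to spot pods that are restarting right now (an illustrative expression):
# Pods whose containers restarted within the last 15 minutes
sum(rate(kube_pod_container_status_restarts_total[15m])) by (namespace, pod) > 0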
2. Resource Quota Management
In multi-tenant clusters, managing resource quotas is crucial. kube-state-metrics provides metrics like:
kube_resourcequota
This allows us to track resource usage against quotas, helping prevent resource starvation; I've seen unmonitored quota exhaustion bring an entire namespace's workloads to a halt.
Example query to check CPU requests against the namespace quota (v2 metric names; in kube-state-metrics v1.x the first metric was kube_pod_container_resource_requests_cpu_cores):
sum(kube_pod_container_resource_requests{resource="cpu"}) by (namespace) /
sum(kube_resourcequota{type="hard", resource="requests.cpu"}) by (namespace)
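Since kube_resourcequota reports both the hard limit and the current usage under a type label, you can also compare them within the one metric; a sketch:
# Fraction of each quota consumed, per namespace and resource
kube_resourcequota{type="used"}
  / ignoring(type)
kube_resourcequota{type="hard"}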
3. Persistent Volume Monitoring
I once dealt with a critical outage caused by running out of persistent volume space. kube-state-metrics could have helped prevent this with metrics like:
kube_persistentvolume_status_phase
This metric lets you track the lifecycle phase of your persistent volumes (Pending, Bound, Released, Failed) and alert on volumes stuck in an unhealthy state; for alerts on volumes nearing capacity, you'd combine kube-state-metrics data with the kubelet's volume stats.
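Two illustrative expressions along those lines; the second assumes Prometheus also scrapes the kubelet's kubelet_volume_stats_* metrics, which kube-state-metrics itself does not expose:
# Persistent volumes stuck in the Failed phase
kube_persistentvolume_status_phase{phase="Failed"} > 0
# PVCs more than 85% full (kubelet metrics required)
kubelet_volume_stats_used_bytes / kubelet_volume_stats_capacity_bytes > 0.85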
4. Deployment Tracking
Tracking the success of rolling updates is crucial. kube-state-metrics provides detailed deployment metrics:
kube_deployment_status_replicas_available
kube_deployment_status_replicas_unavailable
These metrics have helped me quickly identify failed deployments and roll back when necessary.
For example, you can set up an alert for when available replicas don't match the desired state:
kube_deployment_status_replicas_available != kube_deployment_spec_replicas
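Another useful signal is whether the controller has even observed the latest spec yet; comparing generations catches rollouts that are stuck before any replica counts change (illustrative):
# Deployments whose newest spec has not been picked up by the controller
kube_deployment_metadata_generation != kube_deployment_status_observed_generation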
5. Node Capacity Planning
kube-state-metrics provides valuable data for capacity planning:
sum(kube_node_status_capacity{resource="cpu"}) - sum(kube_node_status_allocatable{resource="cpu"})
This query shows the difference between total CPU capacity and allocatable CPU across the cluster (in kube-state-metrics v1.x these metrics were named kube_node_status_capacity_cpu_cores and kube_node_status_allocatable_cpu_cores), helping identify system overhead and plan for cluster scaling.
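On the demand side, comparing what workloads have requested against what the nodes can actually schedule gives a quick saturation signal (a sketch using the same v2 metric names):
# Cluster-wide CPU requests as a fraction of allocatable CPU
sum(kube_pod_container_resource_requests{resource="cpu"})
  / sum(kube_node_status_allocatable{resource="cpu"})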
Advanced Topics and Best Practices
As you become more comfortable with kube-state-metrics, consider these advanced topics and best practices:
1. Custom Resources
kube-state-metrics supports custom resources. I've used this to monitor application-specific CRDs, providing valuable business metrics.
How you enable custom resource metrics depends on your version: recent releases (v2.5+) can load a custom resource state configuration directly via the --custom-resource-state-config-file flag, while older setups required building kube-state-metrics from source. Here's an example Dockerfile for a from-source build:
FROM golang:1.17 as builder
WORKDIR /go/src/k8s.io/kube-state-metrics
COPY . .
RUN make build-local
FROM gcr.io/distroless/static:nonroot
COPY --from=builder /go/src/k8s.io/kube-state-metrics/kube-state-metrics .
USER nonroot:nonroot
ENTRYPOINT ["./kube-state-metrics", "--custom-resource-state-config-file=/path/to/cr-config.yaml"]
The cr-config.yaml file would look something like this:
spec:
  resources:
    - groupVersionKind:
        group: "myapp.com"
        kind: "MyCustomResource"
        version: "v1"
      metrics:
        - name: "mycustomresource_status_phase"
          help: "The current phase of the custom resource"
          each:
            type: Gauge
            gauge:
              valueFrom:
                path: "{.status.phase}"
2. High Availability
For critical environments, consider running multiple replicas of kube-state-metrics using a StatefulSet for high availability. Here's an example configuration:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: kube-state-metrics
spec:
  serviceName: "kube-state-metrics"
  replicas: 2
  selector:
    matchLabels:
      app: kube-state-metrics
  template:
    metadata:
      labels:
        app: kube-state-metrics
    spec:
      serviceAccountName: kube-state-metrics  # reuse the ServiceAccount created during installation
      containers:
        - name: kube-state-metrics
          image: k8s.gcr.io/kube-state-metrics/kube-state-metrics:v2.1.0
          ports:
            - containerPort: 8080
              name: http-metrics
            - containerPort: 8081
              name: telemetry
          readinessProbe:
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 5
            timeoutSeconds: 5
3. Resource Limits
Be sure to set appropriate CPU and memory limits for kube-state-metrics. I've seen it consume significant resources in large clusters. Here's an example of setting resource limits:
resources:
  limits:
    cpu: 100m
    memory: 150Mi
  requests:
    cpu: 100m
    memory: 150Mi
4. Metric Relabeling
Use Prometheus' relabeling features to enrich or filter your metrics. Keep in mind that relabel_configs runs before the scrape and is the only place the __meta_kubernetes_* discovery labels exist, while metric_relabel_configs runs after the scrape and is best used to keep, drop, or rewrite the resulting series. For example, to copy the target's Kubernetes service labels onto the series and then keep only a single metric:
relabel_configs:
  - action: labelmap
    regex: __meta_kubernetes_service_label_(.+)
metric_relabel_configs:
  - source_labels: [__name__]
    regex: 'kube_pod_container_status_running'
    action: keep
5. Grafana Dashboards
Create comprehensive Grafana dashboards using kube-state-metrics. The kube-state-metrics GitHub repo has some great examples to get you started.
Here's a simple Grafana dashboard query to show the number of pods per namespace:
sum(kube_pod_info) by (namespace)
For more complex dashboards, you can combine kube-state-metrics with other data sources. For example, to show CPU usage vs. requests:
sum(rate(container_cpu_usage_seconds_total[5m])) by (namespace) /
sum(kube_pod_container_resource_requests{resource="cpu"}) by (namespace)
6. Alerting
Set up alerting rules in Prometheus to proactively notify you of potential issues. Here's an example alert for pods in a non-running state:
groups:
  - name: kubernetes-apps
    rules:
      - alert: KubePodNotReady
        expr: sum by (namespace, pod) (kube_pod_status_phase{phase=~"Pending|Unknown"}) > 0
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "Pod {{ $labels.pod }} in namespace {{ $labels.namespace }} is not ready"
          description: "Pod {{ $labels.pod }} has been in a non-ready state for more than 15 minutes."
7. Performance Tuning
For large clusters, you might need to tune kube-state-metrics for better performance.
Some options include (flag names vary between releases, so check kube-state-metrics --help for your version):
- Limiting collection to the resource types you actually need:
--resources=pods,deployments,nodes
- Restricting collection to specific namespaces:
--namespaces=production,staging
- Using metric allow-lists to reduce the number of metrics exposed:
--metric-allowlist=kube_pod_status_phase,kube_deployment_status_replicas
- Sharding the work across multiple instances with --shard and --total-shards, or the Helm chart's autosharding option.
These optimizations can significantly improve the runtime performance of kube-state-metrics, especially in large Kubernetes environments.
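If you installed via the Helm chart, you can usually pass these flags without editing manifests by hand; a minimal values.yaml sketch, assuming your chart version supports the extraArgs value (check the chart's values.yaml for the exact key):
# values.yaml for prometheus-community/kube-state-metrics (key name may vary by chart version)
extraArgs:
  - --resources=pods,deployments,nodes
  - --metric-allowlist=kube_pod_status_phase,kube_deployment_status_replicas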
8. Integration with Cloud Providers
When running Kubernetes on cloud providers like AWS, you can leverage kube-state-metrics to monitor cloud-specific resources. For example, you can track the number of load balancers created by your Ingress controllers:
sum(kube_service_spec_type{type="LoadBalancer"}) by (namespace)
This query helps you keep track of your AWS resources and associated costs.
9. Container Image Management
kube-state-metrics can help you track the usage of container images across your cluster:
sum(kube_pod_container_info) by (image)
This is particularly useful for ensuring that all pods are running the expected Docker images and versions.
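For example, to find containers still running an image tag you're trying to retire (the tag here is just a placeholder):
# Containers whose image matches a deprecated tag
count(kube_pod_container_info{image=~".*:v1[.]0[.]0"}) by (namespace, image)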
Troubleshooting Common Issues
Even with careful setup, you might encounter some issues. Here are some common problems and their solutions:
- High CPU Usage: If kube-state-metrics is consuming too much CPU, consider using metric allow-lists, limiting the collected resources or namespaces, or sharding across multiple instances.
- Memory Leaks: Ensure you're using the latest version of kube-state-metrics. Earlier versions had memory leak issues that have since been resolved.
- Missing Metrics: Check the RBAC permissions. kube-state-metrics might not have access to all namespaces or resources.
- Inconsistent Metrics: This can happen in large clusters because the Kubernetes API, and therefore kube-state-metrics' watch caches, are eventually consistent. Brief discrepancies usually resolve on their own; persistent ones often point to a resource-starved kube-state-metrics instance.
- Ingress Metrics Missing: If you're not seeing metrics for Ingress resources, make sure you're using a compatible Ingress controller and that kube-state-metrics has permissions to access Ingress resources.
- Authentication Issues: If you're experiencing authentication problems, ensure that the ServiceAccount associated with kube-state-metrics has the correct RBAC permissions. You may need to adjust the ClusterRole or Role bindings.
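For the permission-related issues above, a quick way to confirm what the ServiceAccount can actually do is to impersonate it with kubectl auth can-i, and to check the pod's own logs (adjust the namespace and ServiceAccount name to match your installation):
# can the kube-state-metrics ServiceAccount list the resources it needs?
kubectl auth can-i list pods --as=system:serviceaccount:kube-system:kube-state-metrics
kubectl auth can-i list ingresses.networking.k8s.io --as=system:serviceaccount:kube-system:kube-state-metrics
# check the logs for list/watch permission errors
kubectl logs deploy/kube-state-metrics -n kube-system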
Conclusion
kube-state-metrics has become an essential tool in my Kubernetes observability stack. Its ability to provide deep insights into the state of Kubernetes objects, combined with the power of Prometheus and Grafana, has significantly improved my ability to maintain and troubleshoot Kubernetes clusters.
Remember, observability is not just about collecting metrics—it's about gaining actionable insights. kube-state-metrics, when used effectively, can provide those insights and help you maintain a healthy, efficient Kubernetes environment.
As you implement kube-state-metrics in your own clusters, keep in mind that every environment is unique.
Don't be afraid to experiment with different configurations and metrics to find what works best for your specific use cases.
Happy monitoring!
We'd love to hear your experiences with reliability, observability, or monitoring! Join the conversation and share your insights with us in the SRE Discord community.
Frequently Asked Questions
What is kube-state-metrics?
kube-state-metrics is a service that listens to the Kubernetes API server and generates metrics about the state of Kubernetes objects. It provides a metrics endpoint for Prometheus to scrape, offering insights into the health and status of various Kubernetes resources.
What is the difference between node exporter and kube-state-metrics?
Node exporter focuses on collecting system-level metrics from nodes (CPU, memory, disk usage, etc.), while kube-state-metrics generates metrics about Kubernetes objects (pods, deployments, services, etc.). They complement each other to provide a complete view of your cluster's health.
How do you deploy kube-state-metrics on Kubernetes?
You can deploy kube-state-metrics using Helm charts or by applying YAML manifests directly. Here's a simple Helm command to install kube-state-metrics:
helm install kube-state-metrics prometheus-community/kube-state-metrics
How can we expose metrics to Prometheus?
kube-state-metrics exposes a metrics endpoint that Prometheus can scrape. Add a job to your Prometheus configuration like this:
- job_name: 'kube-state-metrics'
  kubernetes_sd_configs:
    - role: endpoints
  relabel_configs:
    - source_labels: [__meta_kubernetes_service_name]
      regex: kube-state-metrics
      action: keep
What metrics are provided by kube-state-metrics in Kubernetes?
kube-state-metrics provides a wide range of metrics, including but not limited to:
- Pod status and resource usage
- Deployment status and replica counts
- Node capacity and allocatable resources
- PersistentVolume and PersistentVolumeClaim status
- Service and Ingress details
How do I monitor pod status using kube-state-metrics in Kubernetes?
You can use queries like these in Prometheus:
kube_pod_status_phase{phase="Running"} # Number of running pods
kube_pod_container_status_restarts_total # Container restart count
Can you monitor Kubernetes cluster status using kube-state-metrics?
Yes, kube-state-metrics provides various metrics that give insights into cluster status, such as node conditions, resource usage, and the status of core Kubernetes components.
For example:
kube_node_status_condition{condition="Ready", status="true"} # Number of ready nodes
How many load balancer services do we have and what are their IPs?
You can use a query like this to get the count and IPs of LoadBalancer services:
kube_service_spec_type{type="LoadBalancer"}
kube_service_status_load_balancer_ingress
These metrics will give you information about the number of LoadBalancer services and their assigned IPs.