As a DevOps engineer with years of experience optimizing Kubernetes clusters, I've come to appreciate the critical role that robust monitoring solutions play in maintaining healthy, efficient systems.
Among the myriad tools available, kube-state-metrics stands out as an indispensable component of any comprehensive Kubernetes observability stack. In this in-depth guide, I'll share my experiences and insights on leveraging this powerful exporter to enhance your Kubernetes monitoring capabilities.
Understanding kube-state-metrics
What is kube-state-metrics?
kube-state-metrics is an open-source add-on for Kubernetes that listens to the Kubernetes API server and generates metrics about the state of various Kubernetes objects.
Unlike metrics-server, which provides resource usage data (CPU, memory), kube-state-metrics focuses on the health and status of Kubernetes resources.
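To make that distinction concrete, here is a minimal Python sketch of the idea. It is not kube-state-metrics' actual implementation (which is written in Go): it simply turns a snapshot of pod objects into Prometheus-style gauge lines, one series per possible phase. The `pods` list and the `render_phase_metrics` helper are hypothetical names used only for illustration.

```python
# Possible pod phases; one series is emitted per pod and phase.
PHASES = ["Pending", "Running", "Succeeded", "Failed", "Unknown"]


def render_phase_metrics(pods):
    """Emit one kube_pod_status_phase line per (pod, phase); 1 marks the current phase."""
    lines = []
    for pod in pods:
        for phase in PHASES:
            value = 1 if pod["phase"] == phase else 0
            lines.append(
                'kube_pod_status_phase{namespace="%s",pod="%s",phase="%s"} %d'
                % (pod["namespace"], pod["name"], phase, value)
            )
    return lines


# Hypothetical snapshot of object state, as it might be read from the API server.
pods = [
    {"namespace": "default", "name": "web-0", "phase": "Running"},
    {"namespace": "batch", "name": "job-1", "phase": "Failed"},
]

for line in render_phase_metrics(pods):
    print(line)
```

Note that nothing here measures CPU or memory: every value is derived from the declared or observed state of an object, which is exactly the niche kube-state-metrics fills.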
Key Features of kube-state-metrics
- Object-focused metrics: Provides metrics for a wide range of Kubernetes objects, including Pods, Deployments, StatefulSets, and more.
- Read-only access: Only requires read-only access to the Kubernetes API, enhancing security.
- Prometheus compatibility: Exposes metrics in Prometheus format, making it easy to integrate with existing monitoring stacks.
- Custom resource support: Can be extended to monitor custom resources (CRDs).
Why kube-state-metrics Matters
In my years working with Kubernetes clusters, I've found that while tools like kubelet and metrics-server provide crucial CPU and memory usage data, they don't give the full picture.
kube-state-metrics fills this gap by offering insights into the state of deployments, pods, nodes, and other Kubernetes objects.
For example, while metrics-server might tell you that a node is using 80% of its CPU, kube-state-metrics can tell you how many pods are running on that node, how many are in a failed state, or whether there are any pending pods due to resource constraints.
Setting Up kube-state-metrics
Let's walk through setting up kube-state-metrics in a Kubernetes cluster:
Installation Options
You can install kube-state-metrics with Helm or with plain YAML manifests. I prefer Helm, as it simplifies the process and manages updates efficiently.
Using Helm (Recommended)
First, add the Prometheus community repo:
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
Then, install kube-state-metrics:
helm install kube-state-metrics prometheus-community/kube-state-metrics
For more customization options, you can create a values.yaml file:
replicas: 2
autosharding:
  enabled: true
podSecurityPolicy:
  enabled: true
Then install with:
helm install kube-state-metrics prometheus-community/kube-state-metrics -f values.yaml
Manual Installation using YAML Manifests
If you prefer manual installation or need more control over the deployment, you can find the YAML manifests in the official kube-state-metrics GitHub repo.
- Clone the repository:
git clone https://github.com/kubernetes/kube-state-metrics.git
cd kube-state-metrics
- Apply the manifests:
kubectl apply -f examples/standard
Configuration and RBAC
The installation creates a ServiceAccount, ClusterRole, and ClusterRoleBinding so that kube-state-metrics has the permissions it needs to read from the Kubernetes API server.
Here's a snippet of the ClusterRole YAML:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kube-state-metrics
rules:
- apiGroups: [""]
  resources:
  - configmaps
  - secrets
  - nodes
  - pods
  - services
  - resourcequotas
  - replicationcontrollers
  - limitranges
  - persistentvolumeclaims
  - persistentvolumes
  - namespaces
  - endpoints
  verbs: ["list", "watch"]
- apiGroups: ["apps"]
  resources:
  - statefulsets
  - daemonsets
  - deployments
  - replicasets
  verbs: ["list", "watch"]
# ... more rules ...
This RBAC configuration ensures that kube-state-metrics has read-only access to the necessary Kubernetes resources.
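If you want to confirm the read-only claim for your own manifests, a small check like the following can help. It is a sketch under assumptions: the rules are represented as plain Python dicts mirroring the ClusterRole excerpt above (loading the real YAML is left out), and `writable_rules` is a hypothetical helper name.

```python
# Verbs that only read from the API server.
READ_ONLY_VERBS = {"get", "list", "watch"}


def writable_rules(rules):
    """Return the rules that grant any verb outside the read-only set."""
    return [r for r in rules if not set(r["verbs"]) <= READ_ONLY_VERBS]


# Rules mirroring the ClusterRole excerpt, written as plain dicts.
rules = [
    {"apiGroups": [""], "resources": ["pods", "nodes", "secrets"], "verbs": ["list", "watch"]},
    {"apiGroups": ["apps"], "resources": ["deployments"], "verbs": ["list", "watch"]},
]

# An empty list means every rule is read-only.
print(writable_rules(rules))
```

A wildcard verb such as "*" is also reported, because it is not a member of the read-only set.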
To verify the installation, run:
kubectl get pods -n kube-system | grep kube-state-metrics
You should see the kube-state-metrics pod(s) running. If you installed the Helm chart into a different namespace, substitute it for kube-system.
Building from Source
Building kube-state-metrics from source allows you to customize the application or use the latest features not yet released in the official binaries. Follow these steps:
Prerequisites
Before you begin, ensure you have the following installed:
- Go: Make sure you have Go installed (version 1.18 or later is recommended). You can download it from the official Go website.
- Git: You’ll need Git to clone the kube-state-metrics repository. Install Git from the official site if you don't have it already.
Steps to Build kube-state-metrics
- Clone the Repository
 Open a terminal and run the following command to clone the kube-state-metrics repository:
git clone https://github.com/kubernetes/kube-state-metrics.git
 Change into the cloned directory:
cd kube-state-metrics
- Check Out the Desired Version
 If you want to build a specific version, use the following command to check out that version:
git checkout <version-tag>
 Replace <version-tag> with the desired version (e.g., v2.0.0).
- Build the Project
 Use the following command to build the kube-state-metrics binary:
make build
 This command compiles the code and generates the kube-state-metrics binary in the ./bin directory.
- Run kube-state-metrics
 You can run kube-state-metrics directly from the terminal using the built binary:
./bin/kube-state-metrics
 By default, it listens on port 8080. You can configure additional options as needed.
- Verify the Installation
 Open your web browser or use curl to access the kube-state-metrics metrics endpoint:
curl http://localhost:8080/metrics
Integrating with Prometheus
Now that we have kube-state-metrics running, let's integrate it with Prometheus to start collecting and visualizing our metrics.
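Before wiring up the scrape job, it can help to see what the endpoint returns. The sketch below parses a few lines of Prometheus text exposition format and counts pods per namespace, roughly what sum(kube_pod_info) by (namespace) computes. The `SAMPLE` text and the helper names are illustrative assumptions, and the parser is deliberately simplified: it skips comment lines and any sample line that carries a timestamp.

```python
import re

# Illustrative sample of what /metrics might return (not real cluster output).
SAMPLE = '''\
# HELP kube_pod_info Information about pod.
# TYPE kube_pod_info gauge
kube_pod_info{namespace="default",pod="web-0",node="node-a"} 1
kube_pod_info{namespace="default",pod="web-1",node="node-b"} 1
kube_pod_info{namespace="kube-system",pod="coredns-x",node="node-a"} 1
'''

# metric_name, optional {labels}, then a single value.
LINE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
LABEL = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Yield (name, labels, value) for each sample line, skipping comments."""
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        match = LINE.match(line)
        if match is None:
            continue
        labels = dict(LABEL.findall(match.group("labels") or ""))
        yield match.group("name"), labels, float(match.group("value"))


def pods_per_namespace(text):
    """Sum kube_pod_info samples grouped by their namespace label."""
    counts = {}
    for name, labels, value in parse_metrics(text):
        if name == "kube_pod_info":
            namespace = labels.get("namespace", "")
            counts[namespace] = counts.get(namespace, 0) + value
    return counts


print(pods_per_namespace(SAMPLE))
```

In practice Prometheus does this parsing for you; the point is only that each line is a metric name, a set of labels, and a value.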
Configuring Prometheus
Add the following job to your Prometheus config:
- job_name: 'kube-state-metrics'
  kubernetes_sd_configs:
  - role: endpoints
  relabel_configs:
  - source_labels: [__meta_kubernetes_service_name]
    regex: kube-state-metrics
    action: keep
  - action: labelmap
    regex: __meta_kubernetes_service_label_(.+)
  - source_labels: [__meta_kubernetes_namespace]
    action: replace
    target_label: kubernetes_namespace
  - source_labels: [__meta_kubernetes_service_name]
    action: replace
    target_label: kubernetes_name
- Restart Prometheus to apply the changes.
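Relabeling rules are easier to reason about once you see them applied to a single target. The following Python sketch imitates the keep, labelmap, and replace steps of the scrape job above on a dictionary of discovered labels. It is a simplified model, not Prometheus' own code, and the `relabel` function and the sample label values are hypothetical.

```python
import re


def relabel(labels):
    """Apply the job's relabel steps to one discovered target; None means dropped."""
    # keep: drop targets whose service name is not exactly kube-state-metrics
    # (Prometheus anchors relabel regexes, hence fullmatch).
    service = labels.get("__meta_kubernetes_service_name", "")
    if not re.fullmatch(r"kube-state-metrics", service):
        return None

    out = dict(labels)

    # labelmap: copy each matching service label to a label named by the capture group.
    for key, value in labels.items():
        match = re.fullmatch(r"__meta_kubernetes_service_label_(.+)", key)
        if match:
            out[match.group(1)] = value

    # replace: copy namespace and service name into stable target labels.
    out["kubernetes_namespace"] = labels.get("__meta_kubernetes_namespace", "")
    out["kubernetes_name"] = service

    # Labels starting with "__" are discarded once relabeling is done.
    return {k: v for k, v in out.items() if not k.startswith("__")}


target = {
    "__meta_kubernetes_service_name": "kube-state-metrics",
    "__meta_kubernetes_namespace": "kube-system",
    "__meta_kubernetes_service_label_app_kubernetes_io_name": "kube-state-metrics",
}
print(relabel(target))
print(relabel({"__meta_kubernetes_service_name": "node-exporter"}))
```

The second call prints None, showing how the keep action filters out every endpoint that does not belong to the kube-state-metrics service.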
Querying kube-state-metrics in Prometheus
Once Prometheus is scraping kube-state-metrics, you can start querying the metrics. Here are some useful queries:
- Number of pods by namespace:
sum(kube_pod_info) by (namespace)
- Pods not in Running state:
sum(kube_pod_status_phase{phase!="Running"}) by (namespace, phase)
- Deployments not at the desired number of replicas:
kube_deployment_spec_replicas != kube_deployment_status_replicas_available
- Persistent volumes by phase:
sum(kube_persistentvolume_status_phase) by (phase)
kube-state-metrics Use Cases
Let's look at some practical use cases where I've found kube-state-metrics to be invaluable:
1. Monitoring Pod Health
One of the most common issues I've encountered is pods stuck in a crash loop. kube-state-metrics makes it easy to track pod restarts:
sum(kube_pod_container_status_restarts_total) by (pod)
This query helps identify problematic pods quickly, saving precious debugging time. I once used this to identify a memory leak in a Java application that was causing frequent restarts.
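Because the restart metric is a counter, its raw value is a lifetime total; what usually signals a crash loop is how much it grew over a recent window. The sketch below shows that idea with two hypothetical snapshots of the counter. It is a simplified stand-in for what a windowed query does (no extrapolation), and `restart_increase` is an illustrative name.

```python
def restart_increase(earlier, later):
    """Per-pod growth of the restart counter between two snapshots."""
    growth = {}
    for pod, total in later.items():
        previous = earlier.get(pod, 0)
        # A counter that went down was reset (e.g. the pod was recreated); count from zero.
        delta = total - previous if total >= previous else total
        if delta > 0:
            growth[pod] = delta
    return growth


# Hypothetical counter values sampled some minutes apart.
earlier = {"web-0": 3, "api-1": 0, "worker-2": 12}
later = {"web-0": 3, "api-1": 7, "worker-2": 12}

print(restart_increase(earlier, later))
```

Here worker-2 has the highest lifetime total but is currently stable, while api-1 is the pod that is actually restarting.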
2. Resource Quota Management
In multi-tenant clusters, managing resource quotas is crucial. kube-state-metrics provides metrics like:
kube_resourcequota
This allows us to track resource usage against quotas, helping prevent resource starvation issues. I've seen quota exhaustion bring down entire namespaces when not properly monitored.
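The arithmetic behind a quota check is a simple ratio of used to hard limit. As a minimal sketch, assuming samples shaped like kube_resourcequota series with a type label of either hard or used, the following Python computes utilisation per namespace and resource. The sample values and the `quota_utilisation` helper are hypothetical.

```python
def quota_utilisation(samples):
    """Map (namespace, resource) to used/hard from kube_resourcequota-style samples."""
    used, hard = {}, {}
    for sample in samples:
        key = (sample["namespace"], sample["resource"])
        bucket = used if sample["type"] == "used" else hard
        bucket[key] = bucket.get(key, 0.0) + sample["value"]
    # Skip entries without a positive hard limit to avoid dividing by zero.
    return {key: used.get(key, 0.0) / hard[key] for key in hard if hard[key] > 0}


# Hypothetical samples: team-a has used 6 of 8 CPU cores, team-b 1 of 4.
samples = [
    {"namespace": "team-a", "resource": "requests.cpu", "type": "hard", "value": 8.0},
    {"namespace": "team-a", "resource": "requests.cpu", "type": "used", "value": 6.0},
    {"namespace": "team-b", "resource": "requests.cpu", "type": "hard", "value": 4.0},
    {"namespace": "team-b", "resource": "requests.cpu", "type": "used", "value": 1.0},
]

for key, ratio in sorted(quota_utilisation(samples).items()):
    print(key, ratio)
```

A ratio approaching 1.0 is the point at which new pods in that namespace start being rejected, so it makes a natural alert threshold.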
Example query to check CPU requests against the quota (using kube-state-metrics v2 metric names):
sum(kube_pod_container_resource_requests{resource="cpu"}) by (namespace) /
sum(kube_resourcequota{resourcequota!="", resource="requests.cpu", type="hard"}) by (namespace)
3. Persistent Volume Monitoring
I once dealt with a critical outage caused by running out of persistent volume space. kube-state-metrics could have helped prevent this with metrics like:
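kube_persistentvolume_status_phase
This metric lets you track the phase of your persistent volumes (for example Available, Bound, Released, or Failed) and alert when one enters an unexpected phase. Volume fill levels come from the kubelet's kubelet_volume_stats_* metrics rather than from kube-state-metrics, so pair the two if you need capacity alerts.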
4. Deployment Tracking
Tracking the success of rolling updates is crucial. kube-state-metrics provides detailed deployment metrics:
kube_deployment_status_replicas_available
kube_deployment_status_replicas_unavailable
These metrics have helped me quickly identify failed deployments and roll back when necessary.
For example, you can set up an alert for when available replicas don't match the desired state:
kube_deployment_status_replicas_available != kube_deployment_spec_replicas
5. Node Capacity Planning
kube-state-metrics provides valuable data for capacity planning:
sum(kube_node_status_capacity{resource="cpu"}) - sum(kube_node_status_allocatable{resource="cpu"})
This query shows the difference between total CPU capacity and allocatable CPU, helping identify overhead and plan for cluster scaling.
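As a worked example of the same subtraction, the sketch below computes the reserved CPU per node and for the whole cluster. The node names and core counts are made up for illustration, and `reserved_cpu` is a hypothetical helper.

```python
def reserved_cpu(nodes):
    """Per-node and total CPU cores held back from pods (capacity minus allocatable)."""
    per_node = {n["name"]: n["capacity_cpu"] - n["allocatable_cpu"] for n in nodes}
    return per_node, sum(per_node.values())


# Hypothetical nodes: each reserves half a core for system daemons.
nodes = [
    {"name": "node-a", "capacity_cpu": 4.0, "allocatable_cpu": 3.5},
    {"name": "node-b", "capacity_cpu": 8.0, "allocatable_cpu": 7.5},
]

per_node, total = reserved_cpu(nodes)
print(per_node)
print(total)
```

With these numbers, 1 of the cluster's 12 cores is unavailable to workloads, which is the overhead to factor in when sizing new nodes.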
Advanced Topics and Best Practices
As you become more comfortable with kube-state-metrics, consider these advanced topics and best practices:
1. Custom Resources
kube-state-metrics supports custom resources. I've used this to monitor application-specific CRDs, providing valuable business metrics.
To enable custom resource metrics, you need to build kube-state-metrics from the source with the custom resource enabled. Here's an example Dockerfile:
FROM golang:1.17 as builder
WORKDIR /go/src/k8s.io/kube-state-metrics
COPY . .
RUN make build-local
FROM gcr.io/distroless/static:nonroot
COPY --from=builder /go/src/k8s.io/kube-state-metrics/kube-state-metrics .
USER nonroot:nonroot
ENTRYPOINT ["./kube-state-metrics", "--custom-resource-state-config-file=/path/to/cr-config.yaml"]
The cr-config.yaml file would look something like this:
spec:
  resources:
    - groupVersionKind:
        group: "myapp.com"
        kind: "MyCustomResource"
        version: "v1"
      metrics:
        - name: "mycustomresource_status_phase"
          help: "The current phase of the custom resource"
          each:
            type: Gauge
            gauge:
              valueFrom:
                path: "{.status.phase}"
2. High Availability
For critical environments, consider running multiple replicas of kube-state-metrics using a StatefulSet for high availability. Here's an example configuration:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: kube-state-metrics
spec:
  serviceName: "kube-state-metrics"
  replicas: 2
  selector:
    matchLabels:
      app: kube-state-metrics
  template:
    metadata:
      labels:
        app: kube-state-metrics
    spec:
      containers:
      - name: kube-state-metrics
        image: registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.1.0
        ports:
        - containerPort: 8080
          name: http-metrics
        - containerPort: 8081
          name: telemetry
        readinessProbe:
          httpGet:
            path: /healthz
            port: 8080
          initialDelaySeconds: 5
          timeoutSeconds: 5
3. Resource Limits
Be sure to set appropriate CPU and memory limits for kube-state-metrics. I've seen it consume significant resources in large clusters. Here's an example of setting resource limits:
resources:
  limits:
    cpu: 100m
    memory: 150Mi
  requests:
    cpu: 100m
    memory: 150Mi
4. Metric Relabeling
Use Prometheus' relabeling features to add useful metadata to your metrics, making them more informative and easier to query. Here's an example relabeling configuration:
metric_relabel_configs:
- source_labels: [__name__]
  regex: 'kube_pod_container_status_running'
  action: keep
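# Note: __meta_* labels exist only during target relabeling (relabel_configs),
# so a labelmap like the following belongs there rather than here:
# - action: labelmap
#   regex: __meta_kubernetes_pod_label_(.+)
5. Grafana Dashboards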
Create comprehensive Grafana dashboards using kube-state-metrics. The kube-state-metrics GitHub repo has some great examples to get you started.
Here's a simple Grafana dashboard query to show the number of pods per namespace:
sum(kube_pod_info) by (namespace)
For more complex dashboards, you can combine kube-state-metrics with other data sources. For example, to show CPU usage vs. requests:
sum(rate(container_cpu_usage_seconds_total[5m])) by (namespace) /
sum(kube_pod_container_resource_requests{resource="cpu"}) by (namespace)
6. Alerting
Set up alerting rules in Prometheus to proactively notify you of potential issues. Here's an example alert for pods in a non-running state:
groups:
- name: kubernetes-apps
  rules:
  - alert: KubePodNotReady
    expr: sum by (namespace, pod) (kube_pod_status_phase{phase=~"Pending|Unknown"}) > 0
    for: 15m
    labels:
      severity: warning
    annotations:
      summary: "Pod {{ $labels.pod }} in namespace {{ $labels.namespace }} is not ready"
      description: "Pod {{ $labels.pod }} has been in a non-ready state for more than 15 minutes."
7. Performance Tuning
For large clusters, you might need to tune kube-state-metrics for better performance.
Some options include:
- Enabling metric caching:
--metric-cache-size=1000
- Adjusting the metrics update interval:
--metric-resync-interval=30s
- Using metric allow-lists to reduce the number of metrics collected:
--metric-allowlist=kube_pod_status_phase,kube_deployment_status_replicas
These optimizations can significantly improve the runtime performance of kube-state-metrics, especially in large Kubernetes environments. Flag names and availability differ between releases, so check kube-state-metrics --help for your version before relying on them.
8. Integration with Cloud Providers
When running Kubernetes on cloud providers like AWS, you can leverage kube-state-metrics to monitor cloud-specific resources. For example, you can track the number of load balancers created by your Ingress controllers:
sum(kube_service_spec_type{type="LoadBalancer"}) by (namespace)
This query helps you keep track of your AWS resources and associated costs.
9. Container Image Management
kube-state-metrics can help you track the usage of container images across your cluster:
sum(kube_pod_container_info) by (image)
This is particularly useful for ensuring that all pods are running the expected Docker images and versions.
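The same grouping can be sketched offline to show how an "expected images" check might work. The container records, the registry prefix, and both helper names below are hypothetical; the logic simply counts containers per image and lists images that do not come from an approved registry.

```python
from collections import Counter


def image_usage(container_info):
    """Count running containers per image, like grouping kube_pod_container_info by image."""
    return Counter(c["image"] for c in container_info)


def unexpected_images(container_info, allowed_prefixes):
    """Return the sorted set of images that match none of the allowed registry prefixes."""
    allowed = tuple(allowed_prefixes)
    return sorted({c["image"] for c in container_info if not c["image"].startswith(allowed)})


# Hypothetical container records, one per running container.
containers = [
    {"pod": "web-0", "image": "registry.example.com/web:1.4.2"},
    {"pod": "web-1", "image": "registry.example.com/web:1.4.2"},
    {"pod": "api-0", "image": "registry.example.com/api:2.0.0"},
    {"pod": "debug", "image": "docker.io/library/busybox:latest"},
]

print(image_usage(containers))
print(unexpected_images(containers, ["registry.example.com/"]))
```

Matching on a registry prefix is only one possible policy; pinning exact tags or digests would be stricter.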
Troubleshooting Common Issues
Even with careful setup, you might encounter some issues. Here are some common problems and their solutions:
- High CPU Usage: If kube-state-metrics is consuming too much CPU, consider using metric allow-lists or increasing the metric resync interval.
- Memory Leaks: Ensure you're using the latest version of kube-state-metrics. Earlier versions had memory leak issues that have since been resolved.
- Missing Metrics: Check the RBAC permissions. kube-state-metrics might not have access to all namespaces or resources.
- Inconsistent Metrics: This can happen in large clusters due to the eventually consistent nature of the Kubernetes API. Consider increasing the metric resync interval.
- Ingress Metrics Missing: If you're not seeing metrics for Ingress resources, make sure you're using a compatible Ingress controller and that kube-state-metrics has permission to access Ingress resources.
- Authentication Issues: If you're experiencing authentication problems, ensure that the ServiceAccount associated with kube-state-metrics has the correct RBAC permissions. You may need to adjust the ClusterRole or Role bindings.
Conclusion
kube-state-metrics has become an essential tool in my Kubernetes observability stack. Its ability to provide deep insights into the state of Kubernetes objects, combined with the power of Prometheus and Grafana, has significantly improved my ability to maintain and troubleshoot Kubernetes clusters.
As you implement kube-state-metrics in your own clusters, keep in mind that every environment is unique.
Happy monitoring!
FAQs
What is kube-state-metrics?
kube-state-metrics listens to the Kubernetes API server and generates metrics about Kubernetes object states. It provides a metrics endpoint for Prometheus, offering insights into the health and status of various resources.
What is the difference between node exporter and kube-state-metrics?
Node exporter collects system-level metrics from nodes (CPU, memory, disk usage), while kube-state-metrics generates metrics about Kubernetes objects (pods, deployments, services). They work together to give a complete view of cluster health.
How do you deploy kube-state-metrics on Kubernetes?
You can deploy kube-state-metrics using Helm charts or YAML manifests. For Helm, use: helm install kube-state-metrics prometheus-community/kube-state-metrics
How can we expose metrics to Prometheus?
kube-state-metrics exposes a metrics endpoint for Prometheus to scrape. Add this job to your Prometheus config:
- job_name: 'kube-state-metrics'
  kubernetes_sd_configs:
  - role: endpoints
  relabel_configs:
  - source_labels: [__meta_kubernetes_service_name]
    regex: kube-state-metrics
    action: keep
What metrics are provided by kube-state-metrics in Kubernetes?
kube-state-metrics provides metrics such as pod status, deployment status, node capacity, PersistentVolume and PersistentVolumeClaim status, and service details.
How do I monitor pod status using kube-state-metrics in Kubernetes?
Use Prometheus queries like kube_pod_status_phase{phase="Running"} for running pods and kube_pod_container_status_restarts_total for container restart counts.
Can you monitor Kubernetes cluster status using kube-state-metrics?
Yes, kube-state-metrics provides metrics on cluster status, such as node conditions and resource capacity. For example, use kube_node_status_condition{condition="Ready", status="true"} to find ready nodes.
How many load balancer services do we have and what are their IPs?
Use the query kube_service_spec_type{type="LoadBalancer"} and kube_service_status_load_balancer_ingress to get the count and IPs of LoadBalancer services.