Adding Cluster Labels to Kubernetes Metrics

A definitive guide on adding a cluster label to all Kubernetes metrics

As a DevOps engineer who's spent countless hours wrangling Kubernetes clusters and their metrics, I've learned the hard way that proper labeling is crucial, especially in multi-cluster environments. Today, I'm going to share my experience adding cluster labels to metrics collected from Kubernetes clusters, focusing on setups that use the Prometheus Operator or the kube-prometheus-stack Helm chart.

Why Cluster Labels Matter

Before we dive into the how, let's quickly touch on the why. If you're managing multiple Kubernetes clusters, you've probably run into situations where you couldn't immediately tell which metric came from which cluster. This can be a real headache when you're trying to diagnose issues or compare performance across environments.

Adding a cluster label to your metrics solves this problem. It allows you to:

  1. Easily filter and group metrics by cluster
  2. Create more meaningful dashboards and alerts
  3. Simplify the process of comparing metrics across different clusters (see the example queries after this list)
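
For example, once the label is in place, a couple of simple PromQL queries make per-cluster views straightforward (a sketch; container_cpu_usage_seconds_total is just a stand-in for whatever metric you care about):

# Compare pod counts across clusters
sum(kube_pod_info) by (cluster)

# Scope a dashboard panel or alert to a single cluster
sum(rate(container_cpu_usage_seconds_total{cluster="production-cluster-1"}[5m]))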

Now, let's get into the nitty-gritty of how to implement this for both Prometheus Operator and kube-prometheus-stack users.

Adding Cluster Labels with Prometheus Operator

If you're using Prometheus Operator, you're in luck. It provides a straightforward way to add custom labels to all metrics scraped from your cluster.

Step 1: Update the Prometheus resource

First, you'll need to update your Prometheus custom resource. Here's an example of how to add a cluster label:

apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
  namespace: monitoring
spec:
  serviceAccountName: prometheus
  serviceMonitorSelector:
    matchLabels:
      team: frontend
  ruleSelector:
    matchLabels:
      team: frontend
  resources:
    requests:
      memory: 400Mi
  enableAdminAPI: false
  externalLabels:
    cluster: production-cluster-1  # Add this line

The externalLabels field is where the magic happens. Any labels defined here are attached to every metric and alert this Prometheus instance sends outward, whether via remote write, federation, or Alertmanager.

Step 2: Apply the changes

After updating the Prometheus resource, apply it to your cluster:

kubectl apply -f prometheus.yaml
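
If you want a quick sanity check that the change landed on the custom resource, something like this works (a sketch, assuming the resource name and namespace used above):

kubectl -n monitoring get prometheus prometheus -o jsonpath='{.spec.externalLabels}'

The Prometheus Operator then regenerates the configuration and rolls it out to the Prometheus pods.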

Step 3: Verify the changes

One nuance worth knowing: external labels are attached when data leaves Prometheus (remote write, federation, alerting), so they won't show up if you query the local instance directly. If you're shipping metrics to a remote backend such as Last9, Thanos, or Cortex, verify the label there with a query like:

sum(kube_pod_info) by (cluster)

This should return results grouped by your new cluster label.
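
If you don't have a remote backend wired up yet, one quick local check is the federation endpoint, which does include external labels on the series it exposes. A rough sketch, assuming the operator-created prometheus-operated service in the monitoring namespace:

kubectl -n monitoring port-forward svc/prometheus-operated 9090
# in another terminal:
curl -s -g 'http://localhost:9090/federate?match[]=kube_pod_info' | head

Each series in the output should carry cluster="production-cluster-1".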

Adding Cluster Labels with the kube-prometheus-stack Helm Chart

If you're using the kube-prometheus-stack Helm chart (which bundles the Prometheus Operator, Prometheus, and related components), there are a couple of ways to add cluster labels. Let's explore both methods.

Method 1: Using prometheusSpec.externalLabels

Step 1: Update your values.yaml

When installing or upgrading your Helm release, you'll need to modify the values.yaml file. Here's an example of how to add a cluster label:

prometheus:
  prometheusSpec:
    externalLabels:
      cluster: production-cluster-1

Step 2: Apply the changes

If you're installing for the first time:

helm install monitoring prometheus-community/kube-prometheus-stack -f values.yaml

If you're updating an existing installation:

helm upgrade monitoring prometheus-community/kube-prometheus-stack -f values.yaml
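
If you'd rather not edit values.yaml for a one-off change, the same setting can be passed on the command line (a sketch using the release and chart names above):

helm upgrade monitoring prometheus-community/kube-prometheus-stack \
  -f values.yaml \
  --set prometheus.prometheusSpec.externalLabels.cluster=production-cluster-1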

Step 3: Verify the changes

Just like with the Prometheus Operator setup, verify the change wherever the labeled metrics land, such as your remote backend or the federation endpoint shown earlier:

sum(kube_pod_info) by (cluster)

Method 2: Using commonLabels

An alternative approach is to use the commonLabels field in your Helm values. Instead of touching the metric series, this applies the label to the Kubernetes resources the chart creates, which is handy if you want the cluster name visible on the objects themselves as well.

Step 1: Update your values.yaml

Add the commonLabels field to your values.yaml:

commonLabels: 
  cluster: otlp-aps1.last9.io

Step 2: Apply the changes

Use the same Helm install or upgrade commands as in Method 1.

Step 3: Verify the changes

You can verify the changes by checking the labels on the Prometheus pods:

kubectl get pods -n monitoring -l app=prometheus -o yaml | grep -i cluster

You should see your cluster label in the output.
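
Because commonLabels is applied to the resources the chart renders, you can also spot-check the workloads themselves (a rough sketch, assuming the monitoring namespace; subcharts may or may not inherit the label depending on chart version):

kubectl get deployments,statefulsets,daemonsets -n monitoring --show-labels | grep -i cluster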

Comparing the Two Methods

  1. Scope:
    • prometheusSpec.externalLabels attaches the label to the metric series themselves as they leave Prometheus (remote write, federation, alerts).
    • commonLabels attaches the label to the Kubernetes resources created by the Helm chart, such as Deployments and Services.
  2. Flexibility:
    • prometheusSpec.externalLabels only touches telemetry; it doesn't change how your workloads are labeled.
    • commonLabels keeps the chart's resources consistently labeled, but on its own it doesn't add anything to the metric series.
  3. Use Case:
    • Use prometheusSpec.externalLabels if you only want to label the metrics.
    • Use commonLabels if you want the cluster name on the chart's Kubernetes resources, and combine it with externalLabels if you want both.

In many cases, combining the two is the more comprehensive solution, since it keeps both your metrics and your Kubernetes resources consistently labeled. This can be particularly useful for resource management and troubleshooting.

However, if you only need the label on the metrics and don't want to touch other resources, stick with prometheusSpec.externalLabels.

Remember, you can always combine both approaches if needed, but be careful to avoid conflicts or redundancy in your label definitions.
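
If you do combine them, keep the label name and value identical in both places. A minimal values.yaml sketch:

commonLabels:
  cluster: production-cluster-1

prometheus:
  prometheusSpec:
    externalLabels:
      cluster: production-cluster-1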

Advanced Techniques: Using Relabeling

Sometimes, you might want more fine-grained control over how the cluster label is applied. In these cases, you can use Prometheus's powerful relabeling feature.

With the Prometheus Operator, relabel rules live on the scrape targets rather than directly under prometheusSpec, most commonly in a ServiceMonitor's (or PodMonitor's) relabelings field. Here's a sketch of a ServiceMonitor that stamps a static cluster label onto everything it scrapes (the name, selector, and port are placeholders for your own):

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: frontend
spec:
  selector:
    matchLabels:
      app: frontend
  endpoints:
    - port: metrics
      relabelings:
        - sourceLabels: [__address__]
          targetLabel: cluster
          replacement: 'production-cluster-1'

This adds a cluster label with the value production-cluster-1 to every series scraped through this ServiceMonitor, which is handy when you only want to label (or override the label on) a subset of targets. Just remember the ServiceMonitor still needs metadata labels that match your Prometheus serviceMonitorSelector (team: frontend in the earlier example).
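
If you ship metrics to a remote backend and only want the label on what you send there, another option is to inject it at remote-write time with writeRelabelConfigs (a sketch; the URL is a placeholder for your own endpoint):

prometheus:
  prometheusSpec:
    remoteWrite:
      - url: https://your-remote-write-endpoint/api/v1/write
        writeRelabelConfigs:
          - targetLabel: cluster
            replacement: 'production-cluster-1'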

Common Pitfalls and How to Avoid Them

  1. Label conflicts: Be careful not to choose a label name that's already in use. Always check your existing metrics before adding new labels (a quick way to do this is shown after this list).
  2. Performance impact: While adding a single cluster label shouldn't have a significant impact, be cautious about adding too many labels. Each unique combination of label values creates a new time series, which can quickly balloon your storage requirements.
  3. Consistency across clusters: If you're managing multiple clusters, make sure you're using consistent label names and values across all of them. This will make cross-cluster queries much easier.
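
To see which label names Prometheus already knows about, you can hit the labels API (a sketch, assuming a port-forward to Prometheus on localhost:9090 as shown earlier, and jq installed):

curl -s 'http://localhost:9090/api/v1/labels' | jq -r '.data[]'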

Conclusion

Adding cluster labels to your Kubernetes metrics might seem like a small change, but it can significantly improve your observability and make your life easier when managing multiple clusters. Whether you're using the Prometheus Operator directly or the kube-prometheus-stack Helm chart, the process is relatively straightforward, and the benefits are well worth the effort.

Remember, good observability is about more than just collecting metrics – it's about making those metrics meaningful and actionable. Proper labeling is a key step in that direction.

Happy monitoring!

Authors

Prathamesh Sonpatki

Prathamesh works as an evangelist at Last9, runs SRE stories - where SRE and DevOps folks share their stories, and maintains o11y.wiki - a glossary of all terms related to observability.