Nov 6th, ‘23/8 min read

Mastering Prometheus Relabeling: A Comprehensive Guide

A comprehensive guide to relabeling strategies in Prometheus

In this guide, we will explore the powerful feature of Prometheus relabeling, a technique central to refining data collection and enhancing monitoring efforts.

Before we start with our deep dive into relabeling metrics, let's first recap how labels work in Prometheus.

Quick Recap of Prometheus Labels

Labels are a foundational concept in Prometheus, serving as the key to its powerful and flexible data model. In essence, labels are key-value pairs attached to every time series, allowing for meaningful differentiation between data points.

They enable users to query and aggregate metrics finely and flexibly, making Prometheus not just a tool for monitoring but a robust system capable of providing insightful observations into applications and infrastructure.

Structure of Labels in Prometheus

A label in Prometheus is a simple construct. Each label consists of a label name and a label value, both of which are strings.

The label name is a descriptor that denotes what field or dimension the label is categorizing, such as job, instance, or method.

The label value is the specific data for that descriptor, providing the context for the metric, like orders-service, host-76, or POST.

For example, a metric with labels may look like this in text exposition format:

http_requests_total{method="POST", handler="/api/orders", status="200"} 1027

In this case, http_requests_total is the metric name, and there are three labels attached to it: method with the value POST, handler with the value /api/orders, and status with the value 200. This labeling structure gives a clear indication that the metric is counting the total number of HTTP requests for POST to /api/orders that resulted in a 200 status code.

Labels make Prometheus particularly adept at turning raw numerical data into insightful, actionable information, bridging the gap between simple number-crunching and meaningful analytics.

In Prometheus, the labeling system, as we've seen, provides a powerful means of distinguishing and categorizing time series data. However, while labels are incredibly versatile, there are instances where you might need to go a step further in refining the data, which is where relabeling comes into play.

Sometimes you may need to align data from dynamic environments with a predefined, static monitoring setup to ensure consistent and meaningful data collection. At other times, your labels can become complex and unwieldy. Prometheus relabeling lets you simplify your configurations and make them more manageable.

Some use cases for relabeling might include the following:

Standardizing

In large teams and distributed systems, inconsistencies are bound to occur. Making changes in your instrumentation might not be possible or might take time. In the meantime, you can rename or standardize labels to make querying and alerting more consistent.

Filtering & Dropping

You can drop specific labels using relabeling to reduce clutter and simplify metric sets.

Aggregations

Relabeling can prepare metrics for aggregation by removing or normalizing labels that distinguish otherwise equivalent series. For example, once instance-specific labels are standardized, queries can combine metrics from different instances into a single representation.

Managing High Cardinality

Cardinality spikes are common if too many unique label combinations are generated, potentially impacting performance. Relabeling can be one of the strategies you can use to manage high cardinality metrics.

Streaming Aggregation is a potent way to manage high cardinality without changing instrumentation or doing any relabeling.

Relabeling Configuration

relabel_configs and metric_relabel_configs are directives in the Prometheus configuration that dictate how labels should be modified.

relabel_configs is applied to each discovered target before it is scraped, so it can rewrite target labels or drop targets entirely. metric_relabel_configs is applied to the scraped samples after the scrape but before they are ingested into storage, allowing for further refinement.

Additionally, write_relabel_configs is applied to samples before they are sent to remote storage using Prometheus Remote Write, and alert_relabel_configs is applied to alerts before they are sent to Alertmanager.
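
As a rough sketch of where each of these four blocks sits in a configuration file, consider the fragment below. The job name, hostnames, URLs, and the specific rules are placeholders chosen for illustration, not recommendations.

```yaml
scrape_configs:
  - job_name: example-app                 # hypothetical job name
    static_configs:
      - targets: ["app.example.internal:8080"]   # hypothetical target
    relabel_configs:                      # applied to targets before the scrape
      - source_labels: [__address__]
        target_label: instance
    metric_relabel_configs:               # applied to samples before ingestion
      - source_labels: [__name__]
        regex: go_gc_.*
        action: drop

remote_write:
  - url: https://remote.example.internal/api/v1/write   # hypothetical endpoint
    write_relabel_configs:                # applied before samples are sent out
      - source_labels: [__name__]
        regex: debug_.*
        action: drop

alerting:
  alert_relabel_configs:                  # applied to alerts before Alertmanager
    - regex: replica
      action: labeldrop
  alertmanagers:
    - static_configs:
        - targets: ["alertmanager.example.internal:9093"]  # hypothetical
```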

Structure of a Relabeling Rule

A Prometheus relabeling rule is composed of several fields that determine its behavior as described in the official documentation:

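  • source_labels: The labels whose values are used as input. Their values are concatenated using the separator before matching.
  • separator: The string placed between concatenated source label values. Defaults to ";".
  • target_label: The label that receives the resulting value.
  • regex: An RE2 regular expression matched against the concatenated source label values. It is anchored on both ends and defaults to (.*).
  • modulus: The modulus applied to the hash of the source label values. Used only by hashmod.
  • replacement: The value written to the target label when the regex matches. It can reference capture groups such as $1 and defaults to $1.
  • action: The operation performed on labels, such as replace, keep, or drop. Defaults to replace. The complete list can be found in the official documentation.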
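
To make the interaction between these fields concrete, here is a minimal sketch that joins two source labels with a custom separator. The target label name workload is an assumption made for this example; the defaults for regex, replacement, and action are written out explicitly so the data flow is visible.

```yaml
- source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_pod_name]
  separator: "/"          # values are joined as <namespace>/<pod name>
  regex: (.*)             # default: match the whole concatenated string
  replacement: $1         # default: first capture group
  target_label: workload  # hypothetical label name
  action: replace         # default action
```

For a pod named orders-7d9f in the namespace shop, the concatenated input is shop/orders-7d9f, and that string becomes the value of workload.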

Types of Actions in Relabeling

Each action in Prometheus relabel_configs dictates how labels are managed:

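  • drop: Drops the targets or time series whose source label values match the regex.
  • keep: Keeps only the targets or time series whose source label values match the regex.
  • replace: Sets the target label to the replacement value when the regex matches.
  • labelmap: Copies the values of labels whose names match the regex to new label names.
  • hashmod: Sets the target label to the hash of the source label values modulo the modulus.
  • labeldrop and labelkeep: Remove labels whose names match the regex, or remove all labels whose names do not match it.
  • keepequal and dropequal: Keep or drop targets or time series depending on whether the concatenated source label values equal the value of the target label.
  • lowercase and uppercase: Write the lowercase or uppercase form of the source label values to the target label.

The examples further below walk through each action. In short, labeldrop removes specific labels from metrics and drop removes entire series, both of which help prevent unnecessary high cardinality and optimize storage.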

With replace actions, you can enrich your metrics by adding meaningful labels. Prometheus has no separate add action, so a new label is created by writing a replacement to a target label that does not yet exist. Applying the same rule across scrape jobs keeps labels consistent and gives more context to all metrics being ingested.

Replace actions in relabeling rules keep data consistent and let it adapt to the specific needs of your monitoring and alerting strategies.

Examples of Prometheus Relabel Action

1. Replace

The replace action substitutes a target label's value with a replacement value if the source label values match the regex.

Example:

- action: replace
  source_labels: [service]
  regex: (.*)
  target_label: environment
  replacement: production

Because the regex (.*) matches any value, including an empty one, this rule sets the environment label to production on every target or series, regardless of the value of the service label.
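
The replace action becomes more useful when the replacement refers to a capture group. The following sketch, which assumes targets addressed as host:port and uses a hypothetical target label named host, copies only the host part of __address__. It is not meant to handle IPv6 addresses.

```yaml
- action: replace
  source_labels: [__address__]
  regex: '([^:]+)(?::\d+)?'   # capture everything before an optional :port
  target_label: host          # hypothetical label name
  replacement: '$1'
```

With an address such as host-76:9100, the first capture group is host-76, so that is the value written to host.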

2. Keep

The keep action retains the time series that matches the specified regex, discarding all others.

Example:

- action: keep
  source_labels: [job]
  regex: notification-job

This keeps only the time series whose job label is exactly notification-job. The regex is fully anchored, so a value such as notification-job-v2 would not match.
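
A common pattern for keep is opt-in scraping in Kubernetes. The sketch below assumes pods are discovered with kubernetes_sd_configs and that teams mark pods with a prometheus.io/scrape annotation, which is a widely used convention rather than something Prometheus defines.

```yaml
relabel_configs:
  # Scrape only pods that carry the annotation prometheus.io/scrape: "true".
  - action: keep
    source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
    regex: "true"
```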

3. Drop

The drop action removes the time series that match the regex, keeping all others.

Example:

- action: drop
  source_labels: [status]
  regex: failure

Time series with a status label of failure are dropped with this rule.
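
The same action is often used in metric_relabel_configs to discard whole metrics by name, since the metric name is available as the __name__ label. The metric name pattern below is a placeholder for whatever you decide not to store.

```yaml
metric_relabel_configs:
  # Drop every series whose metric name starts with the hypothetical
  # prefix legacy_cache_ before it is ingested.
  - action: drop
    source_labels: [__name__]
    regex: legacy_cache_.*
```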

4. Labelmap

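The labelmap action matches the regex against all label names and copies the values of matching labels to new label names given by the replacement, which defaults to the first capture group.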

Example:

- action: labelmap
  regex: __meta_kubernetes_pod_label_(.+)

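This rule copies the value of every label named __meta_kubernetes_pod_label_XXX to a label named XXX. The original __meta_ labels are removed automatically once target relabeling completes.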
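
If the mapped names might clash with labels that already exist, the replacement can add a prefix. This is a small sketch; the k8s_ prefix is an arbitrary choice made for illustration.

```yaml
- action: labelmap
  regex: __meta_kubernetes_pod_label_(.+)
  replacement: k8s_$1   # e.g. the pod label "app" becomes "k8s_app"
```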

5. Hashmod

The hashmod action applies a hash function to a label's value and stores the result in the target label.

Example:

- action: hashmod
  source_labels: [instance]
  target_label: instance_hash
  modulus: 100

This would take the instance label's value, apply a hash function, and take the result modulo 100, storing it in instance_hash.
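
In practice, hashmod is usually paired with keep to split targets across several Prometheus servers. The sketch below assumes four servers, each keeping a different shard number; the temporary label name __tmp_shard is an assumption that relies on the __tmp prefix reserved for intermediate values.

```yaml
relabel_configs:
  # Assign each target a shard number from 0 to 3 based on its address.
  - action: hashmod
    source_labels: [__address__]
    target_label: __tmp_shard
    modulus: 4
  # This server keeps shard 0; the other servers would use 1, 2, and 3.
  - action: keep
    source_labels: [__tmp_shard]
    regex: "0"
```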

6. Labeldrop & Labelkeep

The labeldrop and labelkeep actions allow you to remove or keep labels selectively.

Example:

- action: labeldrop
  regex: "(temporary|debug)_.*"

This removes any labels starting with temporary_ or debug_.

Example - labelkeep:

- action: labelkeep
  regex: wanted_label_.*

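Conversely, this rule retains only labels whose names match wanted_label_.* and discards all others. Use labelkeep with care: every label that does not match is removed, including internal ones such as __name__, so the regex usually needs to list those as well, and the remaining labels must still identify each series uniquely.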

7. Keepequal and Dropequal

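The keepequal and dropequal actions compare the concatenated source label values with the value of the target label. They take only source_labels and target_label, and do not use regex or replacement.

Example - keepequal:

- action: keepequal
  source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_port]
  target_label: __meta_kubernetes_pod_container_port_number

Assuming pods declare their metrics port in a prometheus.io/port annotation, this rule keeps only the targets whose container port number equals the annotated port, dropping all others.

Example - dropequal:

- action: dropequal
  source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_port]
  target_label: __meta_kubernetes_pod_container_port_number

In this case, targets whose container port number equals the annotated port are dropped, and all others are kept.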

8. Lowercase and Uppercase

The lowercase and uppercase actions, as their names suggest, change label values to lowercase or uppercase, respectively.

Example:

- action: lowercase
  source_labels: [environment]
  target_label: environment_lower

In this rule, the environment_lower label will contain the lowercase value of the environment label.

Leveraging Internal Labels and Metadata in Prometheus

Beyond the conventional labels, Prometheus introduces the concept of hidden labels and metadata, often prefixed with a double underscore (__). These special labels are instrumental in enriching your metrics and extracting valuable insights. Let's explore the significance of hidden labels, some examples of commonly used ones, and how they can be employed in relabeling actions.

Hidden labels, or metadata labels, serve as an internal mechanism within Prometheus. They don't appear in the final metric output but play a crucial role during the metric collection and relabeling processes. These labels can carry information about the target, the scrape job, or the system itself. They provide a way to tap into Prometheus's internal data management capabilities.

Examples of Internal Labels and Metadata

  1. __address__ Label: This hidden label holds the target's address. For instance, it might contain the IP address or domain name of a target.
  2. job Label: Although it is not a hidden label, job is set by default to the job_name of the scrape configuration that produced the target. It helps identify which job is scraping the metrics.
  3. __metrics_path__ Label: This label contains the metrics path configured for the target. It's useful when different targets expose metrics on distinct paths.

Here is a table with the internal labels and their description.

Label name            Description
__name__              The scraped metric's name
__address__           host:port of the scrape target
__scheme__            URI scheme of the scrape target
__metrics_path__      Metrics endpoint of the scrape target
__param_<name>        Value of the first URL parameter named <name> passed to the target
__scrape_interval__   The target's scrape interval (experimental)
__scrape_timeout__    The target's scrape timeout (experimental)
__meta_*              Special labels set by the service discovery mechanism
__tmp                 Special prefix used to temporarily store label values before discarding them

Using Internal Labels in Relabeling Actions

Hidden labels can be handy when crafting relabeling rules:

Example 1 - Mapping Kubernetes Pod Labels:

- action: labelmap
  regex: __meta_kubernetes_pod_label_(.+)

In this example, the rule copies hidden labels like __meta_kubernetes_pod_label_app to app, effectively incorporating Kubernetes labels into the metric as regular labels.

Example 2 - Dynamically Configuring Job Names:

- action: replace
  source_labels: [__meta_kubernetes_pod_label_component]
  target_label: job

This rule dynamically configures the job label based on the __meta_kubernetes_pod_label_component, allowing you to organize metrics by components in a Kubernetes environment.

By leveraging these hidden labels and metadata within relabeling actions, you can extract and manipulate valuable information that might otherwise remain obscured, enhancing the precision and granularity of your monitoring setup.
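
Internal labels can also be written to, which changes how a target is scraped. The following sketch assumes pods optionally declare a custom metrics path in a prometheus.io/path annotation (a common convention, not a Prometheus built-in) and copies it into __metrics_path__.

```yaml
relabel_configs:
  # If the annotation is present and non-empty, scrape that path
  # instead of the path configured for the job.
  - action: replace
    source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
    regex: (.+)
    target_label: __metrics_path__
```

Using (.+) rather than the default (.*) means pods without the annotation keep the configured metrics path.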

Metric Relabeling and Cardinality

Cardinality refers to the number of unique time series, that is, the number of distinct label combinations across your metrics. Relabeling must be managed carefully to prevent high cardinality, which occurs when too many unique label combinations are generated and can impact performance.
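
One way relabeling can reduce cardinality is by removing a label that adds churn without adding information. The sketch below uses a hypothetical label named pod_uid and assumes another label, such as the pod name, still distinguishes the series within a scrape.

```yaml
metric_relabel_configs:
  # Remove the hypothetical pod_uid label before ingestion. Only do this
  # when the remaining labels still identify each series uniquely;
  # otherwise the scrape will contain duplicate series.
  - action: labeldrop
    regex: pod_uid
```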

Best Practices for Prometheus Relabeling

When delving into the world of Prometheus relabeling, it's crucial to follow best practices to ensure that you're not only getting the most out of this powerful feature but also avoiding common pitfalls that could affect the performance and reliability of your monitoring setup.

Some best practices for using Prometheus relabeling:

  • Use Relabeling Sparingly: Only use relabeling when necessary. Excessive use of complex relabeling rules can make your configuration hard to understand and maintain.
  • Check Impact on Scrape Performance: metric_relabel_configs rules run against every scraped sample on every scrape, so long rule lists can add to scrape time. Monitor your scrape durations and adjust your relabeling configurations if necessary.
  • Consistent Label Values: Where possible, standardize label values across different targets and jobs to ensure consistency in queries and dashboards.
  • Dry Run Changes: Before applying new relabeling rules to your production environment, test them in a staging environment to verify their effects.
  • Use promtool to Check Configurations: Utilize Prometheus promtool to check your configuration files for errors or inconsistencies.
  • Document Your Relabeling Rules: Maintain clear documentation for your relabeling rules, explaining the purpose behind each rule and its expected outcome.
  • Descriptive Label Names: Ensure label names are descriptive and reflect the label's purpose to anyone reading the configuration.
  • Evolve Rules Gradually: When the need arises to change labeling strategies, do so gradually. It's often better to add new labels alongside the old ones, transition over time, and then remove the old labels once you're sure they're no longer needed.
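
The gradual approach in the last point can be sketched as a two-step change. The label names env and environment are hypothetical stand-ins for an old and a new label name.

```yaml
metric_relabel_configs:
  # Step 1: copy the old label to the new name wherever it is set,
  # so both labels exist during the transition.
  - action: replace
    source_labels: [env]
    regex: (.+)
    target_label: environment
  # Step 2, enabled later once dashboards and alerts use the new label:
  # - action: labeldrop
  #   regex: env
```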

Conclusion

Prometheus relabeling is an advanced feature that, when mastered, provides unparalleled control over your monitoring environment. By understanding and applying relabel_configs, teams can tailor Prometheus to their unique monitoring requirements.

💡
The Last9 promise: we will reduce your Observability TCO by about 50%. Our managed time series data warehouse, Levitate, comes with streaming aggregation, data tiering, and the ability to manage high cardinality. If this sounds interesting, talk to us.

Authors

Last9

Last9 helps businesses gain insights into the Rube Goldberg of micro-services. Levitate - our managed time series data warehouse is built for scale, high cardinality, and long-term retention.
