In this guide, we will explore the powerful feature of Prometheus relabeling, a technique central to refining data collection and enhancing monitoring efforts.
Before we start with our deep dive into relabeling metrics, let's first recap how labels work in Prometheus.
Quick Recap of Prometheus Labels
Labels are a foundational concept in Prometheus, serving as the key to its powerful and flexible data model. In essence, labels are key-value pairs attached to every time series, allowing for meaningful differentiation between data points.
They enable users to query and aggregate metrics finely and flexibly, making Prometheus not just a tool for monitoring but a robust system capable of providing insightful observations into applications and infrastructure.
Structure of Labels in Prometheus
A label in Prometheus is a simple construct. Each label consists of a label name and a label value, both of which are strings.
The label name is a descriptor that denotes what field or dimension the label is categorizing, such as job, instance, or method.
The label value is the specific data for that descriptor, providing the context for the metric, like orders-service, host-76, or POST.
For example, a metric with labels may look like this in text exposition format:
http_requests_total{method="POST", handler="/api/orders", status="200"} 1027
In this case, http_requests_total is the metric name, and there are three labels attached to it: method with the value POST, handler with the value /api/orders, and status with the value 200. This labeling structure makes it clear that the metric counts the total number of HTTP POST requests to /api/orders that resulted in a 200 status code.
Labels make Prometheus particularly adept at turning raw numerical data into insightful, actionable information, bridging the gap between simple number-crunching and meaningful analytics.
In Prometheus, the labeling system, as we've seen, provides a powerful means of distinguishing and categorizing time series data. However, while labels are incredibly versatile, there are instances where you might need to go a step further in refining the data, which is where relabeling comes into play.
Sometimes you need to align data from dynamic environments with a predefined, static monitoring setup so that data collection stays consistent and meaningful. At other times, your labels become complex and unwieldy. Prometheus relabeling allows you to simplify your configurations and make them more manageable.
Some use cases for relabeling might include the following:
Standardizing
In large teams and distributed systems, inconsistencies are bound to occur. Making changes in your instrumentation might not be possible or might take time. In the meantime, you can rename or standardize labels to make querying and alerting more consistent.
Filtering & Dropping
You can use relabeling to drop specific labels, or entire targets and time series, to reduce clutter and simplify metric sets.
Aggregations
Relabeling can help aggregate or summarize metrics from multiple sources. It does not combine values itself; instead, it normalizes the labels that differ between sources so that metrics from different instances can be combined into a single representation at query time.
Managing High Cardinality
Cardinality spikes are common if too many unique label combinations are generated, potentially impacting performance. Relabeling can be one of the strategies you can use to manage high cardinality metrics.
Streaming Aggregation is a potent way to manage high cardinality without changing instrumentation or doing any relabeling.
Relabeling Configuration with relabel_configs and metric_relabel_configs
The relabel_configs and metric_relabel_configs directives in the Prometheus configuration dictate how labels should be modified.
relabel_configs is applied to each target's labels before the target is scraped, while metric_relabel_configs is applied to the scraped samples as the last step before ingestion, allowing for further refinement.
Additionally, write_relabel_configs affects the data as it's written to remote storage using Prometheus Remote Write, and alert_relabel_configs tailors how alerts are labeled before they are sent to Alertmanager.
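To make the placement of these four directives concrete, here is a minimal sketch of a prometheus.yml. The job name, target addresses, remote write URL, metric name patterns, and the host and replica labels are all hypothetical and only illustrate where each block lives.

```yaml
scrape_configs:
  - job_name: orders-service            # hypothetical job
    static_configs:
      - targets: ["host-76:9100"]       # hypothetical target
    relabel_configs:                    # applied to target labels before the scrape
      - source_labels: [__address__]
        regex: '([^:]+):\d+'
        target_label: host              # hypothetical label
        replacement: "$1"
    metric_relabel_configs:             # applied to scraped samples before ingestion
      - source_labels: [__name__]
        regex: "go_gc_.*"
        action: drop

remote_write:
  - url: "https://remote-storage.example.com/api/v1/write"   # placeholder URL
    write_relabel_configs:              # applied to samples sent to remote storage
      - source_labels: [__name__]
        regex: "debug_.*"
        action: drop

alerting:
  alert_relabel_configs:                # applied to alerts before they are sent out
    - action: labeldrop
      regex: replica
  alertmanagers:
    - static_configs:
        - targets: ["alertmanager.example.com:9093"]          # placeholder address
```

Note that relabel_configs and metric_relabel_configs belong to an individual scrape job, write_relabel_configs to an individual remote write endpoint, and alert_relabel_configs to the alerting section.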
Structure of a Relabeling Rule
A Prometheus relabeling rule is composed of several fields that determine its behavior as described in the official documentation:
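- source_labels: The labels whose values are used as input. Their values are concatenated using the separator.
- separator: The string placed between concatenated source label values (default ;).
- target_label: The label that receives the resulting value.
- regex: An RE2 regular expression matched against the concatenated source label values. It is anchored at both ends (default (.*)).
- replacement: The replacement value, which may reference regex capture groups such as $1 (default $1).
- action: The operation performed on labels, such as replace, keep, drop, etc. (default replace). The complete list can be found in the official documentation.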
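As an illustration of how these fields work together, the following sketch spells out every field explicitly instead of relying on defaults. The target label pod_path is a hypothetical name chosen for this example; the two source labels are metadata labels provided by Kubernetes service discovery.

```yaml
- source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_pod_name]
  separator: "/"            # values are joined as "<namespace>/<pod>"
  regex: "(.+)/(.+)"        # must match the whole joined string
  target_label: pod_path    # hypothetical label that receives the result
  replacement: "$1/$2"      # rebuilt from the two capture groups
  action: replace
```

For a pod named orders-7d9f in the namespace shop, the joined input is shop/orders-7d9f, and the rule writes that same string to pod_path. If the regex does not match, a replace rule leaves the target label untouched.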
Types of Actions in Relabeling
Each action in Prometheus relabel_configs dictates how labels are managed:
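- drop: Drops the target or time series when the regex matches the concatenated source labels.
- keep: Retains only the targets or time series whose concatenated source labels match the regex.
- replace: Writes the replacement value to the target label when the regex matches.
- labelmap: Copies the values of labels whose names match the regex to new label names given by the replacement.
- hashmod: Sets the target label to a hash of the concatenated source labels, modulo the modulus.
- labeldrop and labelkeep: Remove or keep labels whose names match the regex.
- keepequal and dropequal: Keep or drop targets depending on whether the concatenated source labels equal the value of the target label.
- lowercase and uppercase: Write the concatenated source labels to the target label in lowercase or uppercase.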
In the next section, we will go through examples of each action. To summarize, using drop and labeldrop actions, specific series or labels can be excluded from metrics to prevent unnecessary high cardinality and optimize storage.
There is no dedicated add action. Instead, you can enrich your metrics with additional meaningful labels by using replace with a fixed replacement value. Applying the same rule across your scrape jobs keeps things consistent and gives more context to all metrics being ingested.
Applying replace actions in the relabeling rules ensures that data is not only consistent but also adapts to the specific needs of your monitoring and alerting strategies.
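Adding a fixed label is commonly done with a replace rule that has no source labels. The sketch below assumes a hypothetical team label with the value payments; because action defaults to replace and regex defaults to (.*), which also matches the empty input, the rule always fires.

```yaml
relabel_configs:
  - target_label: team      # hypothetical label name
    replacement: payments   # hypothetical fixed value
```

Placed under relabel_configs, the label is attached to every target of the job and therefore to every series scraped from those targets.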
Examples of Prometheus Relabel Action
1. Replace
The replace action writes the replacement value to the target label if the concatenated source label values match the regex.
Example:
- action: replace
  source_labels: [service]
  regex: (.*)
  target_label: environment
  replacement: production
This rule matches the current value of the service label against the regex. Because (.*) matches any value, including an empty one, the environment label is always set to production.
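A common variation uses a capture group to copy a value from one label to another, which is one way to standardize inconsistent label names. This sketch assumes some services expose a svc label while the rest use service; both names are illustrative.

```yaml
metric_relabel_configs:
  # Copy svc into service, but only when svc is non-empty.
  - source_labels: [svc]
    regex: "(.+)"
    target_label: service
    replacement: "$1"
  # Remove the old label once its value has been copied.
  - action: labeldrop
    regex: svc
```

Using (.+) rather than the default (.*) means series without a svc label keep whatever service value they already have.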
2. Keep
The keep action retains the targets or time series whose source label values match the specified regex, discarding all others.
Example:
- action: keep
  source_labels: [job]
  regex: notification-job
This will keep only the targets or time series whose job label is exactly notification-job, since the regex is anchored at both ends.
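In target relabeling, keep is often used to opt targets in to scraping. The sketch below relies on the widely used prometheus.io/scrape pod annotation, which is a convention rather than something Prometheus itself requires, so treat the annotation name as an assumption about your cluster.

```yaml
relabel_configs:
  # Scrape only pods annotated with prometheus.io/scrape: "true".
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
    regex: "true"
    action: keep
```

Pods without the annotation produce an empty input, which does not match, so they are dropped before any scrape takes place.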
3. Drop
The drop action removes the targets or time series whose source label values match the regex, keeping all others.
Example:
- action: drop
  source_labels: [status]
  regex: failure
Time series with a status label of exactly failure are dropped with this rule.
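A drop rule can also match on several labels at once, including the metric name through __name__. The following sketch reuses the http_requests_total metric from earlier and assumes a hypothetical /healthz handler whose series are not worth storing.

```yaml
metric_relabel_configs:
  # Inputs are joined with ";" and compared against the full regex.
  - source_labels: [__name__, handler]
    separator: ";"
    regex: "http_requests_total;/healthz"
    action: drop
```

Because this runs under metric_relabel_configs, the target is still scraped; only the matching series are discarded before ingestion.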
4. Labelmap
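The labelmap action copies labels to new names according to a regex pattern that is matched against label names rather than values.
Example:
- action: labelmap
  regex: __meta_kubernetes_pod_label_(.+)
This rule copies the value of every label matching __meta_kubernetes_pod_label_XXX to a label named XXX. Because __meta_ labels are discarded after target relabeling, the net effect is a rename.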
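The replacement field controls the new label name, which helps avoid collisions with labels that already exist. The k8s_ prefix in this sketch is an arbitrary choice for illustration.

```yaml
relabel_configs:
  # __meta_kubernetes_pod_label_app becomes k8s_app, and so on.
  - action: labelmap
    regex: __meta_kubernetes_pod_label_(.+)
    replacement: k8s_$1
```

Without a prefix, a pod label named job or instance would overwrite the label of the same name that Prometheus sets on the target.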
5. Hashmod
The hashmod action hashes the concatenated source label values, takes the result modulo modulus, and stores it in the target label.
Example:
- action: hashmod
  source_labels: [instance]
  target_label: instance_hash
  modulus: 100
This takes the instance label's value, applies a hash function, and takes the result modulo 100, storing a number from 0 to 99 in instance_hash.
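The usual reason to reach for hashmod is to split targets across several Prometheus servers. This sketch assumes four servers, with this one responsible for shard 0; both numbers are hypothetical, and each server would use a different regex value.

```yaml
relabel_configs:
  # Assign every target to one of four buckets based on its address.
  - source_labels: [__address__]
    modulus: 4
    target_label: __tmp_hash
    action: hashmod
  # Keep only the targets that fall into this server's bucket.
  - source_labels: [__tmp_hash]
    regex: "0"
    action: keep
```

Storing the intermediate value under the __tmp prefix keeps it out of the stored series, since labels starting with a double underscore are removed after target relabeling.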
6. Labeldrop & Labelkeep
The labeldrop and labelkeep actions allow you to remove or keep labels selectively. Their regex is matched against label names rather than label values.
Example - labeldrop:
- action: labeldrop
  regex: "(temporary|debug)_.*"
This removes any labels whose names start with temporary_ or debug_.
Example - labelkeep:
- action: labelkeep
  regex: wanted_label_.*
Conversely, this rule retains only labels whose names match the regex wanted_label_.*, discarding all others.
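Since labelkeep discards every label that does not match, the pattern usually has to list the labels a series needs in order to stay identifiable. The sketch below assumes you want to preserve the metric name together with job and instance; adjust the list to your own setup.

```yaml
metric_relabel_configs:
  - action: labelkeep
    regex: "__name__|job|instance|wanted_label_.*"
```

With either action, make sure the remaining labels still identify each series uniquely, otherwise several series can collapse into one.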
7. Keepequal and Dropequal
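The keepequal and dropequal actions keep or drop targets depending on whether the concatenated source label values are equal to the value of the target label. They take only source_labels and target_label; regex is not used.
Example - keepequal:
- action: keepequal
  source_labels: [__meta_kubernetes_pod_container_port_number]
  target_label: __meta_kubernetes_pod_annotation_prometheus_io_port
This rule retains only targets whose container port equals the port given in the pod's prometheus.io/port annotation (an illustrative annotation convention), dropping all others.
Example - dropequal:
- action: dropequal
  source_labels: [environment]
  target_label: excluded_environment
In this case, targets whose environment label has the same value as the excluded_environment label (an illustrative label name) will be dropped.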
8. Lowercase and Uppercase
The lowercase and uppercase actions, as their names suggest, write the source label values to the target label in lowercase or uppercase, respectively.
Example - lowercase:
- action: lowercase
  source_labels: [environment]
  target_label: environment_lower
In this rule, the environment_lower label will contain the lowercase value of the environment label.
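The target label may be the same as the source label, which normalizes a value in place instead of creating a second label. The label names region and environment in this sketch are illustrative.

```yaml
relabel_configs:
  # "Production" and "PRODUCTION" both become "production".
  - action: lowercase
    source_labels: [environment]
    target_label: environment
  # "eu-west-1" becomes "EU-WEST-1".
  - action: uppercase
    source_labels: [region]
    target_label: region
```

Normalizing in place keeps queries simple, because there is only one label to filter on.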
Leveraging Internal Labels and Metadata in Prometheus
Beyond the conventional labels, Prometheus introduces the concept of hidden labels and metadata, often prefixed with a double underscore (__). These special labels are instrumental in enriching your metrics and extracting valuable insights. Let's explore the significance of hidden labels, some examples of commonly used ones, and how they can be employed in relabeling actions.
Hidden labels, or metadata labels, serve as an internal mechanism within Prometheus. They don't appear in the final metric output but play a crucial role during the metric collection and relabeling processes. These labels can carry information about the target, the scrape job, or the system itself. They provide a way to tap into Prometheus's internal data management capabilities.
Examples of Internal Labels and Metadata
- __address__: This hidden label holds the target's address. For instance, it might contain the IP address or domain name of a target, together with its port.
- job: The job name associated with a target is stored in the regular job label rather than in a hidden __job__ label. It is taken from the job_name of the scrape configuration and helps identify which job is scraping the metrics.
- __metrics_path__: This label contains the metrics path configured for the target. It's useful when different targets expose metrics on distinct paths.
Here is a table with the internal labels and their description.
Label name | Description |
---|---|
__name__ | The scraped metric's name |
__address__ | host:port of the scrape target |
__scheme__ | URI scheme of the scrape target |
__metrics_path__ | Metrics endpoint of the scrape target |
__param_<name> | Value of the first URL parameter called <name> passed to the target |
__scrape_interval__ | The target's scrape interval (experimental) |
__scrape_timeout__ | The target's timeout (experimental) |
__meta_* | Special labels set by the Service Discovery mechanism |
__tmp | Special prefix used to temporarily store label values before discarding them |
Using Internal Labels in Relabeling Actions
Hidden labels can be handy when crafting relabeling rules:
Example 1 - Renaming Targets:
- action: labelmap
  regex: __meta_kubernetes_pod_label_(.+)
In this example, the rule renames hidden labels like __meta_kubernetes_pod_label_app to app, effectively incorporating Kubernetes labels into the metric as regular labels.
Example 2 - Dynamically Configuring Job Names:
- action: replace
  source_labels: [__meta_kubernetes_pod_label_component]
  target_label: job
This rule dynamically configures the job label based on the __meta_kubernetes_pod_label_component label, allowing you to organize metrics by components in a Kubernetes environment.
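Internal labels can also be written to, which changes how a target is scraped. The following sketch of a Kubernetes pod job rewrites __metrics_path__ and __address__ from pod annotations. The job name and the prometheus.io/path and prometheus.io/port annotations are assumptions based on a common convention, not something Prometheus mandates.

```yaml
scrape_configs:
  - job_name: kubernetes-pods           # hypothetical job name
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Use a custom metrics path when the pod declares one.
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        regex: "(.+)"
        target_label: __metrics_path__
      # Replace the port in the address with the annotated port.
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        regex: '([^:]+)(?::\d+)?;(\d+)'
        replacement: "$1:$2"
        target_label: __address__
      # Expose the namespace as a regular label.
      - source_labels: [__meta_kubernetes_namespace]
        target_label: namespace
```

Pods without the annotations are left with their discovered address and the default metrics path, because neither regex matches an empty annotation value.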
By leveraging these hidden labels and metadata within relabeling actions, you can extract and manipulate valuable information that might otherwise remain obscured, enhancing the precision and granularity of your monitoring setup.
Metric Relabeling and Cardinality
Cardinality refers to the number of unique time series, that is, the number of distinct combinations of metric name and label values. Prometheus relabeling must be managed carefully to prevent high cardinality, which can occur if too many unique label combinations are generated, potentially impacting performance.
Best Practices for Prometheus Relabeling
When delving into the world of Prometheus relabeling, it's crucial to follow best practices to ensure that you're not only getting the most out of this powerful feature but also avoiding common pitfalls that could affect the performance and reliability of your monitoring setup.
Some best practices for using Prometheus relabeling:
- Use Relabeling Sparingly: Only use relabeling when necessary. Excessive use of complex relabeling rules can make your configuration hard to understand and maintain.
- Check Impact on Scrape Performance: Metric relabeling runs on every scraped sample, so large or complex rule sets add to the time it takes to process each scrape. Monitor your scrape durations and adjust your relabeling configurations if necessary.
- Consistent Label Values: Where possible, standardize label values across different targets and jobs to ensure consistency in queries and dashboards.
- Dry Run Changes: Before applying new relabeling rules to your production environment, test them in a staging environment to verify their effects.
- Use promtool to Check Configurations: Utilize Prometheus's promtool to check your configuration files for errors or inconsistencies.
- Document Your Relabeling Rules: Maintain clear documentation for your relabeling rules, explaining the purpose behind each rule and its expected outcome.
- Descriptive Label Names: Ensure label names are descriptive and reflect the label's purpose to anyone reading the configuration.
- Evolve Rules Gradually: When the need arises to change labeling strategies, do so gradually. It's often better to add new labels alongside the old ones, transition over time, and then remove the old labels once you're sure they're no longer needed.
Conclusion
Prometheus relabeling is an advanced feature that, when mastered, provides unparalleled control over your monitoring environment. By understanding and applying relabel_configs, teams can tailor Prometheus to their unique monitoring requirements.