Updated: 15-Apr-2025.
This blog post explores the four core metric types supported by Prometheus, highlighting their use cases and the PromQL functions you can use to query them.
In Prometheus, metrics represent one or more time series, each consisting of a metric name, a set of labels, and a series of data points (with timestamps and values). Time series data is essentially a collection of data points indexed by time, which is crucial for monitoring system performance.
At its core, a metric is a quantifiable measure used to track and assess the status of a specific process or activity. Prometheus, as an open-source monitoring solution, offers a powerful data model for storing and querying this metric data.
Metrics Structure in Prometheus:
Before jumping into metric types, it's important to understand how Prometheus handles metric data.
The Prometheus data model is designed to efficiently store time-series data, with each data point containing sample observations from your systems.
The structure of a metric typically includes the following key components:
- Metric Name: An explicit identifier for the metric, often reflecting what it measures. For example, http_requests_total.
- Labels: Key-value pairs that provide additional dimensions to the metric, enabling more detailed and specific tracking. An example label set for http_requests_total could be {method="GET", endpoint="/api"}.
- Metric Value: The actual data point representing the measurement, which could be a count, a duration, etc.
- Timestamp: The point in time when the metric value was recorded (often added automatically by the monitoring system).
Consider a metric named user_logins_total. This metric could have labels like {user_type="admin", location="EU"} and a numerical value indicating the total count of logins. The timestamp would denote when this count was recorded.
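In the Prometheus exposition format, a sample for this hypothetical metric would look like the following (the value shown is illustrative):

# HELP user_logins_total Total number of user logins
# TYPE user_logins_total counter
user_logins_total{user_type="admin", location="EU"} 1027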
Types of Prometheus Metrics
Prometheus, through its various client libraries including Python, Go, and Java clients, primarily deals with four types of metrics:
- Counter: A metric that only increases or resets to zero on restart. Ideal for tracking the number of requests, tasks completed, or errors.
- Gauge: This represents a value that can go up or down, like temperature or current memory usage.
- Histogram: Measures the distribution of events over time, such as request latencies or response sizes.
- Summary: Similar to histograms but provides a total count and sum of observed values.
Let's dive deeper into each of these Prometheus metric types.
Counters
A counter metric is a cumulative metric used primarily for tracking quantities that increase over time. Said simply, a counter... counts!
What are Counters Used For?
Counters are ideal for monitoring the rate of events, like:
- Total number of HTTP requests to a web server
- Task completions
- Error occurrences
A counter is designed to only increase, which means its value should never decrease (except when reset to zero, usually due to a restart or reset of the process generating it).
Visualizing and Alerting with Counters
Counters are often visualized in dashboards to show trends over time, like the rate of requests to a web service. They can trigger alerts if the rate of errors or specific events exceeds a threshold, indicating potential issues in the monitored system.
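As a minimal sketch, an alerting rule built on a counter's rate might look like this (the metric name http_requests_errors_total and the threshold are hypothetical):

groups:
  - name: counter-alerts
    rules:
      - alert: HighErrorRate
        # Fire if the per-second error rate stays above 5 for 10 minutes
        expr: rate(http_requests_errors_total[5m]) > 5
        for: 10m
        labels:
          severity: warning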
Example: node_network_receive_bytes_total in Node Exporter is a counter that tracks the total number of bytes received on a network interface.
Working with Counters in PromQL
Several PromQL functions are commonly used with counters:
- rate(): Calculates a metric's per-second average rate of increase over a given time interval
- increase(): Calculates the cumulative increase of a metric over a given time range
- resets(): Returns the number of times a counter has been reset during a given period
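For example, applied to the http_requests_total counter from earlier:

rate(http_requests_total[5m])       # per-second request rate, averaged over the last 5 minutes
increase(http_requests_total[1h])   # total number of requests in the last hour
resets(http_requests_total[1d])     # how many times the counter reset in the last day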
Counter Reset Behavior
There are scenarios where a counter can reset. The most common reason is when the process generating the metric restarts due to a service restart, deployment, or system reboot. When this happens, the counter starts from zero again.
💡 An upcoming feature in Prometheus adds created timestamp metrics to solve the long-standing issues with counter resets. See the talk from PromCon 2023.
This reset behavior is crucial for interpreting counter data correctly. Functions like rate() and increase() in PromQL are designed to account for counter resets by detecting when the counter value decreases between scrape intervals. For example, if successive samples are 100, 150, and then 10 after a restart, these functions treat the drop as a reset and count the final step as an increase of 10 rather than a decrease of 140.
Counter Implementation Example
// Go example (assumes: import "github.com/prometheus/client_golang/prometheus")
requestCounter := prometheus.NewCounter(prometheus.CounterOpts{
Name: "http_requests_total",
Help: "Total number of HTTP requests",
})
prometheus.MustRegister(requestCounter)
// Increment the counter
requestCounter.Inc()
Gauges
Gauges represent a metric that can increase or decrease, akin to a thermometer. They give a snapshot of a system's state at a specific point in time.
What are Gauges Used For?
Gauges are versatile and can measure values like:
- Memory usage
- Temperature
- Queue sizes
- CPU utilization
- Current connections
Working with Gauges
Gauges are straightforward in terms of updating their value. They can be:
- Set to a particular value at any given time
- Incremented
- Decremented
This flexibility makes them ideal for tracking metrics that fluctuate up and down.
Visualizing Gauges
Gauges are often visualized using line graphs in dashboards to depict their value changes over time. They are useful for observing the current state and trends of what's being measured rather than the rate of change.
Example: From the JMX Exporter, which is used for Java applications, a gauge might be employed to monitor the number of active threads in a JVM, exposed as jvm_threads_current.
Analyzing Gauge Metrics with PromQL
When working with gauges, specific functions are typically used to calculate statistical measures over a time series:
- avg_over_time(): computes the average over the period
- max_over_time(): finds the maximum value
- min_over_time(): finds the minimum value
- quantile_over_time(): determines percentiles within the specified period
- delta(): computes the difference in the gauge value over the time range
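For example, against the jvm_threads_current gauge mentioned above:

avg_over_time(jvm_threads_current[10m])   # average thread count over the last 10 minutes
max_over_time(jvm_threads_current[1h])    # peak thread count in the last hour
delta(jvm_threads_current[1h])            # net change in thread count over the last hour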
These functions are instrumental in analyzing the trends and variations of gauge metrics, providing valuable insights into the performance and state of monitored systems.
Gauge Implementation Example
// Go example (assumes: import "github.com/prometheus/client_golang/prometheus")
connectionGauge := prometheus.NewGauge(prometheus.GaugeOpts{
Name: "active_connections",
Help: "Number of active connections",
})
prometheus.MustRegister(connectionGauge)
// Set or modify the gauge
connectionGauge.Set(10)
connectionGauge.Inc()
connectionGauge.Dec()
Histograms
Histograms are used to sample and aggregate distributions, such as latencies. They use configurable buckets to sort measurements into predefined ranges, which can be adjusted based on your monitoring needs. Histograms are excellent for understanding the distribution of metric values and helpful in performance analysis, like tracking request latencies or response sizes.
How Histograms Work
Histograms efficiently categorize measurement data into defined intervals, known as buckets, and tally the number (i.e., a counter) of measurements that fit into each of these buckets. These buckets are pre-defined during the instrumentation stage.
A key thing to note in the Prometheus Histogram type is that the buckets are cumulative. This means each bucket counts all values less than or equal to its upper bound, providing a cumulative distribution of the data. Simply put, each bucket contains the counts of all prior buckets.
Example: Observing Response Times
Let's take an example of observing response times with buckets — We could classify request times into meaningful time buckets like:
- 0 to 200ms - le="200" (less than or equal to 200)
- 200ms to 300ms - le="300" (less than or equal to 300)
- … and so on
- Prometheus also adds a +Inf bucket by default
Let's say our API's observed response time is 175ms; the bucket counts will look something like this:
Bucket | Count |
---|---|
0 - 200 | 1 |
0 - 300 | 1 |
0 - 500 | 1 |
0 - 1000 | 1 |
0 - +Inf | 1 |
Here, you can see how the cumulative nature of the histogram works.
Now suppose the next observed response time is 300ms; the counts become:
Bucket | Count |
---|---|
0 - 200 | 1 |
0 - 300 | 2 |
0 - 500 | 2 |
0 - 1000 | 2 |
0 - +Inf | 2 |
Histogram Metric Structure
To query a histogram-type metric properly, it is essential to understand its structure.
Each bucket is exposed as a counter, which can be accessed by adding the _bucket suffix and the le label. The _count and _sum series are generated by default to help with quantile and average calculations.
- _count is a counter with the total number of observations for the metric.
- _sum is a counter with the total (or the sum) of all observed values.
For Example:
- http_request_duration_seconds_sum{host="example.last9.io"} 9754.113
- http_request_duration_seconds_count{host="example.last9.io"} 6745
- http_request_duration_seconds_bucket{host="example.last9.io", le="200"} 300
- http_request_duration_seconds_bucket{host="example.last9.io", le="300"} 424
- ...
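The _sum and _count series also make averages straightforward; for example, the average request duration over the last five minutes:

rate(http_request_duration_seconds_sum[5m]) / rate(http_request_duration_seconds_count[5m])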
Working with Histograms in PromQL
The histogram_quantile() function calculates quantiles (e.g., medians, 95th percentiles) from histograms. It takes a quantile (a value between 0 and 1) and a histogram metric as arguments and computes the estimated value at that quantile across the histogram's buckets.
For instance, histogram_quantile(0.95, metric_name_here) estimates the value below which 95% of the observations in the histogram fall, providing insights into distribution tails like request latencies.
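In practice, the quantile is usually computed over the per-second rate of the buckets so that it reflects recent behavior, for example:

histogram_quantile(0.95, sum by (le) (rate(http_request_duration_seconds_bucket[5m])))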
Aggregating Histograms
The histogram data type can also be aggregated, i.e., combining multiple histograms into a single histogram. Suppose you're monitoring response times across different servers.
Each server emits a histogram of response times. You would aggregate these individual histograms to understand the overall response time distribution across all servers. This aggregation is done by summing up the counts in corresponding buckets across all histograms.
For example, you could use a PromQL query like this:
sum by (le) (rate(http_request_duration_seconds_bucket{endpoint="payment"}[5m]))
In this example, the sum by (le) part aggregates the counts in each bucket (le label) across all instances of the endpoint labeled "payment". The rate function is applied over a 5-minute interval ([5m]), calculating the per-second rate of increase for each bucket, which is helpful for histograms derived from counters. This query gives a unified view of the request duration distribution across all servers for the specified endpoint.
Histogram Implementation Example
// Go example (assumes: import "github.com/prometheus/client_golang/prometheus")
requestDuration := prometheus.NewHistogram(prometheus.HistogramOpts{
Name: "http_request_duration_seconds",
Help: "HTTP request duration distribution",
Buckets: prometheus.LinearBuckets(0.1, 0.1, 10), // 10 buckets, starting at 0.1, width of 0.1
})
prometheus.MustRegister(requestDuration)
// Observe a value
requestDuration.Observe(0.42)
Native Histograms
Starting with Prometheus version 2.40, an experimental feature adds support for native histograms. A native histogram needs only a single time series, which includes a variable number of buckets along with the sum and count of observations. This offers significantly higher resolution while being more cost-effective.
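As a hedged sketch, here is how you might opt into native histograms with client_golang; this assumes a recent client_golang release (which provides the NativeHistogramBucketFactor option) and a Prometheus server started with --enable-feature=native-histograms:

// Go example (assumes: import "github.com/prometheus/client_golang/prometheus")
nativeDuration := prometheus.NewHistogram(prometheus.HistogramOpts{
	Name: "http_request_duration_seconds",
	Help: "HTTP request duration distribution",
	// Setting a bucket factor enables native histogram collection;
	// 1.1 means neighboring buckets differ by roughly 10%
	NativeHistogramBucketFactor: 1.1,
})
prometheus.MustRegister(nativeDuration)
nativeDuration.Observe(0.42)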
Summaries
Summaries track the size and number of events, commonly used to calculate percentiles like the 99th percentile for latency monitoring. The total sum and count are automatically maintained for each summary metric.
What are Summaries Used For?
Summaries are ideal for calculating quantiles and averages. They are used for metrics where aggregating over time and space is essential, like request latency or transaction duration.
How Summaries Work
A summary metric automatically calculates and stores quantiles (e.g., 50th, 90th, 95th percentiles) over a sliding time window. This means it tracks both the number of observations (like requests) and their sizes (like latency), and then computes the quantiles of these observations in real-time.
A Prometheus summary typically consists of three parts:
- The count of observed events (_count)
- The sum of these events' values (_sum)
- The calculated quantiles
Example of Summary Metrics
# HELP http_request_duration_seconds The duration of HTTP requests in seconds
# TYPE http_request_duration_seconds summary
http_request_duration_seconds{quantile="0.5"} 0.055
http_request_duration_seconds{quantile="0.9"} 0.098
http_request_duration_seconds{quantile="0.95"} 0.108
http_request_duration_seconds{quantile="0.99"} 0.15
http_request_duration_seconds_sum 600
http_request_duration_seconds_count 10000
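Because the quantiles are precomputed on the client, querying them is simply a matter of selecting the series. The average still comes from _sum and _count; in the example above, 600 s / 10000 requests = 60 ms per request.

http_request_duration_seconds{quantile="0.99"}   # the precomputed 99th percentile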
Summaries vs. Histograms
Summaries are better suited when you need accurate quantiles for individual instances or components and don't intend to aggregate these quantiles across different dimensions or labels.
Histograms, by contrast, are helpful when aggregating data across multiple instances or dimensions, such as calculating global request latency across several servers.
Limitations of Summaries
A significant limitation of summaries is that you cannot aggregate their quantiles across multiple instances. While you can sum the counts and sums, the quantiles are only meaningful within the context of a single summary instance.
Resource Considerations
Summaries can be more resource-intensive since they compute quantiles on the fly and keep a sliding window of observations. Histograms can be more efficient regarding memory and CPU usage, especially when dealing with high-cardinality data. Since the bucket configuration is fixed, they can also be optimized for storage.
Summary Implementation Example
// Go example (assumes: import "github.com/prometheus/client_golang/prometheus")
requestLatency := prometheus.NewSummary(prometheus.SummaryOpts{
Name: "http_request_processing_seconds",
Help: "Request processing time",
Objectives: map[float64]float64{0.5: 0.05, 0.9: 0.01, 0.99: 0.001}, // target quantile -> allowed absolute error
})
prometheus.MustRegister(requestLatency)
// Observe a value
requestLatency.Observe(0.27)
Performance and Cardinality Considerations
When working with Prometheus metrics, understanding performance implications and cardinality concerns is crucial for maintaining a healthy monitoring system.
Understanding Metric Cardinality
Cardinality refers to the number of unique time series in your Prometheus database. Each unique combination of metric name and label values creates a separate time series. High cardinality can lead to:
- Increased storage requirements
- Slower query performance
- Higher memory usage in Prometheus servers
- Potential system instability
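One way to spot cardinality offenders is to count series per metric name directly in PromQL; note that this query touches every series, so it can itself be expensive on large installations:

topk(10, count by (__name__)({__name__=~".+"}))   # ten metric names with the most series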
Cardinality Impact by Metric Type
Different metric types have varying impacts on cardinality:
- Counters and Gauges: Generally have the lowest cardinality impact since they create only one time series per unique label combination.
- Histograms: Can significantly increase cardinality due to the creation of multiple time series:
- One time series for each bucket (number of buckets × number of label combinations)
- Additional time series for _count and _sum
- For example, a histogram with 10 configured buckets and 5 label combinations creates 65 time series: (10 + 1 for +Inf) buckets × 5 combinations = 55 bucket series, plus 5 _count and 5 _sum series
- Summaries: Generate fewer time series than comparable histograms since they don't use buckets, but still create multiple:
- One time series for each configured quantile
- Additional time series for _count and _sum
Best Practices for Managing Cardinality
- Label Usage:
- Avoid high-cardinality labels like user IDs, session IDs, or timestamps
- Use label values with bounded cardinality (e.g., HTTP status codes instead of exact error messages)
- Consolidate similar label values when precise granularity isn't needed
- Histogram Configuration:
- Choose bucket counts and ranges carefully
- More buckets = higher resolution but increased cardinality
- Focus on ranges that matter most for your use case
- Regular Monitoring:
- Monitor the total number of time series in your Prometheus instance
- Watch for unexpected cardinality increases
- Use the prometheus_tsdb_head_series metric to track active series
- Aggregation and Recording Rules:
- Use recording rules to pre-compute and store frequently accessed aggregations
- This reduces query-time resource usage and can mitigate some cardinality issues
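As a minimal sketch, a recording rule that pre-computes the 95th-percentile latency per job might look like this (the rule name and metric are illustrative):

groups:
  - name: latency-recording-rules
    rules:
      - record: job:http_request_duration_seconds:p95
        expr: histogram_quantile(0.95, sum by (le, job) (rate(http_request_duration_seconds_bucket[5m])))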
Resource Usage Considerations
- Storage growth is directly proportional to the number of active time series
- Memory requirements increase with the number of active series
- Query performance degrades as cardinality increases
- Network bandwidth between Prometheus and exporters increases with more time series
Balancing Detail and Performance
The key to effective Prometheus monitoring is finding the right balance between:
- Detailed metrics that provide valuable insights
- Controlled cardinality that maintains system performance
For most applications, being selective about labels and thoughtful about metric types will help maintain this balance.
Visualization and Integration
While Prometheus provides powerful querying capabilities through PromQL, many organizations use Grafana as their primary visualization tool for Prometheus metrics.
Grafana offers rich dashboarding capabilities and seamless integration with Prometheus data sources. You can also use Last9 to explore these metrics through user-friendly navigation and dashboards.

Summing up
The fundamentals we've covered in this post around Prometheus metric types should help you better understand and reason about your monitoring setup.
In previous posts, we have covered the fundamentals of Prometheus Monitoring and Prometheus Cardinality.
If you or your team is looking to get started with Prometheus, consider hosted and managed Prometheus offerings that can help eliminate cardinality and long-term storage woes while significantly reducing your monitoring costs.
Related Reading

Prometheus Query Language Developer Guide
Additional Resources
- For more detailed information about implementing these metric types, refer to the official Prometheus docs and client-side documentation for your preferred programming language.
- The Prometheus client libraries provide comprehensive examples for different metric implementations across various programming languages like Python, Go, and Java.