This blog post explores the four key metric types supported by Prometheus, highlighting their use cases, underlying behaviors, and the PromQL functions you can use to query them effectively.
What Are Metrics in Prometheus?
In Prometheus, a metric is a data point that helps you track a specific aspect of your system's behavior. These metrics are represented as one or more time series — meaning they're collected and recorded over intervals of time.
Each metric consists of:
- A metric name: like `http_requests_total`, which describes what's being measured.
- Labels: key-value pairs that add dimensions to the metric, such as `{method="POST", status="500"}`.
- Values: the numerical measurements themselves.
- Timestamps: When the measurement was recorded (usually automatically appended).
For example, a metric like `user_logins_total{user_type="admin", location="EU"}` could tell you the total number of logins by EU-based admins, say `193`, as of the latest scrape.
Metrics in Prometheus are pull-based: the Prometheus server scrapes metrics from instrumented targets at configured intervals. This improves reliability and scalability, since Prometheus controls when and how often each target is scraped rather than having targets push data at it.
Prometheus stores all scraped data in a time series format, where each unique combination of metric name and label set forms a new series. This model allows fine-grained visibility but must be used thoughtfully to avoid high cardinality issues that can cripple performance.
If you're new to PromQL, the `rate()` function is a great place to start.
The Prometheus Metric Structure
Prometheus metrics are built to be flexible and high-performing for time series analysis. Here's what makes up a Prometheus metric:
- Metric Name – Acts as the primary identifier.
- Labels – These are critical for filtering and querying. You can think of them as database tags. Labels are what allow you to group and filter your data based on specific attributes.
- Metric Value – This is what you’re measuring, whether it’s a count, a duration, or some gauge value.
- Timestamp – Prometheus automatically assigns one during scraping.
This model allows you to slice and dice data across many dimensions, enabling complex analysis with simple expressions. It's also what contributes to the issue of high cardinality if not managed properly.
Metrics should be structured with stability in mind — values that change frequently, such as unique request IDs or timestamps, should never be used as label values. Otherwise, you’ll end up creating thousands or millions of unique time series that add little value and strain your infrastructure.
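To make the cardinality math concrete, here's a small Python sketch (the label sets are hypothetical): the number of series a metric produces is the product of the number of values each label can take, which is why one unbounded label multiplies everything else.

```python
def series_count(label_values):
    """A metric produces one time series per unique combination of label values."""
    n = 1
    for values in label_values.values():
        n *= len(values)
    return n

# Bounded labels keep cardinality manageable.
safe = {"method": ["GET", "POST", "PUT", "DELETE"], "status": ["2xx", "4xx", "5xx"]}
# Adding an unbounded label such as a per-user ID multiplies every other dimension.
risky = dict(safe, user_id=[f"u{i}" for i in range(10_000)])

print(series_count(safe))   # 12
print(series_count(risky))  # 120000
```

Twelve series is harmless; 120,000 from a single metric name is the kind of explosion the warning above is about.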
A Deeper Look at Prometheus Metric Types
Prometheus supports four major metric types, each with different strengths and ideal use cases:

- Counter – Measures totals that can only increase. Ideal for counting things like the number of HTTP requests or errors.
- Gauge – Can increase or decrease. Useful for capturing current state values like memory usage or temperature.
- Histogram – Captures the distribution of values over time using fixed buckets. Great for understanding latencies and size distributions.
- Summary – Calculates and reports quantiles over a sliding time window. Best used for precise quantile calculations on individual instances.
Each metric type provides different levels of insight and trade-offs, especially when it comes to aggregation, storage overhead, and precision.
To get the most out of Prometheus, it’s important to understand the implications of each type—not just in terms of functionality, but also the performance and cost impact they can have on your observability stack.
Counter Metrics: The Foundation of Event Counting
Counters are the simplest and most commonly used metric type. They represent a value that only increases over time. If the process restarts, the counter resets to zero.
When to use counters:
- Counting the total number of requests served
- Tracking completed background jobs
- Measuring errors or failures in an application
What to watch out for:
- Since counters reset when an app restarts, functions like `rate()` and `increase()` are designed to detect these drops and adjust accordingly.
- If a counter suddenly decreases outside a reset scenario, that's a red flag—perhaps you're using the wrong metric type.
PromQL Patterns:
- `rate(http_requests_total[5m])`: the per-second rate of HTTP requests over the last 5 minutes
- `increase(errors_total[1h])`: the total number of errors over the last hour
Implementation Tip:
Always use counters for metrics that represent something accumulating over time—think of logs, not states.
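To make the reset behavior concrete, here's a minimal Python sketch of the idea behind `increase()`: when a scraped counter value drops, assume the process restarted and count the new value from zero. (Real Prometheus also extrapolates to the window boundaries, which this sketch omits.)

```python
def increase(samples):
    """Total increase of a counter over a window, compensating for resets.

    `samples` is a list of scraped counter values in time order. A value
    lower than its predecessor is treated as a restart from zero, so the
    new value is counted in full.
    """
    total = 0.0
    for prev, cur in zip(samples, samples[1:]):
        total += cur - prev if cur >= prev else cur
    return total

# 100 -> 150 (+50), process restarts, counter climbs back to 30 (+30)
print(increase([100, 150, 30]))  # 80.0
```

This is why a restart in the middle of a window doesn't make your request rate go negative.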
Gauge Metrics: Capturing Values That Fluctuate
Gauges are ideal for tracking values that go up and down. They offer a snapshot of your system’s state at a particular moment.
Typical use cases:
- Current memory usage
- Queue sizes
- Active users or sessions
PromQL Patterns:
- `avg_over_time(cpu_usage[5m])`: average CPU usage over the last 5 minutes
- `delta(queue_length[10m])`: net change in queue length over 10 minutes
Good to know:
- Gauges can be misleading if you're relying on one-off values. Instead, use them with `*_over_time()` functions for better trend analysis.
- Since gauges can decrease, they're useful for alerting on regression-like patterns (e.g., disk space dropping).
Instrumentation Caution:
If you expose a gauge from your app but forget to update it, Prometheus will keep scraping the stale value—often the initial zero. Always ensure gauges reflect the current value reliably.
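The windowed functions above are easy to see in a toy sketch. Assuming a list of scraped gauge samples, these simplified Python versions of `avg_over_time()` and `delta()` show why a window gives a steadier signal than any single sample:

```python
def avg_over_time(samples):
    """Mean of the gauge samples in the window (like PromQL avg_over_time())."""
    return sum(samples) / len(samples)

def delta(samples):
    """Net change across the window (like PromQL delta()); can be negative."""
    return samples[-1] - samples[0]

queue_length = [12, 18, 9, 7]       # gauge values scraped over 10 minutes
print(avg_over_time(queue_length))  # 11.5
print(delta(queue_length))          # -5: the queue shrank
```

A single scrape would have reported 18 or 7 depending on timing; the window smooths that out.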
Histogram Metrics: Understanding the Distribution of Observations
Histograms break down the values of an observation (like request latency) into buckets. Each bucket counts how many observations fall within its range. This lets you analyze value distributions.

Use Histograms when you need to:
- Measure request latency in buckets
- Visualize response size distribution
- Calculate percentiles over aggregated instances
Example: If you define buckets as `[0.1, 0.3, 1, 5]`, then values are counted into these ranges:
- <= 0.1
- > 0.1 and <= 0.3
- > 0.3 and <= 1
- > 1 and <= 5
Prometheus also adds a special `+Inf` bucket that captures all values greater than your last defined bound.
PromQL Patterns:
- `sum(rate(http_request_duration_seconds_bucket[5m])) by (le)`: gives per-bucket request rates
- `histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket[5m])) by (le))`: estimates the 95th percentile
Why use Histograms over Summaries?
- Histograms are aggregatable across instances and labels
- Useful for alerting on latency thresholds (e.g., “5% of traffic > 500ms”)
Downsides:
- Can cause time series explosion if too many buckets or labels are used
- Lack precision compared to summaries for quantiles
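To see why histogram quantiles are estimates rather than exact values, here's a simplified Python version of the interpolation `histogram_quantile()` performs over cumulative bucket counts (the counts below are made up for illustration):

```python
def histogram_quantile(q, buckets):
    """Estimate a quantile from cumulative buckets, as histogram_quantile() does.

    `buckets` maps upper bound (`le`) -> cumulative count and must include
    float('inf'). The estimate interpolates linearly inside the bucket that
    contains the target rank, so accuracy depends entirely on bucket layout.
    """
    bounds = sorted(buckets)
    total = buckets[bounds[-1]]
    rank = q * total
    lower, lower_count = 0.0, 0
    for le in bounds:
        count = buckets[le]
        if count >= rank:
            if le == float("inf"):
                return lower  # can't interpolate into the +Inf bucket
            return lower + (le - lower) * (rank - lower_count) / (count - lower_count)
        lower, lower_count = le, count

# Cumulative counts for buckets [0.1, 0.3, 1, 5, +Inf]
buckets = {0.1: 40, 0.3: 70, 1.0: 90, 5.0: 99, float("inf"): 100}
print(histogram_quantile(0.95, buckets))  # ~3.22: interpolated inside the 1..5 bucket
```

Note how the p95 lands somewhere between 1 and 5 seconds: the wide top bucket is exactly the precision loss the downside above refers to.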
Summary Metrics: Real-Time Percentile Estimation
Summaries calculate quantiles (e.g., p90, p95) in your application before exporting the metric to Prometheus.
Use Summaries when:
- You want precise local percentiles for a service
- You don't need to aggregate data across instances
Structure:
- `_count` and `_sum` series accompany every summary
- One series is created per configured quantile (e.g., `{quantile="0.95"}`)
Limitations:
- Not aggregatable across multiple targets (quantiles lose meaning)
- CPU and memory-intensive due to sliding windows and in-process calculations
Alternatives:
- If you need percentiles across your cluster, stick with histograms.
- Use summaries for things like specific endpoints or job timings where precision matters.
Best practice:
Use both histograms and summaries in different places depending on whether you need local accuracy (summary) or global aggregation (histogram).
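The trade-off is easy to see in a toy sketch. This hypothetical `SlidingSummary` class stores recent observations inside the application process and computes a nearest-rank quantile on demand; real client libraries use streaming quantile algorithms over time-bucketed windows, but the point stands: the CPU and memory cost lives in your app, and the resulting quantile values can't be meaningfully aggregated later.

```python
from collections import deque

class SlidingSummary:
    """Toy client-side summary: keeps recent observations, reports quantiles."""

    def __init__(self, max_samples=1000):
        self.window = deque(maxlen=max_samples)
        self.count = 0      # exported as the _count series
        self.total = 0.0    # exported as the _sum series

    def observe(self, value):
        self.window.append(value)
        self.count += 1
        self.total += value

    def quantile(self, q):
        # Nearest-rank over the retained window; real libraries stream this.
        ordered = sorted(self.window)
        idx = min(int(q * len(ordered)), len(ordered) - 1)
        return ordered[idx]

s = SlidingSummary()
for ms in range(1, 101):      # observe latencies of 1..100 ms
    s.observe(float(ms))
print(s.quantile(0.95))       # 96.0 with this nearest-rank sketch
print(s.count, s.total)       # 100 5050.0 -> the _count and _sum series
```

Averaging two instances' p95 values computed this way gives a meaningless number, which is exactly why summaries don't aggregate.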
Summary vs Histogram: Choosing the Right Metric for the Job
| Scenario | Use Summary | Use Histogram |
|---|---|---|
| Need p99 on a single instance | Yes | No |
| Aggregating across instances | No | Yes |
| Reducing time series count | Yes | No |
| Grafana-friendly trends | Limited | Recommended |
| Dynamic quantile calculation | No | Yes |
Performance & Cardinality: Managing Prometheus at Scale
Cardinality is the silent killer in Prometheus. The more unique time series you create, the harder it becomes to store, query, and alert efficiently.
Tips to stay safe:
- Limit the use of high-cardinality labels like `user_id`, `session_id`, or full URL paths
- Use `topk()` and `count_values()` to find problematic labels
- Monitor `prometheus_tsdb_head_series` to track active series
- Use relabeling to drop unnecessary labels before ingestion
By metric type:
- Counters & Gauges: Low cardinality risk
- Summaries: Moderate risk due to per-quantile series
- Histograms: Highest risk due to per-bucket series plus `_count` and `_sum`
Visualization Tools: Making Metrics Easier to Understand
While Prometheus is the engine, you need a dashboard to see what’s going on.
Best practices:
- Use Grafana or Last9’s Telemetry Warehouse
- Dashboards should reflect golden signals: latency, traffic, errors, and saturation
- Avoid raw numbers—use functions like `rate()` and `avg_over_time()`
- Document custom metrics and label conventions
Bonus tip:
Use panel repeat features in Grafana to show per-label breakdowns (e.g., per endpoint, region, or service).

Prometheus Metric Types Cheatsheet
| Metric Type | What It Measures | Can Decrease? | Aggregatable Across Instances? | Use Cases | Common PromQL Functions |
|---|---|---|---|---|---|
| Counter | Cumulative total | No | Yes | Errors, requests, job runs | `rate()`, `increase()` |
| Gauge | Current value | Yes | Yes | Memory usage, CPU load | `avg_over_time()`, `delta()` |
| Histogram | Value distribution (bucketed) | No | Yes | Latency, response size | `sum() by (le)`, `histogram_quantile()` |
| Summary | Sliding-window quantiles | No | No | Local p95/p99 latency | `metric{quantile="0.99"}` |
Quick Tips:
- Use Counters for ever-increasing values.
- Use Gauges for current state and fluctuating data.
- Use Histograms for aggregation and latency distributions.
- Use Summaries for fine-grained, local performance metrics.
Choosing the Right Metrics for Your System
Not every metric needs to be a histogram. Not every error needs a counter. Choose based on what questions you want to answer and how often you’ll ask them.
The best observability setup isn't the one with the most metrics—it's the one that helps you debug, optimize, and understand your system without blowing up your bill.
Want to explore more? Try these PromQL tricks or dig into common pitfalls in Prometheus.
FAQs
What are the types of metrics in Prometheus?
Prometheus supports four types of metrics: Counter, Gauge, Histogram, and Summary. These types of Prometheus metrics form the backbone of its data model, each offering unique capabilities for tracking different system behaviors.
What are the metrics of Prometheus scale?
Prometheus scales by scraping time series data based on a label-rich model. Its scalability depends heavily on cardinality—the number of unique combinations of labels. Histograms with too many histogram buckets or highly dynamic labels can significantly strain a Prometheus monitoring system.
What metrics does Prometheus monitor?
Prometheus can monitor any metric your API, service, or infrastructure exposes using supported Prometheus client libraries. These include metrics about CPU usage, memory consumption, API response times, Kubernetes pod states, and business events like the total number of HTTP requests or transactions.
What is the format of Prometheus metrics?
Prometheus metrics follow a plaintext exposition format over HTTP. Each line includes a metric name, an optional label set, and a value.
Example: `http_requests_total{method="GET", code="200"} 1027`
This format supports easy parsing and integration with open-source monitoring tools.
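As a rough sketch of how simple that format is to produce, the hypothetical helper below renders one sample line; it skips the `# HELP`/`# TYPE` comments and label-value escaping that real exporters also handle:

```python
def exposition_line(name, labels, value):
    """Render one sample in the Prometheus text exposition format.

    Simplified sketch: no HELP/TYPE metadata and no escaping of special
    characters inside label values.
    """
    if labels:
        body = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        return f"{name}{{{body}}} {value}"
    return f"{name} {value}"

line = exposition_line("http_requests_total", {"method": "GET", "code": "200"}, 1027)
print(line)  # http_requests_total{code="200",method="GET"} 1027
```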
What Are Prometheus Metrics?
Prometheus metrics are structured time series built on a flexible data model. Each metric includes a name, labels, a sample value (observation), and a timestamp. These samples are collected at intervals and analyzed using PromQL.
What are some use cases for gauges?
- Memory usage in Kubernetes containers
- Number of API connections or sessions
- Queue length or disk space remaining
Gauges are ideal for monitoring system values that fluctuate over time.
When to use histograms?
Use a Prometheus histogram when you need to analyze value distributions or percentiles, such as:
- API response times bucketed into latency ranges
- Observing the total sum and count of observations over time
- Configurable buckets that help track thresholds (e.g., 99th percentile)
What would be the ideal metric type for that job?
- Counting the number of events or HTTP requests: Counter
- Monitoring real-time resource usage: Gauge
- Tracking response times across ranges: Histogram
- Getting client-side quantiles like the 99th percentile: Summary
Why use Summary metrics?
Summary metrics are computed on the client side. They’re ideal when you want accurate percentiles, like 99th percentile latency, on a per-instance basis. Use them when aggregation isn't needed but high-resolution insight is.
How do you create custom metrics in Prometheus?
To create custom metric data, use Prometheus client libraries (Go, Python, Java, etc.) to define and expose metrics from your service. These metrics can then be scraped by Prometheus for analysis, alerting, and dashboards. Include labels thoughtfully to avoid high-cardinality issues.
How does the Prometheus counter metric type work?
A counter represents a metric that increases monotonically. It tracks the number of events like API hits, background job executions, or errors. If the process restarts, the counter resets to zero, but functions like `rate()` account for this.
How do Counter metrics work in Prometheus?
Counter metrics represent a running total of event occurrences. Prometheus scrapes this value at regular intervals, and PromQL queries can compute the rate of change or increase over time, even when counters reset between samples.
How do you define a new metric type in Prometheus?
You can't define entirely new metric types beyond the standard four. However, you can define custom metrics using existing types through Prometheus client libraries. These libraries let you set up counters, gauges, histograms (with configurable buckets), or summaries tailored to your system’s needs.