OpenTelemetry Metrics Aggregation: Sum, Histogram & Temporality

A busy service emits millions of metric data points a day, and nobody queries them raw. OpenTelemetry metrics aggregation collapses those measurements into sums, counts, last values, and histogram buckets in the SDK before export, so you store summaries instead of noise.

Below: the four aggregation types and when each fits, how delta vs cumulative temporality changes what your backend sees, and the Python setup that ties it together.

Understanding OpenTelemetry Metrics Aggregation

OpenTelemetry is an open-source observability framework for collecting, processing, and exporting telemetry data. Metrics aggregation is the step where the SDK combines individual measurements into summary statistics, such as totals and bucket counts, that you can store and query cheaply. For how it stacks up against other standards, see our OpenMetrics vs OpenTelemetry comparison.

Aggregation helps reduce storage requirements, improve query performance, and generate insights without drowning in raw data. Instead of looking at every individual data point, you get a summarized view of what’s happening in your system.

For example, imagine you’re tracking API response times. If you log every response time separately, querying that data later becomes cumbersome. Instead, aggregating response times using histograms or averages can help identify performance trends efficiently.

Metrics Data Model

The OpenTelemetry metrics data model defines how measurements are recorded in your code, how they are grouped into streams for export, and how backends store them for querying over time.

The model is based on several key components, which work together to capture and store meaningful metric data.

Key Components of the Metrics Data Model

Metric: Represents a specific measurement, such as the number of requests or the latency of an operation. It is identified by a name and carries attributes (also called labels, dimensions, or tags) that provide additional context, like the service name or region.
Data Streams: A data stream is a sequence of data points for one metric and one attribute set. Each data point is timestamped and includes the metric’s value along with its attributes, so the stream shows how that value changes over time.
Time Series: A time series is how a backend stores a stream: data points indexed by time and keyed by metric name and attributes. Time series are what you query to identify trends, analyze patterns, and compare intervals.
Events: In the data model, an event is a single measurement recorded by an instrument, together with its timestamp and attributes. Events are the raw input that the SDK aggregates into data streams, so most of them never leave the process individually.

How It All Fits Together

The OpenTelemetry data model connects events, data streams, and time series into one path from instrumentation to storage.

Instruments record events (individual measurements) in your code.
The SDK aggregates those events into data streams, one per metric and attribute set.
Backends store the exported streams as time series that you can query and alert on.

This model provides a powerful foundation for analyzing system performance and troubleshooting issues.

Why Metrics Aggregation is Essential for Scalable Systems

Picture CPU monitoring across a few hundred containers. Storing every sample from every one of them buys you nothing. What you actually want is the distribution, p50, p95, p99 over time, and aggregation gets you there without the data bloat.

Metrics in general work the same way. Collection hands you a firehose of latencies, request counts, error rates, resource usage, and keeping every raw point means paying twice, once in storage and again at query time.

Key Benefits of Aggregation:

Reduced Data Volume – Aggregating metrics lowers the storage and processing overhead by summarizing data points.
Faster Query Performance – Pre-aggregated data means dashboards and alerts work efficiently without processing large raw datasets.
Improved Observability – Summarized data helps detect trends, anomalies, and patterns faster.
Lower Operational Cost – Less raw data means reduced storage and compute expenses, optimizing infrastructure costs.

To learn more, read our guide on collecting host metrics using OpenTelemetry.

Types of Metric Aggregations in OpenTelemetry and Their Use Cases

OpenTelemetry supports several aggregation types. Which one you pick shapes what you can query later and what you pay to store in the meantime. Our guide on what OpenTelemetry metrics are and how to use them covers the fundamentals.

Type	Goes up only?	Best for	Example
Sum	For counters, yes. UpDownCounter sums can fall.	Totals: requests served, errors, bytes transmitted	1000 requests in an hour lands as one number, 1000
Count	Yes	How often something happened	Successful logins in a set window
Last Value	No	Current state: memory, CPU load, sessions	Active user sessions right now; only the latest reading counts
Histogram	No. It tracks a distribution.	Latency and percentiles (p50, p95, p99)	Response times bucketed into 0-100ms, 100-500ms, 500ms+

Histograms cost the most to store, and they are still usually worth it: percentiles come only from a distribution, and an average hides exactly the trends you care about.
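To make the table concrete, here is a minimal pure-Python sketch of what each aggregation keeps from the same batch of measurements. It illustrates the idea only and is not how the SDK is implemented; the `latencies_ms` sample and the `aggregate` helper are made up for this example, and the bucket bounds come from the table’s histogram row.

```python
from bisect import bisect_left

# Hypothetical response times (ms) recorded during one export interval.
latencies_ms = [12, 48, 95, 130, 220, 480, 510, 900]

# Upper bounds from the table's example: 0-100ms, 100-500ms, 500ms+.
bounds = [100, 500]


def aggregate(values, bounds):
    """Collapse raw values into the four summaries described in the table."""
    bucket_counts = [0] * (len(bounds) + 1)
    for value in values:
        # bisect_left treats each bound as an inclusive upper edge,
        # so a value of exactly 100 lands in the first bucket.
        bucket_counts[bisect_left(bounds, value)] += 1
    return {
        "sum": sum(values),
        "count": len(values),
        "last_value": values[-1],
        "histogram": bucket_counts,
    }


summary = aggregate(latencies_ms, bounds)
print(summary)
```

Eight raw measurements become a handful of numbers, and only the histogram entry still says anything about how the values were distributed.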

Step-by-Step Guide to Setting Up Metrics Aggregation in OpenTelemetry

Step 1: Install OpenTelemetry SDK

Depending on your programming language, install the OpenTelemetry SDK. For example, in Python:

pip install opentelemetry-sdk

For Go:

go get go.opentelemetry.io/otel go.opentelemetry.io/otel/sdk/metric

Step 2: Configure the Meter Provider

A Meter Provider is responsible for metric collection. Here’s how to set it up in Python:

from opentelemetry import metrics
from opentelemetry.sdk.metrics import MeterProvider

metrics.set_meter_provider(MeterProvider())
meter = metrics.get_meter_provider().get_meter("my_application")

Step 3: Define and Record Metrics with Aggregation

Create an instrument from the meter, then record measurements with attributes. The SDK keeps one aggregated value per unique attribute set:

counter = meter.create_counter(
    "requests.count",
    description="Total number of requests received by the service",
)

counter.add(1, {"endpoint": "/home", "status": "200"})

Step 4: Export Aggregated Metrics to an Observability Backend

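Use an OpenTelemetry exporter (such as Prometheus or OTLP) to ship metrics to your observability backend. Metric readers are attached when the MeterProvider is constructed, and the global provider can only be set once per process, so use this configuration in place of the bare MeterProvider() from Step 2 rather than running it afterwards. The console exporter here is a stand-in for a real backend:

from opentelemetry import metrics
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.export import ConsoleMetricExporter, PeriodicExportingMetricReader

# The reader collects aggregated metrics on a timer and hands them to the exporter.
exporter = ConsoleMetricExporter()
meter_provider = MeterProvider(
    metric_readers=[PeriodicExportingMetricReader(exporter)]
)
metrics.set_meter_provider(meter_provider)
meter = metrics.get_meter("my_application")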

How to Instrument Metrics and Capture Measurements in OpenTelemetry

OpenTelemetry provides several instruments for recording measurements. Each one implies a default aggregation, so the instrument you choose determines what reaches your backend.

1. Counter

A monotonic instrument that only increases over time.
Best suited for counting events such as requests, errors, or processed messages.

Here’s a counter that tracks the total number of HTTP requests received:

# The second positional argument is the unit, so pass the description by keyword.
request_counter = meter.create_counter("http_requests_total", description="Counts the total number of HTTP requests")
request_counter.add(1, {"endpoint": "/login", "status": "200"})

2. Gauge

Captures an instantaneous value that can increase or decrease.
Used for system state metrics like memory usage, CPU load, or active users.

Reporting current memory usage with an observable gauge looks like this:

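from opentelemetry.metrics import Observation
# The SDK invokes the callback at each collection; get_memory_usage() is your own helper.
memory_gauge = meter.create_observable_gauge("memory_usage", description="Tracks current memory usage in MB",
    callbacks=[lambda options: [Observation(get_memory_usage())]])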

3. Histogram

Records the distribution of values, useful for latency or performance measurements.
Aggregates data into buckets, from which backends estimate percentiles (e.g., p50, p95, p99) that tell you more than a simple average.

To measure a response time distribution, record each duration into a histogram:

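# Pass unit and description by keyword; the second positional argument is the unit.
response_time_histogram = meter.create_histogram("http_response_time", unit="ms", description="Tracks response times in milliseconds")
response_time_histogram.record(250, {"endpoint": "/home"})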
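A histogram does not store percentiles directly; they are estimated later from the bucket counts. The sketch below shows one common approach, linear interpolation inside the bucket that contains the target rank. The function name `estimate_percentile`, the sample bounds and counts, and the assumption that recorded values are non-negative are all illustrative, and real backends may differ in the details.

```python
def estimate_percentile(bounds, bucket_counts, q):
    """Estimate the q-th percentile (0 < q <= 100) from histogram buckets.

    bounds are inclusive upper edges; bucket_counts has one extra entry
    for the overflow bucket above the last bound.
    """
    total = sum(bucket_counts)
    if total == 0:
        raise ValueError("empty histogram")
    target = q * total / 100  # rank of the requested percentile
    seen = 0
    for i, count in enumerate(bucket_counts):
        if count and seen + count >= target:
            # Assumption: values are non-negative, so the first bucket starts at 0.
            lower = bounds[i - 1] if i > 0 else 0.0
            if i == len(bounds):
                # The overflow bucket has no upper edge; report its lower edge.
                return float(lower)
            upper = bounds[i]
            # Interpolate linearly between the bucket edges.
            return lower + (upper - lower) * (target - seen) / count
        seen += count
    return float(bounds[-1])


# Hypothetical buckets: <=100ms, 100-250ms, 250-500ms, >500ms.
bounds_ms = [100, 250, 500]
counts = [50, 30, 15, 5]
for q in (50, 90, 95, 99):
    print(f"p{q} ~ {estimate_percentile(bounds_ms, counts, q):.1f} ms")
```

The estimate can never be finer than the bucket that contains it, which is why bucket boundaries matter so much for tail percentiles.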

4. UpDownCounter

A counter that can both increase and decrease, ideal for tracking values that fluctuate over time.

Active sessions rise and fall, which makes them a natural fit for an UpDownCounter:

active_users = meter.create_up_down_counter("active_sessions", description="Tracks the number of active user sessions")
active_users.add(1)  # User logs in
active_users.add(-1)  # User logs out

Pick the instrument that matches how the value behaves, and the right default aggregation mostly follows. You do not have to instrument everything by hand either. Our guide on converting OpenTelemetry traces to metrics with SpanMetrics shows how to get these metrics from traces you already collect.

Metric Mapping and Temporality

Each instrument kind maps to a metric type in the exported data, and that mapping defines how backends interpret the measurements you record.

Instrument Kinds to Metric Types

OpenTelemetry defines several types of instruments, each mapping to a specific metric type. The most common types of instruments include:

Counter: Tracks monotonically increasing values, such as the number of requests or errors. A counter is typically used when you want to count occurrences, such as HTTP requests, in a system. It is exported as a monotonic Sum.
Gauge: Reports the current value of something that is sampled rather than accumulated, such as the amount of memory in use. Unlike counters, gauge values are not monotonic and can move in either direction. It is exported as a Gauge, and the default aggregation keeps the last value.
UpDownCounter: Similar to the counter but allows both increments and decrements. It is used to track values that can go both up and down, like the number of active users in a system. It is exported as a non-monotonic Sum.
Histogram: Captures a distribution of values, such as operation durations or response times, by counting how many measurements fall into each value range. It is exported as a Histogram carrying bucket counts plus the overall count and sum.

Aggregation Temporality

Temporality describes the time window that each exported data point covers. OpenTelemetry supports two types of aggregation temporality:

Delta Temporality: Each data point reports only the change since the previous export. It applies to sums and histograms, and it suits backends that store per-interval values rather than running totals.
Cumulative Temporality: Each data point reports the total accumulated since a fixed start time, usually process start. For instance, a cumulative counter reports the total count since the application started, and the backend derives rates by subtracting consecutive points.

The choice between delta and cumulative temporality impacts how data is recorded and aggregated in observability tools, as well as how it can be queried and interpreted.
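The difference is easiest to see with numbers. The pure-Python sketch below exports the same hypothetical counter both ways and converts one form back into the other; the interval values and the `cumulative_to_delta` helper are invented for illustration, and the conversion assumes the process never restarts.

```python
from itertools import accumulate

# Hypothetical request counts observed in four consecutive export intervals.
requests_per_interval = [120, 80, 0, 45]

# Delta temporality: each export carries only the change since the last export.
delta_points = list(requests_per_interval)

# Cumulative temporality: each export carries the running total since start.
cumulative_points = list(accumulate(requests_per_interval))


def cumulative_to_delta(points):
    """Recover per-interval changes from running totals (assumes no resets)."""
    previous = [0] + points[:-1]
    return [current - prev for prev, current in zip(previous, points)]


print("delta:     ", delta_points)
print("cumulative:", cumulative_points)
print("round trip:", cumulative_to_delta(cumulative_points))
```

A backend that expects one form and receives the other will either double count or compute nonsense rates, which is why the two sides of the pipeline have to agree.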

We have been bitten by this setting. An SDK upgrade flipped one service’s OTLP exporter to delta temporality, our pipeline assumed cumulative, and the Collector’s Prometheus exporter quietly dropped every delta sum it could not convert. Request rate panels went blank and the error alert built on those series stopped evaluating, so a broken pipeline spent half a day looking like a healthy service. Pin temporality explicitly in exporter config, and alert on absent metrics, because missing data reads as good news otherwise.

Explore our detailed article on native Prometheus support for OpenTelemetry metrics.

Best Practices for Effective Metrics Usage

To make the most out of OpenTelemetry metrics, follow these best practices:

Apply Consistent Labels (Tags): Labels should be meaningful and limited in cardinality to prevent performance degradation. For example, avoid using highly unique identifiers like request IDs as labels.
Establish Clear Naming Conventions: Use standardized and descriptive metric names. A good practice is to follow the <service>.<metric>.<unit> format, such as api.requests.count.
Optimize the Telemetry Pipeline: Reduce the frequency of metric collection for less critical data points, use batch exports, and filter unnecessary metrics before sending them to observability backends.
Use Histograms Over Averages: Percentiles (p50, p95, p99) provide more actionable insights than simple averages, which can be misleading.
Use Aggregation Effectively: Choose the aggregation type based on the use case: sum for running totals, count for event occurrences, and histograms for latency analysis.

One default worth overriding early: histogram bucket boundaries. The SDK’s stock buckets stretch from 0 up to 10 seconds, which sounds fine until your endpoints answer in 5 to 50ms and nearly everything lands in the first few buckets, taking your p95 and p99 resolution with it. Once you know where your latencies actually sit, set custom boundaries with a View. Something like 1, 2.5, 5, 10, 25, 50, 100, 250 works for a fast service measured in milliseconds; defaults are fine for a first deploy, not for latency SLOs.

How to Choose the Right Observability Backend for Aggregated Metrics

Once aggregated, metrics need to be stored and visualized. Here are some common observability tools:

Last9 – A scalable and cost-effective observability platform optimized for large-scale data.
Prometheus – Popular for real-time monitoring but requires careful tuning for high-cardinality data.
Grafana – Works well with Prometheus, providing powerful visualization capabilities.
Datadog – Full-featured monitoring with built-in OpenTelemetry support.
New Relic – A comprehensive APM solution with OpenTelemetry integrations.

Check out our guide on filtering metrics by labels in OpenTelemetry Collector.

FAQs

What is metrics aggregation in OpenTelemetry?

Metrics aggregation in OpenTelemetry is the process of combining individual metric data points into summarized statistics such as sums, counts, last values, and histogram buckets. The SDK aggregates measurements before export, which reduces data volume, lowers storage cost, and speeds up queries while preserving the trends needed to analyze system performance.

Which aggregation type does OpenTelemetry use for each instrument?

Each instrument kind has a default aggregation: Counter and UpDownCounter use sum aggregation, Gauge uses last-value aggregation, and Histogram uses explicit-bucket histogram aggregation. Histograms group recorded values into buckets so you can compute percentiles like p50, p95, and p99, which makes them the right choice for latency and other distribution-style measurements.

When should I use a Counter vs. an UpDownCounter or Gauge?

Use a Counter for values that only increase, like total requests or errors. Use an UpDownCounter for values that rise and fall, like active sessions or queue depth. Use a Gauge for point-in-time readings you sample rather than accumulate, like current memory usage. The instrument you pick matters because it determines the default aggregation and how backends interpret the data.

Should I use delta or cumulative temporality?

It depends on your backend. Cumulative temporality reports the running total since the process started and is what Prometheus requires. Delta temporality reports only the change since the last export, which produces smaller payloads and is preferred by some backends like Datadog. Most OTLP exporters default to cumulative; check what your backend expects before changing it.

How do I change the default aggregation for a metric?

Use a View in the SDK. A View matches an instrument by name, type, or meter and overrides how its measurements are aggregated, such as replacing a histogram’s default buckets with custom boundaries or dropping attributes you don’t need. Views are registered on the MeterProvider when you configure it, so no instrumentation code changes are required.

How does aggregation reduce data volume and cost?

Instead of exporting every raw measurement, the SDK collapses measurements into one aggregated data point per unique combination of metric name and attribute values per export interval. A service handling thousands of requests per minute exports a single sum, not thousands of events. Keeping attribute cardinality low by avoiding labels like request IDs keeps the number of exported series and storage costs predictable.
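The arithmetic can be sketched in plain Python. The `aggregate_sums` helper below is a simplified stand-in for what a sum aggregation does within one export interval, not the SDK’s code, and the traffic mix is invented for the example.

```python
from collections import defaultdict


def aggregate_sums(measurements):
    """Sum values per unique attribute set, as one export interval would."""
    series = defaultdict(int)
    for value, attributes in measurements:
        # Sort the items so attribute order never creates a separate series.
        key = tuple(sorted(attributes.items()))
        series[key] += value
    return dict(series)


# 3,000 hypothetical requests across two endpoints and two status codes.
measurements = []
for i in range(3000):
    attributes = {
        "endpoint": "/home" if i % 2 else "/login",
        "status": "500" if i % 100 == 0 else "200",
    }
    measurements.append((1, attributes))

points = aggregate_sums(measurements)
print(len(measurements), "measurements ->", len(points), "data points")

# The same traffic labeled with a unique request ID defeats the aggregation.
with_ids = aggregate_sums([(1, {"request_id": str(i)}) for i in range(3000)])
print(len(measurements), "measurements ->", len(with_ids), "data points")
```

The exported volume tracks the number of distinct attribute combinations, not the number of requests, which is why one high-cardinality attribute can undo the savings.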

Conclusion

Get temporality right for your backend before you touch anything else. It is the hardest setting to retrofit, because switching between delta and cumulative later breaks every rate query and alert built on the old behavior, and you rarely find all of them on the first pass. Past that, the boring answer is the right one: keep the SDK’s default aggregations and cumulative temporality unless your backend documentation explicitly asks for delta. The one default not worth keeping is histogram bucket boundaries, which will quietly flatten your p99 long before anything else here hurts you.

The aggregation in this guide happens in the SDK, before export. You get one more shot at it on the way in: Last9’s Control Plane runs streaming aggregation on metrics in flight, so high-cardinality series get rolled up at ingestion without touching instrumentation.