
Build Logs-to-Metrics Pipelines with OpenTelemetry Collector

Turn your logs into real-time metrics with OpenTelemetry — faster insights, fewer queries, and a leaner observability stack.

Nov 11th, ‘25

Your system generates thousands of logs every minute. Most of the time, they’re used for debugging: figuring out what broke, when, and why. But those same logs can also tell you how your system behaves if you treat them as a source of metrics.

Let’s say your payment service records every transaction: success, failure, timeout, validation_error — around 50,000 log lines per minute. You might want to track checkout success rate by payment method, or set up an alert when failures cross 2%. The information you need is already there in your logs; it just isn’t aggregated yet.

With OpenTelemetry, you can turn those logs into usable metrics. The Collector processes log data in real time, extracts attributes like service.name or http.status_code, and emits aggregated metrics for dashboards and alerts — without adding new instrumentation or maintaining parallel code paths.

In this edition of the OTel series, we’ll look at using logs-to-metrics effectively — setup, optimization, and keeping your stack lean and predictable.

Why Logs-to-Metrics Is Worth Implementing

Most systems already produce rich logs. Turning a subset of that data into metrics helps you observe trends without adding new instrumentation.

Control Storage Growth with Aggregated Metrics

A service that logs every HTTP request — including headers, payloads, and latency details — can generate gigabytes per day. Multiply that across 50 services with 90-day retention, and log storage becomes one of your largest operational costs.

Using logs-to-metrics, you can extract aggregates such as http_request_count, request_latency_seconds, or error_rate directly from those logs. The remaining data can be sampled or filtered.
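
For example, a filter processor can drop the noisiest records once the aggregates exist. Here's a minimal sketch, assuming your logs arrive with standard severity fields; the processor name is illustrative:

processors:
  # Drop DEBUG and TRACE records after metrics have been derived upstream.
  filter/drop_debug:
    logs:
      log_record:
        - 'severity_number < SEVERITY_NUMBER_INFO'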

Simplify Instrumentation with One Source of Truth

Many applications emit the same event twice — once as a log and once as a metric:

# Log event
logger.info("checkout_complete", {"status": "success"})
# Metric event
checkout_success_counter.add(1, {"status": "success"})

Over time, these two paths drift: labels change, events fall out of sync, and the maintenance burden grows.

With logs-to-metrics, you can log once. The OpenTelemetry Collector can parse attributes from log records and generate metrics automatically, removing duplicate instrumentation while keeping the same signal fidelity.
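
As a rough sketch of what that looks like (the count connector is covered in detail below), the single log event above can be counted into a metric with a config along these lines. How the event name lands in the log record depends on your logging setup, so treat the condition as illustrative:

connectors:
  count:
    logs:
      checkout_events:
        description: Checkout completions by status
        attributes:
          - key: status
        conditions:
          # Assumes the "checkout_complete" message ends up in the log body.
          - 'IsMatch(body, "checkout_complete")'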

Use Metrics for Faster, More Reliable Alerts

Metrics are built for alerting — compact, time-series-based, and efficient to aggregate. Log queries, even optimized ones, are better suited for analysis than continuous evaluation.

When logs are converted into metrics like counter, gauge, or histogram, alerting systems can evaluate them in real time with predictable performance. This improves alert accuracy and reduces the need for complex log queries running on production systems.
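
For instance, assuming the derived counters are exported to a Prometheus-compatible backend, the 2% failure alert from earlier might look like the rule below. The metric names are illustrative: checkout_failures_total would come from a count connector definition, and checkout_attempts_total is a hypothetical companion counter for total checkouts.

groups:
  - name: checkout-slo
    rules:
      - alert: CheckoutFailureRateHigh
        # Failure ratio over the last 5 minutes, evaluated continuously.
        expr: |
          sum(rate(checkout_failures_total[5m]))
            /
          sum(rate(checkout_attempts_total[5m])) > 0.02
        for: 10m
        labels:
          severity: page
        annotations:
          summary: "Checkout failure rate above 2% for 10 minutes"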

Now, let’s break down the two main ways to produce metrics from logs using the Collector.

Two Ways to Create Metrics with the OTel Collector

The OpenTelemetry Collector offers two main ways to create metrics from your existing telemetry: the spanmetrics connector and the count connector. Both help you reuse what your systems already emit — traces or logs — to build metrics without extra instrumentation.

Span-to-Metrics: The spanmetrics Connector

The spanmetrics connector lets you turn spans into RED metrics — request rate, error rate, and duration — without adding any extra instrumentation. If your services already emit traces, the Collector can convert them into metrics that work nicely with dashboards and alerting.

Here’s the idea:

  • Traces come in through the otlp receiver.
  • spanmetrics sits in the traces pipeline, observes every span, and generates metrics.
  • Those metrics flow into the metrics pipeline, where you export them just like any other time series.

The result: you get counters and histograms such as calls, errors, and duration built straight from trace data.

A minimal but production-safe setup looks like this:

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
connectors:
  spanmetrics:
    histogram:
      explicit:
        buckets: [10ms, 50ms, 100ms, 250ms, 500ms, 1s, 2s, 5s, 10s]
    dimensions:
      - name: http.method
      - name: http.status_code
    exclude_dimensions:
      - http.url
      - http.target
    dimensions_cache_size: 2000
    aggregation_temporality: CUMULATIVE
    metrics_flush_interval: 30s
exporters:
  debug:
    verbosity: detailed
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [spanmetrics, debug]
    metrics:
      receivers: [spanmetrics]
      exporters: [debug]

A few things to keep in mind when shaping the metrics:

  • Choose low-cardinality dimensions. HTTP methods and status codes are safe; URLs or user IDs aren’t.
  • Match your latency buckets to your SLOs. The histogram only tells you what you define.
  • Control the cache. The dimensions_cache_size and metrics_expiration settings decide how many unique label combinations the Collector keeps in memory.

Once this is wired up, the Collector will emit metrics such as:

  • calls{http.method="GET", http.status_code="200"}
  • errors{…}
  • duration_bucket{…}

All of them are built directly from spans passing through the pipeline.

Compatibility note: Newer Collector releases deprecate the logging exporter in favor of debug, and automatically include service.name as a dimension.
If you see a “duplicate dimension” error or run into exporter issues on older versions, adjust the config accordingly.

Log-to-Metrics: The count Connector

The count connector turns structured logs into metrics. This is useful when your logs carry enough detail to answer questions like “How often does this event happen?” but you don’t have tracing everywhere.

If you want to track how many logs fall into each severity level, here’s a simple setup:

receivers:
  otlp:
    protocols:
      http:
        endpoint: 0.0.0.0:4318
processors:
  transform/copy_severity:
    log_statements:
      - context: log
        statements:
          - set(attributes["severity_text"], severity_text)
connectors:
  count:
    logs:
      log_count_by_severity:
        description: Count of logs by severity
        attributes:
          - key: severity_text
exporters:
  debug:
    verbosity: detailed
service:
  pipelines:
    logs:
      receivers: [otlp]
      processors: [transform/copy_severity]
      exporters: [count, debug]
    metrics:
      receivers: [count]
      exporters: [debug]

This produces a log_count_by_severity metric with labels for each severity level — info, warn, error, and so on.
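
With the debug exporter attached, the emitted series look along these lines (counts are illustrative):

log_count_by_severity{severity_text="ERROR"} = 12
log_count_by_severity{severity_text="INFO"} = 4831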

A quick note on severity_text

Most logging libraries already include a severity_text field (INFO, WARN, ERROR).
If your logs already have it, you don’t need the transform/copy_severity processor — the count connector will group logs by severity out of the box.

Your simplified pipeline would look like:

service:
  pipelines:
    logs:
      receivers: [otlp]
      exporters: [count, debug]
    metrics:
      receivers: [count]
      exporters: [debug]

If your logs don’t expose severity_text, keep the transform step. It ensures the count connector always has a consistent field to work with.

Filtering specific log patterns

You can also count only the log events that match certain conditions. For example, tracking failed checkouts by payment method and failure reason:

connectors:
  count:
    logs:
      checkout_failures:
        description: Failed checkout attempts
        attributes:
          - key: checkout.payment_method
          - key: failure.reason
        conditions:
          - 'attributes["event.name"] == "checkout" and attributes["checkout.status"] == "failed"'

This produces metrics like:

checkout_failures_total{
  payment_method="card",
  reason="timeout"
} = 1

Great for building dashboards or wiring up alerts on specific failure patterns.

Preprocess Logs Before Counting

Sometimes logs need a bit of reshaping before they can turn into useful metrics. The transform processor helps you compute new fields or derive attributes before the count connector runs.

Here’s an example that calculates a query’s duration and classifies it as fast or slow:

receivers:
  otlp:
    protocols:
      http:
        endpoint: 0.0.0.0:4318
processors:
  transform:
    # Use log_statements for this Collector version
    log_statements:
      - context: log
        statements:
          - set(attributes["duration_ms"], Int(attributes["end_time"]) - Int(attributes["start_time"])) where attributes["end_time"] != nil and attributes["start_time"] != nil
          - set(attributes["query_speed"], "slow") where Int(attributes["duration_ms"]) > 1000
          - set(attributes["query_speed"], "fast") where Int(attributes["duration_ms"]) <= 1000
connectors:
  count:
    logs:
      db_query_count:
        description: Database queries by speed
        attributes:
          - key: db.operation
          - key: query_speed
        conditions:
          - 'attributes["db.system"] != nil'
exporters:
  debug:
    verbosity: detailed
service:
  pipelines:
    logs:
      receivers: [otlp]
      processors: [transform]
      exporters: [count, debug]
    metrics:
      receivers: [count]
      exporters: [debug]

This pipeline computes duration_ms, tags each query as fast or slow, and produces metrics such as:

db_query_count{db.operation="SELECT", query_speed="slow"} = 1

— all derived directly from logs.

Both connectors — spanmetrics and count — make it easy to surface meaningful, low-overhead metrics without changing your application code. Whether the data starts as traces or logs, the Collector gives you a flexible way to shape it into metrics you can actually use.

How to Keep Metric Cardinality in Check

When logs or spans are turned into metrics, every unique combination of label values becomes a new time series. That’s how metric systems work — each label set defines a separate stream. The challenge is that the number of series grows fast.

Say you have:

  • 3 services
  • 10 HTTP methods
  • 50 status codes

That's 3 × 10 × 50 = 1,500 time series. Add user_id with 100,000 values, and you now have 150 million series. At that point, queries slow down and storage costs rise quickly.

Cardinality | Range    | Examples                       | Typical Impact
Low         | < 20     | http.method, log.level         | Fast, inexpensive
Medium      | 20–100   | http.status_code, service.name | Stable and predictable
High        | 100–1000 | endpoint_pattern, region       | Noticeable storage use
Very High   | 1000+    | http.url, user.id, client.ip   | Slow queries, higher cost

The goal is to keep cardinality where it adds value — not everywhere.

Normalize Values Before Metrics Are Created

Dynamic attributes like URLs or IDs can be normalized before metrics are generated. This helps avoid high-cardinality series caused by user IDs, order IDs, or other unique values.

processors:
  transform:
    log_statements:
      - context: log
        statements:
          - replace_pattern(attributes["http.target"], "^/api/v1/users/[0-9]+$", "/api/v1/users/:id")
          - replace_pattern(attributes["http.target"], "^/api/v1/orders/[0-9]+$", "/api/v1/orders/:id")

This turns requests like /api/v1/users/3489 into /api/v1/users/:id, keeping your metrics focused and sustainable without losing useful routing insight.

Drop Labels You Don’t Use

If you’re not filtering, alerting, or grouping by a label, it doesn’t belong in your metrics. Keep it in logs or traces where it’s still searchable, but avoid adding unnecessary dimensions to time series.
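
With both connectors, only the attributes you explicitly list become labels, so the simplest fix is to not list them. If you also want a hard guarantee that a high-cardinality field can never slip in, an attributes processor can delete it before metrics are generated; this is a sketch with illustrative keys, and it belongs only in the pipeline that feeds the connector so your stored logs keep the field:

processors:
  attributes/drop_unused:
    actions:
      # Strip fields that should never become metric labels.
      - key: user.id
        action: delete
      - key: client.ip
        action: delete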

Use the Right Signal for the Right Detail

Metrics show trends. Logs and traces capture detail. Use metrics to understand what is happening, and logs or traces to explore why. Keeping granular attributes out of metrics helps the system stay fast and predictable.

Understand Collector Memory Behavior

The Collector caches label combinations in memory:

connectors:
  spanmetrics:
    dimensions_cache_size: 5000
    metrics_flush_interval: 30s
    metrics_expiration: 5m

When the cache fills, older entries are evicted. If they reappear, their counters restart — showing brief drops to zero in dashboards.

Monitor these internal metrics to spot pressure on the pipeline:

  • otelcol_processor_accepted_spans
  • otelcol_processor_dropped_spans
  • otelcol_exporter_queue_size

A steady rise in dropped spans or queue size signals that the Collector is nearing its limits.
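
To scrape these in the first place, the Collector's own telemetry needs to be enabled. On many Collector versions a config along these lines exposes internal metrics on port 8888; newer releases are moving this to a reader-based telemetry configuration, so check the docs for your version:

service:
  telemetry:
    metrics:
      level: detailed
      address: 0.0.0.0:8888   # endpoint for the Collector's own Prometheus metrics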

Test and Tune Cardinality Early

Before pushing to production, inspect what the Collector emits. Use the debug exporter to print metric series and count combinations:

exporters:
  debug:
    verbosity: detailed
service:
  pipelines:
    metrics:
      receivers: [spanmetrics]
      exporters: [debug]

Run this with sample traffic and review how many unique label sets appear. A few thousand is usually fine. Tens of thousands? Time to simplify grouping or dimensions.

Keep It Focused, Not Limited

Working with cardinality is all about shaping data so your metrics stay fast, consistent, and cost-efficient at scale.

Steps to Validate Logs-to-Metrics Configuration

When metrics don’t show up or appear incomplete, the issue usually lies somewhere between configuration, filtering, or resource limits. Here’s how to narrow it down step by step.

Metrics Don’t Appear

Start with the basics — confirm how the connector is wired between pipelines.
It needs to act as an exporter in the source pipeline and a receiver in the destination one:

service:
  pipelines:
    logs:
      receivers: [otlp]
      exporters: [count, otlp]   # connector + actual exporter
    metrics:
      receivers: [count]         # metrics generated by the count connector enter the pipeline here
      exporters: [otlp]

If the connector name under connectors: doesn’t match what’s used in pipelines, metrics won’t be generated.

Metrics Look Incomplete

If you’re using the count connector, review the conditions you’ve defined. A filter that’s too specific can block events unintentionally:

connectors:
  count:
    logs:
      checkout_failures:
        attributes:
          - key: checkout.payment_method
        conditions:
          - 'attributes["event.name"] == "checkout" and attributes["checkout.status"] == "failed"'

If these attributes aren’t present in your logs, the connector won’t emit anything.

For the spanmetrics connector, incomplete metrics often come from upstream sampling. A 10% trace sampling rate means only 10% of spans are used to generate metrics.

Also, verify that the attributes you use as dimensions actually exist — dimensioning by http.status_code won’t work if spans don’t include it.

In some setups, the attribute name may differ. Newer semantic conventions use http.response.status_code instead of http.status_code, so check which one your spans actually carry before adding it as a dimension.

Metrics Appear and Then Vanish

If your graphs show metrics for a while and then drop to zero, it’s likely due to the Collector’s cache behavior.

connectors:
  spanmetrics:
    dimensions_cache_size: 5000
    metrics_flush_interval: 30s
    metrics_expiration: 5m

The Collector stores a limited set of label combinations in memory. When this cache fills, older combinations are evicted. If they reappear later, the counters reset, which looks like a sudden dip. Increasing dimensions_cache_size or reducing the number of label combinations keeps this under control.

Counters Reset Unexpectedly

Check the aggregation_temporality setting in your configuration.

  • CUMULATIVE: counters grow continuously between scrapes — ideal for Prometheus.
  • DELTA: counters reset after each export.

If your backend expects cumulative data but receives deltas, you’ll see values dropping to zero between exports.

Verify What the Collector Emits

When in doubt, inspect what’s being generated before it’s exported. The logging exporter is handy for that:

exporters:
  logging:
    verbosity: detailed
    sampling_initial: 100
    sampling_thereafter: 500
service:
  pipelines:
    metrics:
      receivers: [spanmetrics]
      exporters: [logging]

The output will show which attributes exist, what filters match, and whether you’re accidentally dropping data.

In newer Collector versions, the logging exporter is deprecated. If you’re on a recent release, replace it with the debug exporter — it prints the same detailed output and is the recommended option going forward.

Monitor Collector Health

Finally, check the Collector’s own telemetry. A few internal metrics tell you how the pipeline is behaving:

  • otelcol_processor_accepted_spans
  • otelcol_processor_dropped_spans
  • otelcol_exporter_queue_size

If dropped spans or queue size keep increasing, you’re likely running into resource or cardinality limits.
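
Assuming you scrape the Collector's internal metrics into a Prometheus-compatible backend, a simple guardrail alert might look like this (metric names can carry a _total suffix depending on how they're scraped):

groups:
  - name: otel-collector-health
    rules:
      - alert: CollectorDroppingSpans
        # Sustained drops mean span-derived metrics are undercounting.
        expr: sum(rate(otelcol_processor_dropped_spans[5m])) > 0
        for: 10m
        labels:
          severity: warn
        annotations:
          summary: "OpenTelemetry Collector is dropping spans"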

Most debugging comes down to validation — making sure data enters the pipeline, is transformed as expected, and exits without being filtered or evicted along the way.

Practical Config Patterns for Logs-to-Metrics

Logs-to-metrics setups vary across systems, but a few patterns show up repeatedly in production. These configurations help you reduce storage, add useful context, and keep metrics clean without changing code.

Sampling Logs While Keeping Metrics Complete

Metrics are generated from the unsampled logs pipeline, so you can safely reduce log volume with sampling while keeping metric accuracy.

processors:
  probabilistic_sampler:
    sampling_percentage: 10
service:
  pipelines:
    # 1. Unsampled logs → metrics
    logs/metrics:
      receivers: [otlp]
      exporters: [count]
    # 2. Sampled logs → long-term storage
    logs/storage:
      receivers: [otlp]
      processors: [probabilistic_sampler]
      exporters: [otlp]
    # 3. Metrics pipeline from count
    metrics:
      receivers: [count]
      exporters: [otlp]

This setup gives you metrics based on 100% of traffic but stores only 10% of logs, a straightforward way to manage retention costs without sacrificing metric fidelity.

Add Context Before Conversion

It’s often helpful to enrich logs with additional context before generating metrics. Adding environment data, cluster names, or custom tags makes dashboards easier to filter later.

processors:
  resource:
    attributes:
      - key: environment
        value: ${env:ENVIRONMENT}
        action: upsert
  transform:
    log_statements:
      - context: log
        statements:
          - set(attributes["error_type"], "timeout") where IsMatch(body, "timeout")
          - set(attributes["error_type"], "validation") where IsMatch(body, "validation")
service:
  pipelines:
    logs:
      receivers: [otlp]
      processors: [resource, transform]
      exporters: [count, otlp]

After enrichment, the resulting metrics might look like:

error_total{error_type="timeout", environment="staging"}

— giving you a clearer sense of where failures occur and what type they are.

Use Semantic Conventions

Consistent naming makes your telemetry interoperable across traces, logs, and metrics. OpenTelemetry provides well-defined semantic conventions — stick to them where possible.

Prefer                     | Avoid
http.request.method        | request_method
http.response.status_code  | status
db.system                  | database
service.name               | custom service keys

Tip: stick to lowercase, dot-separated keys; avoid camelCase or bespoke names. If you’re stuck with older emitters, use schema translation to map legacy http.method → http.request.method and http.status_code → http.response.status_code.
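
If schema translation isn't an option in your pipeline, a transform step can do the mapping by hand. This is a sketch; confirm which keys your emitters actually send before copying it:

processors:
  transform/legacy_names:
    trace_statements:
      - context: span
        statements:
          # Copy legacy keys onto current semantic-convention names when the new key is absent.
          - set(attributes["http.request.method"], attributes["http.method"]) where attributes["http.request.method"] == nil and attributes["http.method"] != nil
          - set(attributes["http.response.status_code"], attributes["http.status_code"]) where attributes["http.response.status_code"] == nil and attributes["http.status_code"] != nil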

Adopting standard attribute names ensures your traces, logs, and metrics align naturally when visualized or queried together.

Start with a Small Dimension Set

Keep dimensions simple early on — service.name, http.method, and http.status_code usually cover most needs. Watch the number of active series, then add dimensions one by one (like region or http.route) if you need deeper breakdowns.
This gradual approach helps you understand how each label affects cardinality before scaling up.
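
A starting point might look like the snippet below; service.name is added automatically as a dimension on newer spanmetrics versions, as noted earlier:

connectors:
  spanmetrics:
    dimensions:
      - name: http.method
      - name: http.status_code
    # Add region, http.route, and similar only after checking active series counts.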

Expire Stale Series Automatically

For workloads with short-lived services or frequent label changes, configure metric expiration to keep memory usage stable.

connectors:
  spanmetrics:
    metrics_expiration: 5m

Metrics that haven’t been updated in five minutes are dropped from memory, preventing stale series from lingering in the Collector.

These patterns are small, composable building blocks — you can mix and match them as your observability setup matures. Start with enrichment and sampling, then add conventions and expiration once your data flow stabilizes.

Why Last9 LogMetrics Is the Better Way

Traditional logs-to-metrics approaches force you to choose between flexibility and performance. You either pre-aggregate in the Collector (losing the ability to query historical data differently) or send everything downstream and aggregate later (burning through storage and compute costs).

Last9 LogMetrics solves this differently. It’s a streaming aggregation engine that runs continuously on your log data — not in the Collector, but in the platform itself. This means:

  • Unlimited cardinality without the cost penalty — aggregate by any dimension without worrying about exploding storage costs or query performance degradation.
  • Change metrics on the fly — update your aggregation logic in the UI and start getting new breakdowns immediately, without redeploying collectors or restarting services. For example, if you’re tracking API errors and suddenly need to break them down by customer_tier to investigate a specific segment, you can add that dimension in Last9’s UI immediately. With Collector-based aggregation, you’d need to update configs, redeploy collectors across all environments, and wait for the changes to propagate — losing visibility into the problem while you’re trying to debug it.
  • No pipeline brittleness — metrics definitions live in the platform, not scattered across Collector configs that require coordinated rollouts to change.

Because Last9 supports OTLP natively, metrics flow directly from the OpenTelemetry Collector alongside traces and logs. Both use shared trace and span identifiers, giving you seamless correlation — from a latency spike in a metric, to its trace, to the logs that explain what happened — without context switching or manual joins.

Unlike batch-based systems or static aggregation rules, Last9 LogMetrics runs as a continuous streaming process. You get real-time metrics with the flexibility to evolve your observability without the infrastructure complexity or storage costs of keeping raw logs forever.

You can set this up in just a few minutes — connect your pipelines, define your metrics, and start building unified dashboards and alerts. And if you’d like to talk through your setup, our team’s happy to help.

In the next part, we’ll walk through field redaction, encryption, audit trails, and keeping OTel pipelines compliant.
