Pulling observability data together is rarely clean. Metrics come from everywhere, formats vary, and making sense of it takes some work.
OpenTelemetry Collector and Prometheus fit perfectly here. The Collector handles ingestion and processing from different sources, while Prometheus stores and queries the data. Simple, effective, and no vendor lock-in.
In this blog, we cover how to integrate the Collector with Prometheus, common pitfalls, and ways to control costs.
What You Get with OpenTelemetry Collector and Prometheus
The OpenTelemetry Collector acts as a central point for receiving, processing, and exporting telemetry data. It handles metrics, logs, and traces from multiple sources and gives you control over where and how that data flows.
Prometheus remains one of the most reliable tools for collecting and storing metrics. It’s widely adopted, battle-tested, and comes with an ecosystem that makes integration and querying straightforward.
Using them together gives you a few key advantages:
- Consistent data formats. Whether your services are in Go, Python, or Node.js, the Collector normalizes the metrics before passing them to Prometheus—no format headaches.
- Preprocessing built-in. You can filter, transform, and enrich data within the Collector, helping reduce storage costs and improve query performance.
- No vendor lock-in. The Collector supports multiple exporters. You can switch destinations or add new ones without touching your instrumentation code.
How to Set Up a Basic Configuration
To get OpenTelemetry Collector working with Prometheus, start by creating a configuration file. Here’s the setup that gets metrics flowing:
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
processors:
  batch:
    timeout: 1s
    send_batch_size: 1024
  memory_limiter:
    check_interval: 1s
    limit_mib: 512
exporters:
  prometheus:
    endpoint: "0.0.0.0:8889"
    namespace: "otel"
    const_labels:
      environment: "production"

service:
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [prometheus]
This configuration sets up:
- An OTLP receiver to accept metrics over both gRPC and HTTP
- A memory limiter and batching for efficient processing
- A Prometheus exporter that exposes metrics on port 8889 with a custom namespace and constant labels
Save this as otel-collector-config.yaml, then start the collector:
./otelcol --config otel-collector-config.yaml
Once it’s running, the Collector starts accepting metrics and exposing them for Prometheus to scrape.
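If you'd rather run the Collector in Docker, here's a rough Docker Compose sketch, assuming the config file from above sits in the current directory and you're using the stock otel/opentelemetry-collector image:

services:
  otel-collector:
    image: otel/opentelemetry-collector:latest
    command: ["--config=/etc/otel-collector-config.yaml"]
    volumes:
      - ./otel-collector-config.yaml:/etc/otel-collector-config.yaml
    ports:
      - "4317:4317"   # OTLP gRPC
      - "4318:4318"   # OTLP HTTP
      - "8889:8889"   # Prometheus exporter endpoint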
How to Configure Prometheus to Scrape Your Collector
To pull metrics from the OpenTelemetry Collector, update your prometheus.yml with a new scrape job:
scrape_configs:
  - job_name: 'otel-collector'
    static_configs:
      - targets: ['localhost:8889']
    scrape_interval: 15s
    metrics_path: /metrics
- job_name: A label to identify this target in Prometheus
- targets: The Collector’s Prometheus exporter endpoint (adjust if running in Docker/Kubernetes)
- scrape_interval: How often Prometheus scrapes the metrics (default is 15s)
- metrics_path: Defaults to /metrics, which the Collector uses
After saving the config, restart Prometheus:
systemctl restart prometheus
# or
docker restart prometheus
Prometheus will now scrape the Collector every 15 seconds. You can increase or decrease the interval depending on your retention goals, storage limits, and how granular your metrics need to be.
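For instance, a coarser global interval with a per-job override is one way to trade resolution for cost; the 60s/15s split below is purely illustrative:

global:
  scrape_interval: 60s            # cheaper default for everything else
scrape_configs:
  - job_name: 'otel-collector'
    scrape_interval: 15s          # keep finer resolution for Collector metrics
    static_configs:
      - targets: ['localhost:8889']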
How the Data Flows
Once everything is set up, here’s how metrics move through the system:
- Your application emits a metric using the OpenTelemetry SDK.
- The metric is sent to the OpenTelemetry Collector via OTLP (gRPC or HTTP).
- The Collector processes it using the configured processors (e.g., batching, memory limits).
- The Collector exposes it in Prometheus format on /metrics.
- Prometheus scrapes that endpoint at regular intervals.
This gives you OpenTelemetry’s flexible, vendor-neutral instrumentation paired with Prometheus’s proven, efficient storage and querying.
Advanced Configuration for Production OpenTelemetry Setups
Below are common patterns that help scale observability without blowing up costs or complexity:
Filter Noisy or High-Cardinality Metrics at the Source
High-cardinality metrics (like per-pod or per-instance labels) can slow down queries and inflate storage. You can use the filter processor to drop metrics that aren't useful for long-term analysis.
Here’s how to exclude histogram buckets and filter out metrics from test instances:
processors:
  filter:
    metrics:
      exclude:
        match_type: regexp
        metric_names:
          - ".*_bucket"
        resource_attributes:
          - key: "service.instance.id"
            value: ".*test.*"
This setup:
- Removes histogram _bucket metrics, which often aren’t needed unless you're actively analyzing distributions.
- Filters out noisy telemetry from test environments.
This reduces the volume of time series stored and helps keep your Prometheus setup fast and affordable.
Export Metrics to Both Local and Remote Backends
You don’t have to pick between local visibility and long-term analytics. You can export metrics to multiple destinations like a local Prometheus instance for quick querying and a managed backend for longer retention.
exporters:
  prometheus:
    endpoint: "0.0.0.0:8889"
  prometheusremotewrite:
    endpoint: "https://your-managed-prometheus.com/api/v1/write"
    headers:
      Authorization: "Bearer your-token-here"

service:
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [prometheus, prometheusremotewrite]
This gives you:
- Low-latency dashboards via local Prometheus
- Historical trends and capacity planning via a remote Prometheus-compatible store
All without rewriting instrumentation or duplicating metrics.
Enrich Metrics with Consistent Resource-Level Labels
Prometheus is powerful when your metrics are well-labeled. You can use the resource processor to attach consistent metadata like environment, service version, or region, making your dashboards more useful and queries more targeted.
processors:
  resource:
    attributes:
      - key: deployment.environment
        value: production
        action: upsert
      - key: service.version
        from_attribute: app.version
        action: upsert
Why this matters:
- You can slice metrics by deployment.environment, service.version, etc.
- Helps when comparing prod vs staging, or filtering out test noise.
- Useful for alerting on specific versions or regions.
With this, your metrics become easier to search, group, and reason about, no matter how distributed your services are.
Metric Types in OpenTelemetry and How They Map to Prometheus
OpenTelemetry supports several metric types, each suited to different kinds of data. Knowing how these map to Prometheus helps you interpret your metrics correctly and design better dashboards.
Counters: Tracking Things That Only Go Up
Counters in OpenTelemetry become Prometheus counters—monotonically increasing values that reset only when your app restarts. They’re perfect for counting events like:
- Number of requests served
- Total errors encountered
- Bytes processed
Counters help answer “how many?” questions, always increasing as events happen.
Gauges: Snapshots of Current Values
Gauges measure point-in-time values that can go up and down. Think of them as your system’s vital signs, reporting things like:
- Current CPU or memory usage
- Number of active connections
- Queue lengths
Since gauges can increase or decrease, they’re great for monitoring fluctuating metrics.
Histograms: Understanding Value Distributions
Histograms track how values spread out over time. OpenTelemetry histograms map directly to Prometheus histograms, which break down data into buckets. This is useful for:
- Measuring request latencies
- Analyzing response sizes
- Seeing distribution percentiles (like p95 or p99)
Histograms help you spot patterns and outliers in your performance data.
How to Send Application Metrics to the OpenTelemetry Collector
To get your app’s metrics flowing into the OpenTelemetry Collector, you need to configure your application’s OpenTelemetry SDK to export metrics via OTLP.
Below are a few examples in Python and Node.js showing how to set this up.
Python Example: Exporting Metrics via OTLP
from opentelemetry import metrics
from opentelemetry.exporter.otlp.proto.grpc.metric_exporter import OTLPMetricExporter
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.export import PeriodicExportingMetricReader
# Set up the exporter to send metrics to the collector endpoint
exporter = OTLPMetricExporter(endpoint="http://localhost:4317")
# Export metrics every 5 seconds
reader = PeriodicExportingMetricReader(exporter, export_interval_millis=5000)
# Configure the meter provider with the metric reader
metrics.set_meter_provider(MeterProvider(metric_readers=[reader]))
meter = metrics.get_meter("my-python-app")
# Create a counter metric and record a value
request_counter = meter.create_counter("http_requests_total")
request_counter.add(1, {"method": "GET", "endpoint": "/api/users"})
This example sets up the OpenTelemetry meter, connects it to the collector using OTLP over gRPC, and sends a simple counter metric.
Node.js Example: Exporting Metrics via OTLP
const { NodeSDK } = require('@opentelemetry/sdk-node');
const { OTLPMetricExporter } = require('@opentelemetry/exporter-metrics-otlp-grpc');
const { PeriodicExportingMetricReader } = require('@opentelemetry/sdk-metrics');

// Initialize SDK with OTLP metric exporter
const sdk = new NodeSDK({
  metricReader: new PeriodicExportingMetricReader({
    exporter: new OTLPMetricExporter({
      url: 'http://localhost:4317',
    }),
    exportIntervalMillis: 5000,
  }),
});

sdk.start();

// Use metrics API to create and record metrics
const { metrics } = require('@opentelemetry/api');
const meter = metrics.getMeter('my-node-app');
const requestCounter = meter.createCounter('http_requests_total');
requestCounter.add(1, { method: 'POST', endpoint: '/api/orders' });
Here, the Node.js SDK exports metrics to the collector every 5 seconds and records a counter metric with attributes.
How to Secure Your OpenTelemetry Collector in Production
Enable TLS for Encrypted Collector Communication
Protect data in transit by configuring TLS on your OTLP receiver. This encrypts metrics as they travel between your apps and the collector:
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
        tls:
          cert_file: /path/to/server.crt
          key_file: /path/to/server.key
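If you also want to verify who is connecting, you can take this a step further and require client certificates (mutual TLS). A sketch, assuming your client certs are signed by ca.crt:

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
        tls:
          cert_file: /path/to/server.crt
          key_file: /path/to/server.key
          client_ca_file: /path/to/ca.crt   # clients must present a cert signed by this CA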
Add Authentication to Control Data Access
Use bearer token authentication to limit who can send telemetry data to your collector:
extensions:
  bearertokenauth:
    token: "your-secret-token"

receivers:
  otlp:
    protocols:
      grpc:
        auth:
          authenticator: bearertokenauth

service:
  extensions: [bearertokenauth]
Note that the extension also has to be listed under service.extensions, or the collector won’t load it.
Why You Should Limit Network Access and Privileges
Run the collector with the least privileges needed, and restrict network access with firewalls or policies. This reduces attack surface and improves security.
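In Kubernetes, one way to enforce this is a NetworkPolicy that only allows traffic from your application namespace to the Collector's OTLP ports. A rough sketch, where the app: otel-collector pod label and the name: apps namespace label are assumptions for illustration:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: otel-collector-ingress
spec:
  podSelector:
    matchLabels:
      app: otel-collector          # assumed label on the Collector pods
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              name: apps           # assumed label on the namespace allowed to send telemetry
      ports:
        - protocol: TCP
          port: 4317               # OTLP gRPC
        - protocol: TCP
          port: 4318               # OTLP HTTP

If Prometheus runs in a different namespace, remember to also allow its scrapes of the exporter port (8889), or the metrics endpoint becomes unreachable.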
How to Monitor Your OpenTelemetry Collector’s Health and Performance
Track these key collector metrics to stay ahead of issues and keep your observability pipeline healthy:
| Metric | What it Shows |
|---|---|
| otelcol_receiver_accepted_metric_points | Number of metrics accepted by receivers |
| otelcol_exporter_sent_metric_points | Metrics successfully sent to exporters |
| otelcol_processor_batch_batch_send_size | Stats on batch sizes during processing |
| otelcol_process_memory_rss | Collector’s current memory usage |
Add these to your dashboards to identify problems early.
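The Collector serves these internal metrics in Prometheus format as well, by default on port 8888, so you can scrape them with a job like this (assuming the default telemetry port and a locally running Collector):

scrape_configs:
  - job_name: 'otel-collector-internal'
    static_configs:
      - targets: ['localhost:8888']   # the Collector's own telemetry endpoint
    scrape_interval: 15s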
3 Ways to Optimize Costs with OpenTelemetry Collector
Use Sampling to Reduce Data Volume
Sampling keeps only a percentage of your data, lowering storage and processing needs. Keep in mind that the probabilistic_sampler processor applies to trace and log pipelines rather than metrics, so pair it with the metric-focused filtering and aggregation below:

processors:
  probabilistic_sampler:
    sampling_percentage: 10
Filter Out Unnecessary Metrics
Drop metrics you don’t need to keep your data clean and manageable:
processors:
  filter:
    metrics:
      exclude:
        match_type: strict
        metric_names: ["unnecessary_metric"]
Aggregate Metrics to Reduce Cardinality
Group metrics by key attributes to lower cardinality and save on storage costs:
processors:
  groupbyattrs:
    keys: ["service.name", "service.version"]
Troubleshooting Common Issues with OpenTelemetry Collector and Prometheus
Sometimes things don’t work as expected. Here’s how to check what’s going on and fix common problems.
Metrics Not Showing Up in Prometheus
First, make sure your collector is actually getting the metrics. Turn on debug logs in your collector config to see detailed info:
service:
  telemetry:
    logs:
      level: debug
Restart the collector, then check the logs for any errors.
Next, check if Prometheus can reach the collector’s metrics endpoint. Run:
curl http://localhost:8889/metrics
If you see a bunch of metrics, the collector is exporting correctly. If not, the collector might not be running right or there’s a config problem. If the metrics show here but don’t appear in Prometheus, double-check that Prometheus is scraping the right address and port.
Collector Using Too Much Memory
If your collector is eating too much memory, tweak the batch and memory limiter settings to reduce load:
processors:
  batch:
    timeout: 200ms
    send_batch_size: 512
  memory_limiter:
    check_interval: 1s
    limit_mib: 256
    spike_limit_mib: 64
This means smaller batches and tighter memory caps, which helps keep the collector from overusing resources.
Prometheus Can’t Scrape Metrics
Check that the scrape job in Prometheus matches the collector’s exporter settings exactly. Pay attention to:
- The IP/port
- The metrics path (usually /metrics)
For example, if your collector exports metrics on localhost:8889/metrics, Prometheus should be configured to scrape that same endpoint.
If the issue still persists, check the logs on both the collector and Prometheus sides. They often tell you what’s wrong.
Wrapping Up
OpenTelemetry Collector with Prometheus offers a flexible, vendor-neutral way to collect reliable metrics.
But running and scaling Prometheus, especially with high-cardinality data, can get complicated. We at Last9 help you handle that complexity while keeping costs in check. Companies like Probo, CleverTap, and Replit use Last9 to combine metrics, logs, and traces with full OpenTelemetry and Prometheus compatibility.
If managing observability infrastructure isn’t your focus, Last9 makes it easier. Book some time with us to learn more!
FAQs
Can I use OpenTelemetry Collector with my existing Prometheus setup?
Yes. The collector exports metrics in Prometheus format, so it works with existing Prometheus configurations. Just add the collector as a new scrape target in your prometheus.yml.
What's the performance impact of adding the collector?
The collector adds minimal latency when configured properly. The batch processor actually improves performance by reducing individual exports. Most applications see better overall performance with the collector handling metric processing.
How do I handle high-cardinality metrics?
Use the collector's filtering and sampling processors to control cardinality before metrics reach Prometheus. You can filter labels, sample high-frequency metrics, or aggregate similar metrics at the collector level.
Can I send metrics to multiple Prometheus instances?
Yes, configure multiple exporters in the same pipeline. You can use both local Prometheus exporters and remote write exporters to send metrics to different destinations simultaneously.
What happens if my collector goes down?
Applications typically buffer metrics temporarily and retry sending them. For production environments, run multiple collector instances behind a load balancer to ensure high availability.
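For reference, here's a rough Kubernetes sketch of that pattern: several Collector replicas behind a Service acting as the load balancer. The image tag and the otel-collector-config ConfigMap name are assumptions:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: otel-collector
spec:
  replicas: 3                                          # multiple instances for availability
  selector:
    matchLabels:
      app: otel-collector
  template:
    metadata:
      labels:
        app: otel-collector
    spec:
      containers:
        - name: otel-collector
          image: otel/opentelemetry-collector:latest   # assumed image/tag
          args: ["--config=/etc/otel/otel-collector-config.yaml"]
          ports:
            - containerPort: 4317                      # OTLP gRPC
            - containerPort: 4318                      # OTLP HTTP
            - containerPort: 8889                      # Prometheus exporter
          volumeMounts:
            - name: config
              mountPath: /etc/otel
      volumes:
        - name: config
          configMap:
            name: otel-collector-config                # assumed ConfigMap holding the YAML from earlier
---
apiVersion: v1
kind: Service
metadata:
  name: otel-collector
spec:
  selector:
    app: otel-collector
  ports:
    - name: otlp-grpc
      port: 4317
    - name: otlp-http
      port: 4318

Keep in mind that long-lived gRPC connections stick to a single pod behind a plain ClusterIP Service, so clients may need to reconnect periodically (or sit behind a gRPC-aware load balancer) for the load to spread evenly.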