It's 2 AM. Your payment service is down. Customers are angry, and your phone won't stop buzzing.
You open your monitoring dashboard. "500 Internal Server Error" flashes across multiple services. That's all you get. No context, no breadcrumbs, no clue where to start looking.
You spend the next three hours jumping between logs, trying to match timestamps, hoping something connects. Your database looks fine. Your APIs report healthy. Yet customers can't complete purchases.
Modern JavaScript apps span multiple services, databases, and external APIs. When something breaks, finding the source feels impossible. This is the exact problem OpenTelemetry solves.
What OpenTelemetry Does for Your JavaScript Apps
OpenTelemetry tells you the complete story of what happened to each request.
Instead of seeing "Payment failed" in isolation, you see the full journey:
- User clicks "Buy Now"
- Request hits your Node.js API (23ms)
- Calls inventory service (67ms)
- Checks payment gateway (2.1 seconds - found it!)
- Times out with "Gateway unreachable"
Now you know exactly where to look. No more guessing games or midnight detective work.
The JavaScript Observability Problem
Your JavaScript application today probably looks like this:
- React frontend making dozens of API calls
- Node.js services talking to each other
- Background workers processing jobs
- Multiple databases and third-party APIs
- Everything running in containers across cloud providers
When something goes wrong, the failure might start in your frontend, bounce through three microservices, hit a database bottleneck, and show up as a mysterious timeout somewhere else entirely.
Traditional logs give you scattered pieces. OpenTelemetry shows you the complete picture.
See Your First Trace in Minutes
Let’s run a minimal setup to capture distributed traces in a Node.js service.
Install required OpenTelemetry packages
```bash
npm install @opentelemetry/sdk-node \
  @opentelemetry/auto-instrumentations-node
```
- `@opentelemetry/sdk-node` — Provides the core SDK for Node.js, including the tracer provider and span processor.
- `@opentelemetry/auto-instrumentations-node` — Bundles instrumentation modules for popular Node.js libraries (e.g., `http`, `express`, `pg`), enabling automatic span creation.
Create the tracing bootstrap file
```javascript
// tracing.js
const { NodeSDK } = require('@opentelemetry/sdk-node');
const { getNodeAutoInstrumentations } = require('@opentelemetry/auto-instrumentations-node');

const sdk = new NodeSDK({
  instrumentations: [getNodeAutoInstrumentations()],
});

sdk.start();
console.log('Tracing initialized');
```
This configures the Node.js SDK with auto-instrumentations. Any supported library that your application imports will emit spans without requiring manual instrumentation.
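You can also tune what gets auto-instrumented. Here's a minimal sketch of passing per-instrumentation config to `getNodeAutoInstrumentations`; the instrumentation disabled below is just one example of a noisy module you might not need.

```javascript
// tracing.js — same setup, with one chatty instrumentation switched off
const { NodeSDK } = require('@opentelemetry/sdk-node');
const { getNodeAutoInstrumentations } = require('@opentelemetry/auto-instrumentations-node');

const sdk = new NodeSDK({
  instrumentations: [
    getNodeAutoInstrumentations({
      // file-system spans are rarely worth the volume they generate
      '@opentelemetry/instrumentation-fs': { enabled: false },
    }),
  ],
});

sdk.start();
```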
Initialize tracing before application code
```javascript
// app.js — this MUST be the first import
require('./tracing');

const express = require('express');
const app = express();

app.get('/api/orders', async (req, res) => {
  // Simulate application work
  await new Promise(resolve => setTimeout(resolve, 150));
  res.json({ orders: ['order-1', 'order-2'] });
});

app.listen(3000, () => {
  console.log('Server running on port 3000');
});
```
Placing `require('./tracing')` at the top ensures the instrumentation hooks are registered before the target modules are loaded, so all incoming requests are traced. (You can also preload it without touching application code by starting the process with `node --require ./tracing.js app.js`.)
Run and trigger a request
```bash
node app.js
curl http://localhost:3000/api/orders
```
When `/api/orders` is hit, the OpenTelemetry tracer will:
- Create a root span for the incoming HTTP request.
- Record child spans for downstream calls or middleware triggered by the request.
- Annotate spans with metadata (HTTP method, route, status code, duration) based on semantic conventions.
The resulting trace shows execution timing for each operation in the request lifecycle. Note that this minimal setup doesn't print spans to your terminal on its own; to actually inspect them, export them to a backend (covered next) or plug in a console exporter while you're developing.
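If you want to eyeball spans in the terminal before wiring up a backend, here's a minimal sketch of that console-exporter variant. It uses `ConsoleSpanExporter` from `@opentelemetry/sdk-trace-node`, which the SDK already pulls in as a dependency.

```javascript
// tracing.js — development-only variant that prints finished spans to stdout
const { NodeSDK } = require('@opentelemetry/sdk-node');
const { ConsoleSpanExporter } = require('@opentelemetry/sdk-trace-node');
const { getNodeAutoInstrumentations } = require('@opentelemetry/auto-instrumentations-node');

const sdk = new NodeSDK({
  // Each finished span is logged as a plain object: name, timing, attributes, status
  traceExporter: new ConsoleSpanExporter(),
  instrumentations: [getNodeAutoInstrumentations()],
});

sdk.start();
console.log('Tracing initialized (console exporter)');
```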
Export Traces to a Backend
In production, you want traces sent to a backend that stores them, correlates them across services, and lets your team query, visualize, and alert on them.
Add Required Dependencies
```bash
npm install @opentelemetry/exporter-trace-otlp-http \
  @opentelemetry/resources \
  @opentelemetry/semantic-conventions \
  @opentelemetry/sdk-trace-base
```
- `@opentelemetry/exporter-trace-otlp-http` — Sends traces over OTLP/HTTP to any OpenTelemetry-compatible backend.
- `@opentelemetry/resources` — Defines metadata about the service that produced the telemetry.
- `@opentelemetry/semantic-conventions` — Standard keys for service name, version, environment, and other attributes.
- `@opentelemetry/sdk-trace-base` — Provides the `BatchSpanProcessor` used in the config below.
Configure the Tracing Pipeline
```javascript
// tracing.js
const { NodeSDK } = require('@opentelemetry/sdk-node');
const { getNodeAutoInstrumentations } = require('@opentelemetry/auto-instrumentations-node');
const { Resource } = require('@opentelemetry/resources');
const { SemanticResourceAttributes } = require('@opentelemetry/semantic-conventions');
const { BatchSpanProcessor } = require('@opentelemetry/sdk-trace-base');
const { OTLPTraceExporter } = require('@opentelemetry/exporter-trace-otlp-http');

// Identify the service emitting traces
const resource = new Resource({
  [SemanticResourceAttributes.SERVICE_NAME]: 'checkout-api',
  [SemanticResourceAttributes.SERVICE_VERSION]: '2.1.0',
  [SemanticResourceAttributes.DEPLOYMENT_ENVIRONMENT]: process.env.NODE_ENV || 'development',
});

// Export spans over OTLP/HTTP
const traceExporter = new OTLPTraceExporter({
  url: process.env.OTEL_EXPORTER_OTLP_ENDPOINT || 'http://localhost:4318/v1/traces',
});

// Batch spans for efficient delivery
const sdk = new NodeSDK({
  resource,
  spanProcessor: new BatchSpanProcessor(traceExporter),
  instrumentations: [getNodeAutoInstrumentations()],
});

sdk.start();

// Ensure all spans are flushed before exit
process.on('SIGTERM', () => {
  sdk.shutdown()
    .then(() => console.log('Tracing stopped'))
    .catch((error) => console.error('Error stopping tracing', error))
    .finally(() => process.exit(0));
});
```
What’s Happening Here
- Resource configuration: Labels every span with your service’s name, version, and environment. This makes it possible to filter and group traces across multiple services in your backend.
- OTLP Trace Exporter: Sends traces in the OpenTelemetry Protocol over HTTP. You can point this at your own collector, a managed vendor endpoint, or something like the Last9 Gateway.
- BatchSpanProcessor: Collects multiple spans before sending them, reducing network calls and improving throughput.
- Graceful shutdown: Flushes any spans still in memory before the process terminates, so you don’t lose data during restarts or deployments.
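Many hosted backends also expect an auth header on OTLP requests. The exporter accepts a `headers` option; here's a sketch where the header name and environment variable are placeholders, so use whatever your backend documents.

```javascript
// tracing.js (excerpt) — OTLP exporter with an auth header for a hosted backend
const { OTLPTraceExporter } = require('@opentelemetry/exporter-trace-otlp-http');

const traceExporter = new OTLPTraceExporter({
  url: process.env.OTEL_EXPORTER_OTLP_ENDPOINT || 'http://localhost:4318/v1/traces',
  headers: {
    // placeholder header name and env var; check your backend's ingestion docs
    'x-api-key': process.env.TRACE_BACKEND_API_KEY || '',
  },
});
```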
Running in Production with a Collector
You now have an OTLP exporter pointing to a backend. In many setups, that backend is an OpenTelemetry Collector — either self-hosted or provided by a vendor. The collector acts as a gateway: it receives spans from your app, processes them, and forwards them to one or more storage/analysis systems.
Why Use a Collector
- Centralized config — change exporters and processors without touching application code.
- Data enrichment — add attributes (e.g., `cluster.name`, `team`) before spans reach storage.
- Protocol bridging — accept OTLP from services and send to multiple backends in different formats.
- Load management — batch, retry, and buffer telemetry to handle spikes without dropping data.
Minimal Collector + App Setup
docker-compose.yml
```yaml
version: "3.9"

services:
  otel-collector:
    image: otel/opentelemetry-collector:0.102.0
    command: ["--config=/etc/otel-collector-config.yaml"]
    volumes:
      - ./otel-collector-config.yaml:/etc/otel-collector-config.yaml
    ports:
      - "4317:4317"   # OTLP gRPC
      - "4318:4318"   # OTLP HTTP
    restart: unless-stopped

  checkout-api:
    build: .
    environment:
      - NODE_ENV=production
      - OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4318/v1/traces
    depends_on:
      - otel-collector
    ports:
      - "3000:3000"
```
otel-collector-config.yaml
```yaml
receivers:
  otlp:
    protocols:
      http:
      grpc:

processors:
  batch:
    timeout: 5s
    send_batch_size: 512

exporters:
  logging:
    loglevel: info
  otlphttp:
    # otlphttp appends the signal path (/v1/traces) automatically
    endpoint: https://last9-gateway.example.com
    headers:
      "x-api-key": "${LAST9_API_KEY}"

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [logging, otlphttp]
```
How This Works
- Application → Collector — Your Node.js service sends spans via OTLP/HTTP to the collector.
- Collector Processing — The `batch` processor groups spans and sends them at controlled intervals.
- Export to Multiple Backends — Here, spans are sent both to the console (`logging` exporter) and to a Last9 Gateway (`otlphttp`).
- Switch Backends Without Redeploying — If you later want to also send traces to Jaeger, Last9, or another APM, you add that exporter in the collector config — no code changes in the service.
Add Business Context to Traces
Auto-instrumentation handles the infrastructure-level spans — HTTP requests, database queries, and outbound API calls.
To make traces useful during debugging and performance reviews, you also need spans that describe what the application was actually doing in business terms. This means creating spans manually for critical workflows and attaching attributes that carry domain-specific details.
Example: Tracking a Payment Flow
```javascript
const { trace, SpanStatusCode } = require('@opentelemetry/api');

async function processOrderPayment(orderId, amount, userId) {
  const tracer = trace.getTracer('payment-service');
  const span = tracer.startSpan('processOrderPayment');

  // Add attributes to capture domain-specific context
  span.setAttribute('order.id', orderId);
  span.setAttribute('payment.amount', amount);
  span.setAttribute('user.id', userId);
  span.setAttribute('payment.method', 'credit_card');

  try {
    span.addEvent('Starting payment validation');
    await validatePayment(orderId, amount);

    span.addEvent('Charging payment processor');
    const result = await chargePaymentProcessor(amount);

    span.addEvent('Payment completed');
    span.setAttribute('payment.transaction_id', result.transactionId);
    span.setAttribute('payment.processor_response_time', result.responseTime);

    return result;
  } catch (error) {
    span.recordException(error);
    span.setStatus({
      code: SpanStatusCode.ERROR,
      message: error.message,
    });
    throw error;
  } finally {
    span.end();
  }
}
```
Why This is Important:
- Domain visibility — Instead of just seeing “POST /api/payments failed,” you see exactly which order, user, and payment step failed.
- Actionable error context — Exceptions are tied to the relevant span with full attributes so you can reproduce or investigate quickly.
- Performance insight — Attributes like `payment.processor_response_time` show where time is being spent in business operations.
- Granular events — `span.addEvent()` gives precise timestamps for milestones in the flow, making it easier to spot delays.
Targeted manual spans alongside auto-instrumentation make traces technically complete and operationally valuable — helping with debugging, incident response, and performance tuning.
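One refinement worth knowing: `startSpan` doesn't make the new span the active one, so spans created by auto-instrumentation inside the function (database queries, outbound HTTP calls) won't be nested under it. If you want that nesting, wrap the work in `startActiveSpan`. A sketch of the same flow, with the payment helpers assumed to exist as in the example above:

```javascript
const { trace, SpanStatusCode } = require('@opentelemetry/api');

async function processOrderPayment(orderId, amount, userId) {
  const tracer = trace.getTracer('payment-service');
  // startActiveSpan puts the span into the active context for everything awaited inside
  return tracer.startActiveSpan('processOrderPayment', async (span) => {
    span.setAttribute('order.id', orderId);
    try {
      // chargePaymentProcessor is the same assumed helper as in the example above
      return await chargePaymentProcessor(amount);
    } catch (error) {
      span.recordException(error);
      span.setStatus({ code: SpanStatusCode.ERROR, message: error.message });
      throw error;
    } finally {
      span.end();
    }
  });
}
```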
How to Connect Traces Across Services
One of OpenTelemetry’s biggest advantages is being able to follow a request across multiple services — without manually managing trace IDs.
When your services are instrumented, OpenTelemetry automatically injects trace context into outbound calls and extracts it on the receiving end. This lets you see the entire request path, from the entry point (e.g., API gateway) through downstream services.
Example: API Gateway Calling a User Service
```javascript
const fetch = require('node-fetch');
const { trace, SpanStatusCode } = require('@opentelemetry/api');

async function getUserProfile(userId) {
  const tracer = trace.getTracer('api-gateway');
  const span = tracer.startSpan('getUserProfile');
  span.setAttribute('user.id', userId);

  try {
    // Trace context is automatically propagated in headers
    const response = await fetch(`http://user-service/users/${userId}`);
    const profile = await response.json();

    span.setAttribute('http.status_code', response.status);
    span.setAttribute('user.profile_loaded', Boolean(profile.id));

    return profile;
  } catch (error) {
    span.recordException(error);
    span.setStatus({ code: SpanStatusCode.ERROR, message: error.message });
    throw error;
  } finally {
    span.end();
  }
}
```
Operational Advantages:
- Automatic propagation — The API gateway attaches `traceparent` and `tracestate` headers, so the user service continues the same trace without extra code.
- Unified visibility — Both spans appear in a single trace, letting you see the full path of a request across services.
- No manual ID handling — Developers don’t need to write plumbing code to pass trace IDs through API calls.
- Better debugging — When a request fails downstream, you can trace it back to its origin and see every step in between.
With context propagation enabled, OpenTelemetry turns a collection of service logs into a single, connected story of what happened — crucial for debugging distributed systems.

What This Costs You
Enabling tracing with OpenTelemetry adds minimal, predictable overhead:
- CPU — additional processing to create spans and enrich them with attributes.
- Memory — short-term storage of spans before export.
- Network — small, steady flow of trace data to your backend.
- Latency — slight delay when spans are recorded and flushed.
In return, you gain full visibility into how requests move through your systems — down to the service, operation, and attribute level. That visibility translates directly into faster root cause analysis, quicker fixes, and more confidence in production changes.
For most teams, the operational insight far outweighs the resource cost, which is why OpenTelemetry is increasingly used in production environments from day one.
Common Problems and How to Fix Them
Small configuration issues can stop traces from appearing or cause unnecessary overhead. Here’s how to identify and fix the most common problems:
Problem: “I added tracing but don’t see any spans.”
Fix:
Ensure `require('./tracing')` (or your tracing bootstrap file) is the very first line in your application's entry file. No imports or configuration should run before it, so OpenTelemetry can patch libraries as they load.
Problem: “Traces work locally but not in production.”
Fix:
Confirm that `OTEL_EXPORTER_OTLP_ENDPOINT` is reachable from production (test with `curl` or equivalent). Use `BatchSpanProcessor` instead of `SimpleSpanProcessor` in production to send spans asynchronously.
Problem: “My CPU usage spiked after adding tracing.”
Fix:
Switch to `BatchSpanProcessor` to send spans in batches instead of per request. For very high traffic, enable sampling to reduce span volume, e.g., 10% sampling with `TraceIdRatioBasedSampler`.
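A sketch of what that sampling setup can look like (the 10% ratio is just an example; `ParentBasedSampler` keeps child spans consistent with the decision made at the root of the trace):

```javascript
// tracing.js (excerpt) — keep roughly 10% of traces, but respect upstream sampling decisions
const { NodeSDK } = require('@opentelemetry/sdk-node');
const { ParentBasedSampler, TraceIdRatioBasedSampler } = require('@opentelemetry/sdk-trace-base');
const { getNodeAutoInstrumentations } = require('@opentelemetry/auto-instrumentations-node');

const sdk = new NodeSDK({
  sampler: new ParentBasedSampler({
    root: new TraceIdRatioBasedSampler(0.1), // sample ~10% of new traces
  }),
  instrumentations: [getNodeAutoInstrumentations()],
});

sdk.start();
```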
Problem: “Some traces get cut off when my app restarts.”
Fix:
Listen for termination signals (`SIGTERM`, `SIGINT`) and call `sdk.shutdown()` before exiting. Allow a few seconds for pending spans to be flushed to the backend.
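A compact sketch for handling both signals, assuming the `sdk` instance from your tracing.js is in scope:

```javascript
// tracing.js (excerpt) — flush pending spans on SIGTERM and SIGINT
['SIGTERM', 'SIGINT'].forEach((signal) => {
  process.on(signal, () => {
    sdk.shutdown()
      .then(() => console.log('Tracing stopped'))
      .catch((error) => console.error('Error stopping tracing', error))
      .finally(() => process.exit(0));
  });
});
```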
Choose the Right Backend
Once your app is emitting traces, you need a place to send them. The right choice depends on where you are in your workflow and how much infrastructure you want to manage.
Local development
Run something lightweight on your machine so you can see traces immediately.
Jaeger is a good starting point — spin it up with Docker, point your OTLP exporter at it, and you’ll have a UI to inspect spans and timing in real time.
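If you haven't run Jaeger before, a single container is enough for local work. Here's a sketch (the image tag is just an example); the UI comes up on http://localhost:16686 and OTLP is accepted on 4317/4318:

```bash
docker run --rm \
  -e COLLECTOR_OTLP_ENABLED=true \
  -p 16686:16686 \
  -p 4317:4317 \
  -p 4318:4318 \
  jaegertracing/all-in-one:1.57
```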
Production
Use a managed service that can handle scale, retention, and search speed without you running storage nodes.
Last9, or any other OTLP-compatible backend, can receive your traces, handle indexing, and store them for your chosen retention period. This means you can run fast queries, correlate activity across services, and set up alerts that trigger on specific trace patterns, giving your team actionable insights in near real time.
Custom setups
If you need full control, run your own OpenTelemetry Collector and back it with a storage engine.
- Common choices include Tempo, ClickHouse, or Elasticsearch for storing traces.
- This route gives you flexibility on retention and sampling but also means maintaining the backend yourself.
The best part — OpenTelemetry doesn’t lock you in. Switching from Jaeger to Last9, or from a vendor to your own backend, is a config change, not a rewrite.
The Real Impact
Tracing with OpenTelemetry turns guesswork into clarity. Instead of excavating logs in the dark, you see exactly how each request runs through your system.
Last9 turbocharges this in a few developer-friendly ways:
- OTLP-native ingestion — Just point your exporter at Last9's OTLP endpoint and you're live; the same code works in dev and prod.
- Unified Control Plane — Manage ingestion, storage, query settings, and alert rules from a single dashboard. Filtering, aggregation, retention, and even rehydration are all tunable without redeploying your app.
- Discover Services & Jobs — Visualize service dependencies, performance metrics, error rates, and execution duration. Plus, track background job health (queue backlog, failed tasks, processing times) and immediately dig into related traces and logs.
Setup takes only a few minutes. Get started for free with 100M events every month.
FAQs
How much overhead does OpenTelemetry add to my application?
There isn’t a single, definitive number—actual overhead depends heavily on your application architecture, instrumentation strategy, and traffic patterns. Benchmark studies in Go applications show that full OpenTelemetry SDK instrumentation can increase CPU usage by around 35%, add 5–8 MB of memory, and raise 99th-percentile latency from about 10 ms to 15 ms, along with approximately 4 MB/s of network traffic. For most applications, however, the impact is lower when sampling and BatchSpanProcessor are used.
Can OpenTelemetry work with my existing monitoring tools?
Yes. OpenTelemetry is vendor-neutral by design and can send trace data to many systems: everything from local instances of Jaeger or Zipkin to modern platforms like Last9 that accept OTLP traces.
Do I need to instrument every part of my application?
No. You can start small. Take advantage of auto-instrumentation, which covers common frameworks and libraries with no manual code changes. Then, selectively add manual spans for your core business logic. Many teams find that auto-instrumentation alone delivers most of the visibility they need.
How do I manage the volume of telemetry data?
Use sampling to control trace volume and cost. Head-based (probabilistic) sampling reduces span load upfront, while tail-based sampling decides which traces to keep after seeing the complete trace, which is useful if only slow or error flows matter. A common pattern is to sample 10% of traffic but always capture errors and slow responses.
Can OpenTelemetry help with frontend performance?
Yes—if you also instrument the browser with OpenTelemetry’s Web SDK. You can track user-level metrics like First Contentful Paint, Time to Interactive, and trace fetch requests. This gives visibility across the full user journey, not just backend services.
How does OpenTelemetry compare to other monitoring approaches?
OpenTelemetry is an open standard that provides structured and interconnected telemetry, including traces, metrics, and logs, across various tools. Unlike proprietary APMs, it doesn’t lock you into a single vendor. And compared to logging or DIY metric setups, it follows semantic conventions that make your data immediately actionable and portable.
Is OpenTelemetry production-ready?
Yes. While some parts of the spec continue to evolve, SDKs like the JavaScript implementation are stable and widely used in production environments—from early-stage startups to enterprise fleets. Upgrades are generally non-breaking, and the project sees regular updates and broad community adoption.