OpenTelemetry agents automatically instrument your application at runtime without requiring code changes — you attach them at startup, and they inject tracing, metrics, and logs using bytecode manipulation or eBPF.
If you're running production services and want observability without refactoring every microservice, agents are the fastest path. They hook into your runtime (JVM, .NET CLR, Python interpreter, Node.js V8) and intercept framework calls, database queries, HTTP requests, and more. The tradeoff is performance overhead and less control compared to manual SDK instrumentation.
Here's what you need to know: agents typically add 5-10% CPU overhead and 20-50MB memory depending on your language and traffic volume. For most teams, that's acceptable. If you're running latency-sensitive services at extreme scale, you'll want to benchmark first. But for legacy apps, third-party services, or rapid rollouts across dozens of microservices, agents are usually the right call.
This blog covers how agents work, when to use them vs the SDK, installation patterns for Java, .NET, Python, and Node.js, performance impact, common production issues, and how to route telemetry to backends that won't explode your costs.
How OpenTelemetry Agents Work
OpenTelemetry agents run inside your application process and modify your code at runtime. They don't change your source files — they intercept function calls and inject instrumentation logic dynamically.
The mechanics vary by language:
- Java: Uses the `-javaagent` JVM flag to load bytecode transformers that rewrite classes as they're loaded. Hooks into Spring, Tomcat, JDBC, gRPC, and more.
- .NET: Uses CLR profiling APIs (`CORECLR_ENABLE_PROFILING`) to inject IL instructions into methods at startup.
- Python: Wraps framework entry points using Python's import hooks and function decorators. Works with Flask, Django, FastAPI, and others.
- Node.js: Uses Node's `--require` flag to load instrumentation modules before your app starts. Intercepts HTTP, Express, and database drivers.
- Go: No traditional agent, because Go's static compilation prevents runtime bytecode manipulation. Instead, you use auto-instrumentation libraries at compile time.
Once attached, the agent automatically creates spans for incoming requests, outgoing HTTP calls, database queries, and message queue operations. It exports telemetry via OTLP (OpenTelemetry Protocol) to your backend.
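To make that concrete, here's a conceptual Python sketch of what auto-instrumentation boils down to: wrapping a library entry point so every call produces a span. This is an illustration only, not any agent's actual code; the real agents achieve the same effect through bytecode rewriting, CLR profiling, or import hooks as described above.

```python
from opentelemetry import trace
import functools

tracer = trace.get_tracer("conceptual-agent")

def instrument(fn):
    """Wrap a function so each call is recorded as a span (illustrative only)."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        with tracer.start_as_current_span(fn.__qualname__):
            return fn(*args, **kwargs)
    return wrapper

# An agent applies wrappers like this to framework entry points
# (request handlers, database cursor methods) without touching your source.
@instrument
def handle_request(path):
    return f"hello from {path}"
```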
What Gets Instrumented Automatically
Most agents cover:
- HTTP servers and clients
- Database drivers (PostgreSQL, MySQL, MongoDB, Redis)
- gRPC and messaging systems (Kafka, RabbitMQ, SQS)
- Framework-specific logic (Spring, Django, Express, ASP.NET Core)
The exact coverage depends on the language. Java has the most mature agent with 100+ supported libraries. Python and .NET are close behind. Node.js coverage is improving, but it still has gaps for newer frameworks.
If your app uses an unsupported library, the agent won't instrument it automatically. You'll need to add manual spans using the OpenTelemetry SDK or submit a contribution to the agent's instrumentation registry.
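For the manual route, here's a minimal Python sketch of wrapping a call to an unsupported client in a manual span, assuming the agent (opentelemetry-instrument) is already attached and has configured the tracer provider; LegacyClient and the span and attribute names are hypothetical stand-ins.

```python
from opentelemetry import trace

class LegacyClient:
    """Stand-in for an unsupported third-party library."""
    def fetch(self, invoice_id):
        return {"id": invoice_id}

legacy_client = LegacyClient()

# Picks up the tracer provider the agent configured at startup.
tracer = trace.get_tracer("my-service.manual")

def fetch_invoice(invoice_id: str):
    # Manual span so the uninstrumented call shows up in the same trace
    # as the agent's auto-generated spans.
    with tracer.start_as_current_span("legacy_client.fetch_invoice") as span:
        span.set_attribute("invoice.id", invoice_id)
        return legacy_client.fetch(invoice_id)
```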
OpenTelemetry Agent vs SDK: Which Should You Use?
Use the agent if:
- You need instrumentation across legacy apps with no time for code changes
- You're rolling out observability to dozens of services quickly
- Your team doesn't have the bandwidth to manually instrument every endpoint
- You're okay with ~5-10% overhead and limited control over span semantics
Use the SDK if:
- You're building a new service and want full control over what gets traced
- You need custom attributes, span events, or fine-grained sampling logic
- Your app is latency-sensitive, and you can't tolerate agent overhead
- You want to control exactly which operations create spans
Here's a quick comparison:
| Feature | Agent | SDK |
|---|---|---|
| Code changes required | None | Yes (manual instrumentation) |
| Performance overhead | 5-10% CPU, 20-50MB memory | <1% (depends on your code) |
| Framework support | Automatic for popular frameworks | Manual for everything |
| Custom attributes | Limited (via env vars) | Full control |
| Sampling control | Basic (env vars, config) | Advanced (custom samplers) |
| Debugging complexity | Higher (agent internals are opaque) | Lower (you control the code) |
Examples:
If you're instrumenting a legacy Java monolith running Spring Boot, use the agent. Retrofitting manual SDK calls into thousands of controllers and services isn't worth it.
If you're building a Go microservice from scratch and need to track specific business logic (like payment processing stages), use the SDK. Go doesn't have a runtime agent anyway, and you'll want control over span naming and attributes.
If you're running a polyglot system with services in Java, Python, Node.js, and .NET, start with agents for consistency. You can always mix in SDK instrumentation later for critical paths.
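As a sketch of what that mixing looks like in practice (Python here, with hypothetical span names and a hypothetical process_payment helper), the SDK lets you wrap business stages in custom spans while the agent keeps handling HTTP and database calls underneath:

```python
from opentelemetry import trace

tracer = trace.get_tracer("payments")

def process_payment(order_id: str, amount_cents: int):
    # Parent span for the business operation; spans the agent creates for
    # outgoing HTTP calls and queries become children of this span.
    with tracer.start_as_current_span("payment.process") as span:
        span.set_attribute("order.id", order_id)
        span.set_attribute("payment.amount_cents", amount_cents)

        with tracer.start_as_current_span("payment.authorize"):
            pass  # call the payment gateway here

        with tracer.start_as_current_span("payment.capture"):
            pass  # capture the authorized charge here

        span.add_event("payment.completed", {"order.id": order_id})
```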
How to Install OpenTelemetry Agents in Production
Agent installation varies by language, but the pattern is always the same: download the agent artifact, pass it to the runtime at startup, and configure exporters via environment variables.
Java
Download the latest OpenTelemetry Java agent JAR from the GitHub releases page. Then add it to your JVM startup:
java -javaagent:/path/to/opentelemetry-javaagent.jar \
-Dotel.service.name=my-service \
-Dotel.exporter.otlp.endpoint=http://localhost:4318 \
  -jar my-application.jar
Key environment variables:
- `OTEL_SERVICE_NAME`: Service identifier in traces
- `OTEL_EXPORTER_OTLP_ENDPOINT`: Where to send telemetry (OTLP backend)
- `OTEL_TRACES_EXPORTER`: Set to `otlp` (default) or `none` to disable
- `OTEL_METRICS_EXPORTER`: Set to `otlp` or `none`
- `OTEL_LOGS_EXPORTER`: Set to `otlp` or `none`
Production note: Don't use the latest tag in production. Pin a specific agent version (e.g., 1.32.0) to avoid unexpected behavior from auto-updates.
.NET
Install the OpenTelemetry .NET Automatic Instrumentation via script or NuGet. Then set CLR profiling environment variables:
export CORECLR_ENABLE_PROFILING=1
export CORECLR_PROFILER={918728DD-259F-4A6A-AC2B-B85E1B658318}
export CORECLR_PROFILER_PATH=/path/to/OpenTelemetry.AutoInstrumentation.Native.so
export OTEL_DOTNET_AUTO_HOME=/path/to/otel-dotnet-auto
export OTEL_SERVICE_NAME=my-service
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
dotnet MyApp.dll
Docker example:
FROM mcr.microsoft.com/dotnet/aspnet:8.0
WORKDIR /app
COPY --from=otel/autoinstrumentation-dotnet:latest /autoinstrumentation /otel-auto
ENV CORECLR_ENABLE_PROFILING=1 \
CORECLR_PROFILER={918728DD-259F-4A6A-AC2B-B85E1B658318} \
CORECLR_PROFILER_PATH=/otel-auto/linux-x64/OpenTelemetry.AutoInstrumentation.Native.so \
OTEL_DOTNET_AUTO_HOME=/otel-auto \
OTEL_SERVICE_NAME=my-dotnet-service \
OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4318
COPY ./publish /app
ENTRYPOINT ["dotnet", "MyApp.dll"]Python
Install the OpenTelemetry Python auto-instrumentation packages:
pip install opentelemetry-distro opentelemetry-exporter-otlp
opentelemetry-bootstrap -a install
Then wrap your application startup:
export OTEL_SERVICE_NAME=my-service
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
opentelemetry-instrument python my_app.py
For Flask apps:
# No code changes needed — just run with the wrapper
opentelemetry-instrument flask run
Note: The opentelemetry-instrument wrapper automatically detects installed frameworks (Flask, Django, FastAPI, SQLAlchemy) and instruments them. If you use a niche library, you may need to manually instrument it with the SDK.
Node.js
Install the OpenTelemetry Node.js auto-instrumentation package:
npm install @opentelemetry/sdk-node @opentelemetry/auto-instrumentations-node @opentelemetry/exporter-trace-otlp-http
Create a tracing.js file:
const { NodeSDK } = require('@opentelemetry/sdk-node');
const { getNodeAutoInstrumentations } = require('@opentelemetry/auto-instrumentations-node');
const { OTLPTraceExporter } = require('@opentelemetry/exporter-trace-otlp-http');
const sdk = new NodeSDK({
serviceName: 'my-service',
traceExporter: new OTLPTraceExporter({
url: 'http://localhost:4318/v1/traces',
}),
instrumentations: [getNodeAutoInstrumentations()],
});
sdk.start();
Then require it before your app starts:
node --require ./tracing.js app.js
Note: If you're using TypeScript with ts-node, use --require with the compiled JS version of tracing.js, not the TS source. Otherwise, you'll hit module resolution errors.
Kubernetes Deployment with OpenTelemetry Operator
The easiest way to deploy agents in Kubernetes is the OpenTelemetry Operator, which injects agents as init containers automatically. The operator is a CNCF project with production-grade stability.
Install the operator:
kubectl apply -f https://github.com/open-telemetry/opentelemetry-operator/releases/latest/download/opentelemetry-operator.yaml
Create an Instrumentation resource:
apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
  name: my-instrumentation
spec:
  exporter:
    endpoint: http://otel-collector:4318
  propagators:
    - tracecontext
    - baggage
  sampler:
    type: parentbased_traceidratio
    argument: "1.0"
  java:
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-java:latest
  dotnet:
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-dotnet:latest
  python:
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-python:latest
  nodejs:
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-nodejs:latest
Annotate your deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
      annotations:
        instrumentation.opentelemetry.io/inject-java: "true"
    spec:
      containers:
        - name: app
          image: my-app:latest
The operator will inject the agent as an init container, mount it into your pod, and configure the necessary environment variables. Your app starts with instrumentation already attached.
Note: Pin operator and agent image versions in production. Using latest can introduce breaking changes during pod restarts.
OpenTelemetry Agent Performance Overhead
Agent overhead varies by language, traffic volume, and how many libraries you're instrumenting. Based on OpenTelemetry's official performance documentation and community benchmarks, here's what production deployments typically observe:
| Language | CPU Overhead | Memory Overhead | Latency Impact |
|---|---|---|---|
| Java | 5-10% | 30-50MB heap | +1-3ms per request |
| .NET | 5-8% | 20-40MB | +1-2ms per request |
| Python | 8-12% | 15-30MB | +2-5ms per request |
| Node.js | 6-10% | 20-35MB | +1-4ms per request |
These numbers come from running agents on apps handling 5k-10k requests per second. Your mileage will vary based on your stack.
When Overhead Becomes a Problem
Agents add overhead in two ways:
- Instrumentation hooks — Every intercepted function call runs agent code (span creation, context propagation, attribute collection)
- Telemetry export — Agents batch and send spans to the backend, which consumes CPU and network bandwidth
If you're running high-throughput services (50k+ req/sec) or latency-sensitive APIs (p99 < 10ms), agent overhead can become noticeable. In those cases:
- Use head-based sampling — Only trace 1% or 10% of requests to reduce span volume
- Disable metric collection — Metrics have higher overhead than traces
- Tune batch sizes — Increase batch size and export intervals to reduce network calls (see the sketch after this list)
- Profile agent impact — Use JFR (Java), dotTrace (.NET), or py-spy (Python) to see where the agent is spending time
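On the batch-tuning point: OTLP exporters batch spans through a batch span processor, and its knobs are exposed as the OTEL_BSP_MAX_QUEUE_SIZE, OTEL_BSP_SCHEDULE_DELAY, and OTEL_BSP_MAX_EXPORT_BATCH_SIZE environment variables, which agents read at startup. Here's a hedged Python SDK sketch of the same knobs set in code; the values are illustrative, not recommendations.

```python
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter

provider = TracerProvider()
provider.add_span_processor(
    BatchSpanProcessor(
        OTLPSpanExporter(endpoint="http://localhost:4318/v1/traces"),
        max_queue_size=4096,           # default 2048; buffers spans before export
        schedule_delay_millis=10_000,  # default 5000; export less often
        max_export_batch_size=1024,    # default 512; fewer, larger network calls
    )
)
trace.set_tracer_provider(provider)
```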
Troubleshooting OpenTelemetry Agents
Agents fail silently more often than they should. Here's how to debug common issues.
Spans Not Showing Up in Your Backend
Check the exporter endpoint:
# Make sure the endpoint is reachable
curl -X POST http://localhost:4318/v1/traces -H "Content-Type: application/json" -d '{"resourceSpans":[]}'
If the endpoint isn't responding, your agent can't send telemetry. Check firewall rules, network policies, or service mesh configs.
Enable agent debug logging:
For Java:
-Dotel.javaagent.debug=true
For .NET:
export OTEL_LOG_LEVEL=debug
For Python:
export OTEL_LOG_LEVEL=debug
For Node.js:
const { diag, DiagConsoleLogger, DiagLogLevel } = require('@opentelemetry/api');
diag.setLogger(new DiagConsoleLogger(), DiagLogLevel.DEBUG);
Debug logs will show you if the agent is attaching correctly, which libraries it's instrumenting, and whether spans are being exported.
High CPU Usage from the Agent
If your app's CPU spikes after enabling the agent, you're probably tracing too much.
Disable instrumentation for specific libraries:
For Java:
-Dotel.instrumentation.[library-name].enabled=false
# Example: disable Kafka instrumentation
-Dotel.instrumentation.kafka.enabled=false
For Python:
export OTEL_PYTHON_DISABLED_INSTRUMENTATIONS=flask,sqlalchemy
For Node.js, exclude instrumentations in your tracing.js:
instrumentations: [
getNodeAutoInstrumentations({
'@opentelemetry/instrumentation-fs': { enabled: false },
}),
]Reduce sampling rate:
export OTEL_TRACES_SAMPLER=parentbased_traceidratio
export OTEL_TRACES_SAMPLER_ARG=0.1 # Sample 10% of traces
ClassLoader Conflicts (Java)
If your Java app throws ClassNotFoundException or NoClassDefFoundError after attaching the agent, you've hit a version conflict between the agent's bundled libraries and your app's dependencies.
Fix:
- Check which library is conflicting (usually SLF4J, Guava, or OkHttp)
- Exclude it from the agent:
-Dotel.javaagent.exclude-classes=com.google.common.*,org.slf4j.*
- If that doesn't work, upgrade your app's dependency to match the agent's version (check the agent's POM file for exact versions)
Agent Not Instrumenting a Custom Framework
If you're using a niche web framework or database driver, the agent might not have instrumentation for it yet.
Check the instrumentation registry:
- Java supported libraries
- .NET supported libraries
- Python supported libraries
- Node.js supported libraries
If your library isn't listed, you have two options:
- Manually instrument it using the SDK — Add span creation calls around the library's entry points
- Contribute instrumentation to the OpenTelemetry project — Write a plugin and submit a PR
Configure Sampling and Exporters
Agents export telemetry to backends via OTLP. By default, they send 100% of spans, which can get expensive at scale.
Head-Based Sampling
Head-based sampling decides at the start of a trace whether to record it. If a trace is sampled, all spans in that trace are kept. If not, the entire trace is dropped (OpenTelemetry sampling documentation).
Set sampling rate via environment variables:
export OTEL_TRACES_SAMPLER=parentbased_traceidratio
export OTEL_TRACES_SAMPLER_ARG=0.1 # 10% sampling
Sampling strategies:
- `always_on`: Sample every trace (default)
- `always_off`: Drop every trace (useful for disabling tracing)
- `traceidratio`: Sample based on a trace ID hash (deterministic, distributed-friendly)
- `parentbased_traceidratio`: Respect the parent's sampling decision; otherwise, use the trace ID ratio
For most production systems, 10% sampling gives you enough trace coverage without overwhelming your backend. If you need more granular control (e.g., sample 100% of errors but 1% of successful requests), you'll need tail-based sampling, which requires an OpenTelemetry Collector.
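If you're configuring head-based sampling in code through the SDK rather than through agent environment variables, here's a minimal Python sketch of the equivalent setup, using the same 0.1 ratio as the env var example above:

```python
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.sampling import ParentBased, TraceIdRatioBased

# Equivalent to OTEL_TRACES_SAMPLER=parentbased_traceidratio with
# OTEL_TRACES_SAMPLER_ARG=0.1: root spans are sampled at 10%, child
# spans follow their parent's decision so traces stay complete.
provider = TracerProvider(sampler=ParentBased(TraceIdRatioBased(0.1)))
trace.set_tracer_provider(provider)
```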
Exporter Configuration
Agents support multiple exporters. The most common is OTLP over HTTP:
export OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4318
export OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
export OTEL_EXPORTER_OTLP_HEADERS="x-api-key=YOUR_API_KEY"
For OTLP over gRPC:
export OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4317
export OTEL_EXPORTER_OTLP_PROTOCOL=grpcNote: Use an OpenTelemetry Collector as a sidecar or central gateway rather than sending telemetry directly from agents to your backend. The collector can buffer spans, apply tail-based sampling, enrich data with metadata, and route to multiple backends. It also isolates your app from backend downtime — if your observability vendor goes down, the collector will queue spans locally until the backend recovers.
Why Last9 Works Well with OpenTelemetry Agents
Once you've deployed OpenTelemetry agents, you're sending spans, metrics, and logs to a backend. The challenge: agent-generated telemetry is high-cardinality and high-volume. Auto-instrumentation creates spans for every HTTP request, database query, and cache hit — often with dozens of attributes per span.
Traditional observability backends struggle with this. They force you to either:
- Drop 95% of spans via aggressive sampling — You lose trace coverage and can't debug tail latency or rare errors
- Pay exponentially more as cardinality grows — Costs spike when you instrument more services or add custom attributes
- Hit query timeouts — Backends built for metrics can't handle complex trace queries at scale
Last9 solves this by treating high-cardinality telemetry as a first-class problem, not an edge case.
No Sampling Required at the Agent Level
You can send 100% of spans from your agents without exploding storage costs or query performance. That means:
- No blind spots — You see every error, every slow query, every outlier
- Accurate percentiles — p99 latency calculations aren't skewed by sampling artifacts
- Better root cause analysis — You can trace individual requests end-to-end, even if they failed 0.01% of the time
Faster Queries on Messy Trace Data
Agents generate messy telemetry: auto-generated span names, attribute explosion, and inconsistent tagging across services. Last9's query engine is optimized for this reality. You can filter, aggregate, and visualize trace data without hitting timeouts or needing to pre-aggregate everything.
For example, querying "show me all spans where http.status_code >= 500 and db.latency > 100ms across the last 7 days" works instantly, even when the query touches millions of spans. Traditional backends either can't run this query or require you to pre-aggregate it into a custom metric.
Cost Predictability
Unlike backends that charge per span or per GB ingested, Last9's pricing model doesn't penalize you for instrumenting everything. You're not forced to choose between observability and budget.
OpenTelemetry agents emit detailed context: routes, users, flags, builds, and more. That level of detail is where high-cardinality data shines. Last9 is designed to handle it end-to-end, so you can keep your instrumentation intact as systems scale.
You can try Last9 free with 100M events per month and bring in your existing OpenTelemetry data in under 5 minutes.
FAQs
What's the difference between the OpenTelemetry agent and the SDK?
The agent instruments your app automatically at runtime without code changes. The SDK requires manual instrumentation in your code. Use the agent for legacy apps or quick rollouts; use the SDK for fine-grained control over spans, attributes, and sampling.
How much overhead does an OpenTelemetry agent add?
Typically 5-10% CPU and 20-50MB memory, depending on language and traffic volume. Java agents have the highest overhead; eBPF-based agents (for compiled languages like C++) have the lowest. Always benchmark in staging before rolling out to production.
Can I use OpenTelemetry agents in Kubernetes?
Yes. The easiest way is the OpenTelemetry Operator, which injects agents as init containers. Alternatively, bake the agent into your Docker image and set environment variables in your deployment YAML. Both approaches work — the operator is just more convenient for managing instrumentation at scale.
Do OpenTelemetry agents work with proprietary APM tools?
Yes, if the APM tool supports OTLP (OpenTelemetry Protocol). Last9, Datadog, New Relic, and Dynatrace all accept OTLP. You just point the agent's exporter to their endpoint. Some vendors also provide their own OpenTelemetry distributions with vendor-specific optimizations.
What happens if the OpenTelemetry agent crashes?
The agent runs in the same process as your app, so a crash can take down the app. In practice, agents are stable — the OpenTelemetry project has extensive test coverage and production usage at scale. Still, test in staging first and monitor agent-specific errors (e.g., ClassLoader conflicts in Java) in production.
Should I sample traces at the agent level or use tail-based sampling?
Use head-based sampling at the agent if you're okay with simple probabilistic sampling (e.g., "trace 10% of all requests"). Use tail-based sampling if you need smarter logic (e.g., "trace 100% of errors and slow requests, but only 1% of fast successful requests"). Tail-based sampling requires an OpenTelemetry Collector, which adds operational complexity but gives you a much better signal-to-noise ratio.
Can I use OpenTelemetry agents alongside manual SDK instrumentation?
Yes. Agents and SDK instrumentation can coexist in the same app. The agent handles auto-instrumentation for frameworks and libraries, while you use the SDK to add custom spans, attributes, and events for business logic. Just make sure both are configured to use the same exporter and sampling settings.