The standard OpenTelemetry Java Agent is a remarkable piece of engineering. Drop one JAR on the JVM, set a couple of environment variables, and you get distributed tracing across hundreds of libraries with zero code changes. For most Java apps, it just works.
For Vert.x apps, it doesn't.
If you have ever tried it, you have probably seen one of these in your traces: spans that show as separate roots instead of children of an inbound HTTP request, async HTTP calls that lose context the moment they cross an event-loop boundary, RxJava chains where the trace context evaporates after the first flatMap, log lines with no trace_id, and on Java 21, virtual thread support that quietly breaks. These aren't niche edge cases. They're tracked as open issues against opentelemetry-java-instrumentation — including #11860 and #10526 — and they affect every Vert.x deployment that tries to use the upstream agent.
So we built a zero-code javaagent specifically for Vert.x: last9/vertx-opentelemetry. The latest stable is v2.3.4 (March 2026). It's a single JAR per stack, no Maven dependency, no code changes, Java 8 through 21, and it works with Vert.x's event-loop model instead of against it.
This post walks through:
- Why the upstream OTel Java Agent fails on Vert.x — the ThreadLocal assumption and the RxJava thread-hopping problem
- How the agent works on Vert.x 4 — the native VertxTracer SPI plus ByteBuddy where the SPI doesn't reach
- How the agent works on Vert.x 3 — pure ByteBuddy bytecode rewriting
- RxJava context propagation across operators
- What gets instrumented end-to-end (Netty, Router, JDBC, Kafka, Redis, Aerospike, RESTEasy, SQS)
- Java 8 support — and why it actually matters for Vert.x 3 fleets
- Vert.x internal metrics via the Micrometer → OTel bridge
- Log-trace correlation without touching logback.xml
- Why no Maven dependency — shaded internals and the JAR-conflict problem
- Quick start, troubleshooting, and links to the repo
Why the upstream OTel Java Agent breaks on Vert.x
OpenTelemetry context propagation in Java — like all of OTel-Java — is built on io.opentelemetry.context.Context. The default storage is ThreadLocal. A span starts on Thread A, anything Thread A does until the span ends inherits that context, and the moment work hops to Thread B, the SDK expects you (or the agent) to explicitly carry context across.
That model is fine for thread-per-request servers. It is fundamentally wrong for Vert.x.
Vert.x runs on a small pool of event-loop threads. A single event loop handles requests for many simultaneous clients, hopping between them as I/O completes. The framework attaches its own per-request context to each callback via io.vertx.core.Context — not to the thread. When you write:
```java
HttpServer server = vertx.createHttpServer();
server.requestHandler(req -> {
  webClient.get(8081, "downstream", "/api").send(ar -> {
    req.response().end(ar.result().bodyAsString());
  });
});
```

…the inbound request handler runs on event-loop thread #1, but the webClient.get(...).send(ar -> ...) callback might run on event-loop thread #2 or #3 depending on which loop happens to be free. From OTel's perspective, the callback runs in a fresh thread with a fresh ThreadLocal. The context is gone.
The upstream agent's bytecode auto-instrumentation compensates for this in many libraries by wrapping callbacks at the bytecode level — when you submit a Runnable to an ExecutorService, the agent rewrites it to capture the current context and re-install it on the worker thread. But Vert.x doesn't go through ExecutorService for its event loop. It manages its own dispatch.
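The capture-and-restore idea the agent applies to ExecutorService submissions can be sketched in plain Java. This is an illustrative model, not the agent's actual code: the ThreadLocal below stands in for OTel's Context, and the class and variable names are made up for the demo.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Demonstrates the gap: a ThreadLocal set on the caller does not follow a
// task onto another thread unless something carries it across. SPAN stands
// in for OTel's Context; all names here are illustrative.
public class ContextLossDemo {
    static final ThreadLocal<String> SPAN = new ThreadLocal<>();

    public static void main(String[] args) throws Exception {
        SPAN.set("span-on-thread-A");
        ExecutorService loop = Executors.newSingleThreadExecutor();

        // Unwrapped callback: runs on another thread, sees no context.
        String bare = loop.submit(() -> String.valueOf(SPAN.get())).get();

        // What the upstream agent's Runnable/Callable wrapping effectively
        // does: capture on submit, reinstall on the worker thread.
        String captured = SPAN.get();
        String wrapped = loop.submit(() -> {
            SPAN.set(captured);
            return SPAN.get();
        }).get();

        loop.shutdown();
        System.out.println(bare + " / " + wrapped); // null / span-on-thread-A
    }
}
```

The first submission prints null because nothing carried the value across; the second works only because the wrapper did the carrying — which is exactly the step Vert.x's own dispatch never passes through.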
RxJava makes the problem worse. Operators like subscribeOn, observeOn, and flatMap can switch threads at any point in the chain. Even if you start an Rx pipeline with the right context on the calling thread, by the time the chain has fanned out across Schedulers.io() and Schedulers.computation(), the context has been replaced multiple times. The upstream agent does have RxJava plugin hooks, but they assume the standard threading model — they don't compose well with Vert.x's own event-loop scheduling, and the result is partial: some spans connect, others don't, and you can't easily tell which.
For Java 21 virtual threads, the picture gets worse: the upstream agent's ThreadLocal propagation doesn't transfer cleanly across virtual thread parks (#10526). Every modern Vert.x deployment is moving toward virtual threads, and the upstream agent is not ready.
So: the upstream agent isn't broken in isolation. It's correctly designed for ThreadLocal-based concurrency. Vert.x just isn't that. We needed something that knew about Vert.x's context model.
Vert.x 4: native VertxTracer SPI plus targeted ByteBuddy
Vert.x 4 ships with a built-in tracing SPI: io.vertx.core.spi.VertxTracer. If you provide an implementation, Vert.x calls into it for every HTTP request, every event bus message, every SQL query, every Kafka send and receive — all the integration points that matter — and it provides the originating Vert.x Context so you can attach trace state to the right scope, not the wrong thread.
The agent installs a VertxTracer implementation at startup by transforming the VertxOptions constructor before the application's main class loads. Once that's in place, the rest of the v4 stack is automatically traced via the SPI:
| Component | How | What you get |
|---|---|---|
| HTTP server | VertxTracer SPI | SERVER spans for every request |
| HTTP client | VertxTracer SPI | CLIENT spans + traceparent injection |
| EventBus | VertxTracer SPI | INTERNAL spans for send/publish |
| SQL client (PgPool, MySQLPool) | VertxTracer SPI | CLIENT spans with SQL |
| Redis client | VertxTracer SPI | CLIENT spans |
| Kafka | VertxTracer SPI | PRODUCER/CONSUMER spans (see Kafka with OTel for context-propagation details) |
For everything the SPI doesn't cover — the Vert.x Web Router, third-party clients like Jedis and Lettuce, raw JDBC, Aerospike, RESTEasy, AWS SQS — the agent uses ByteBuddy to rewrite specific methods at class-load time. The Router instrumentation is the most important piece beyond the SPI: without it, your spans are named after the literal request path (GET /v1/users/12345, GET /v1/users/12346, …), which produces unbounded span-name cardinality. The agent reads the route pattern from the matched route at request time and uses it as the span name and http.route attribute (GET /v1/users/:id).
This split — native SPI for the core stack, ByteBuddy for the rest — is what keeps the v4 agent small and predictable. Most of the surface area is delegating to a Vert.x extension point that already exists; we don't try to monkey-patch what the framework gives us a clean way to plug into.
Vert.x 3: pure ByteBuddy bytecode rewriting
Vert.x 3 has no VertxTracer SPI. So the v3 agent is pure ByteBuddy.
At JVM startup, the agent registers transformers for every class it intends to instrument. When the JVM loads io.vertx.core.http.impl.HttpServerImpl, the transformer fires, rewrites requestHandler() to wrap the user-supplied handler in a tracing wrapper, and the rewritten class is what gets loaded. The original source code on disk is untouched. The original JAR on disk is untouched. ByteBuddy intercepts at the class-load boundary — the rewrite is invisible to anyone who isn't reading bytecode.
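The JDK hook underneath this is worth seeing, because it explains why nothing on disk changes. A rough sketch of the load-time interception point (the real agent drives this through ByteBuddy's AgentBuilder from premain(); the rewriting itself is elided here, and the class name is the one from the post):

```java
import java.lang.instrument.ClassFileTransformer;
import java.security.ProtectionDomain;

// A ClassFileTransformer sees every class as it is loaded and may return
// replacement bytecode. Returning null leaves the class untouched.
public class LoadTimeTransformerSketch implements ClassFileTransformer {
    static final String TARGET = "io/vertx/core/http/impl/HttpServerImpl";

    @Override
    public byte[] transform(ClassLoader loader, String className,
                            Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain,
                            byte[] classfileBuffer) {
        if (!TARGET.equals(className)) {
            return null; // not our class: load it unchanged
        }
        // Here ByteBuddy would rewrite requestHandler() to wrap the user
        // handler in a tracing wrapper and return the new bytes. This
        // sketch just passes the original bytecode through.
        return classfileBuffer;
    }

    public static void main(String[] args) {
        LoadTimeTransformerSketch t = new LoadTimeTransformerSketch();
        byte[] fake = new byte[] {1, 2, 3};
        System.out.println(t.transform(null, "java/lang/String", null, null, fake) == null);
        System.out.println(t.transform(null, TARGET, null, null, fake) == fake);
    }
}
```

The transformer only ever sees in-memory bytecode on its way into the JVM, which is why the original JAR stays untouched.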
The full v3 instrumentation set:
| Component | Span Kind | Key Attributes |
|---|---|---|
| Netty HTTP server | SERVER | http.request.method, url.path, http.response.status_code |
| Router (RxJava2 + core) | SERVER | http.route (pattern, not literal path) |
| WebClient | CLIENT | url.full, server.address, http.response.status_code |
| JDBCClient | CLIENT | db.system, db.name, db.statement |
| KafkaProducer / KafkaConsumer | PRODUCER / CONSUMER | messaging.destination.name, batch counts |
| AerospikeClient | CLIENT | db.system=aerospike, net.peer.name |
| MySQLPool / PgPool | CLIENT | db.system, db.statement |
| Jedis (Pool, Cluster, Pipeline) | CLIENT | db.system=redis, db.statement |
| Lettuce (sync/async/reactive) | CLIENT | db.system=redis, db.statement |
| Raw JDBC (Statement.execute*) | CLIENT | db.system (auto-detected), db.statement |
| Netty HTTP client | CLIENT | http.method, net.peer.name |
| RESTEasy (JAX-RS) | — | @Path templates → http.route |
| AWS SQS (SDK v1 + v2) | CONSUMER | messaging.system=AmazonSQS |
Span attributes follow the OpenTelemetry semantic conventions for HTTP, DB, and messaging. The agent does the transformation in the SDK so you don't have to think about it — your traces and metrics in any OTel-compatible backend look the same as they would for a Spring Boot app instrumented with the upstream agent.
RxJava context propagation across operators
This is the part that takes the most engineering attention.
RxJava lets operators choose their own threads. subscribeOn(Schedulers.io()) runs the chain's source on an I/O thread. observeOn(Schedulers.computation()) switches to a computation thread mid-pipeline. flatMap can fan out work across multiple threads. None of those threads have your trace context unless someone explicitly carries it across.
The fix: RxJava exposes RxJavaPlugins.setOn*Assembly hooks that fire every time an operator is constructed. The agent installs assembly-time hooks that capture the current OTel Context at the moment the operator is built (which is on the calling thread, with the right context) and replays it whenever the operator runs (which may be on a totally different thread).
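The mechanism can be modeled without RxJava at all: wrap an operation at the moment it is assembled (on the calling thread, context present) so that it reinstalls that context whenever it later runs. A minimal sketch, with a plain ThreadLocal standing in for OTel's Context and all names illustrative:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

// Assembly-time capture, run-time replay: the same shape as an
// RxJavaPlugins assembly hook, reduced to a Supplier wrapper.
public class AssemblyHookSketch {
    static final ThreadLocal<String> CONTEXT = new ThreadLocal<>();

    static <T> Supplier<T> wrapAtAssembly(Supplier<T> op) {
        String captured = CONTEXT.get();   // runs on the assembling thread
        return () -> {
            String previous = CONTEXT.get();
            CONTEXT.set(captured);         // replay on whatever thread executes
            try {
                return op.get();
            } finally {
                CONTEXT.set(previous);     // restore, so threads aren't polluted
            }
        };
    }

    public static void main(String[] args) throws Exception {
        CONTEXT.set("trace-abc123");
        Supplier<String> op = wrapAtAssembly(() -> CONTEXT.get());

        ExecutorService pool = Executors.newSingleThreadExecutor();
        String seen = pool.submit(op::get).get(); // executes on a different thread
        pool.shutdown();
        System.out.println(seen); // trace-abc123
    }
}
```

The agent does this once, globally, at the operator-construction seam RxJava exposes, so every operator in every chain gets the wrapper for free.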
The result is that you can write completely standard RxJava:
```java
webClient.get(8081, "service-a", "/users").rxSend()
    .flatMap(resp -> dbClient.rxQuery("SELECT * FROM accounts WHERE user_id = ?")
        .map(rows -> enrich(resp, rows)))
    .subscribeOn(Schedulers.io())
    .observeOn(Schedulers.computation())
    .subscribe(result -> req.response().end(result));
```

…and every span in that chain — the WebClient call, the DB query, the eventual response handler — connects to the inbound SERVER span. No withContext(), no manual span propagation, no thread-local scopes opened and closed.
This is the single biggest reason a Vert.x-specific agent is worth having. RxJava context propagation isn't optional for reactive Java services. It's how you avoid being lied to by your own traces.
W3C traceparent injection on every outbound call
Once the agent has the right context, W3C trace context propagation on outgoing calls is the easy part. Every WebClient request, every Netty HTTP client call, and every Kafka producer record gets a traceparent header injected automatically. Downstream services that read traceparent (any OTel-instrumented service does) will pick the inbound trace up and continue it.
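The header itself is simple: per the W3C Trace Context spec, four dash-separated lowercase-hex fields. A minimal sketch of its construction (the trace and span ids below are the W3C spec's own example values):

```java
// Builds a W3C traceparent header: version "00", a 16-byte trace id,
// an 8-byte parent span id, and trace flags ("01" = sampled).
public class TraceparentDemo {
    static String traceparent(String traceId, String spanId, boolean sampled) {
        return "00-" + traceId + "-" + spanId + "-" + (sampled ? "01" : "00");
    }

    public static void main(String[] args) {
        // Example ids taken from the W3C Trace Context specification.
        String header = traceparent(
                "4bf92f3577b34da6a3ce929d0e0e4736", // 32 hex chars
                "00f067aa0ba902b7",                 // 16 hex chars
                true);
        System.out.println(header);
        // 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01
    }
}
```

Any OTel-instrumented downstream parses this header, adopts the trace id, and creates its spans as children of the span id it received.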
This means cross-service traces work the moment you put the agent on both sides — without any application code knowing about OTel at all. For mixed deployments (some services on this agent, some on Spring Boot with the upstream agent, some on Python with opentelemetry-python), they interoperate because everyone speaks the same wire format.
Java 8 support — and why it matters for Vert.x 3
The original v1.x agent required Java 11+. v2.2.2 (March 2026) made the agent fully work on Java 8.
This is not a curiosity. Vert.x 3 fleets in the wild are still very often on Java 8 — particularly long-lived services that were deployed before Vert.x 4 existed and haven't yet justified the JDK upgrade. The upstream OTel Java Agent dropped Java 8 support in 2.0, which means anyone running Java 8 Vert.x 3 simply can't use it. Telling those teams "upgrade to Java 11 first" is a non-starter for production fleets that have been stable for years.
To keep the agent working on Java 8, the build is constrained on multiple axes: OTel SDK 1.38.0 (the last release supporting Java 8), ByteBuddy 1.14.x, and the OkHttp-based OTLP sender instead of the JDK 11+ java.net.http.HttpClient. The OkHttp sender is shaded under io.last9.internal.okhttp3 so it doesn't conflict with applications that bundle their own OkHttp. JVM metrics — memory, GC, threads, CPU, classes — all export from Java 8 just like they do from Java 21.
For the Java instrumentation story more broadly, this matters because it removes the most common excuse for not adding observability: "we'd love to but our runtime is too old." For Vert.x 3 on Java 8, there is no longer a valid excuse.
Vert.x internal metrics via the Micrometer → OTel bridge
Beyond traces, the agent exports Vert.x's own internal metrics: HTTP connection pool sizes, event bus message rates, event-loop lag, worker pool utilization, SQL pool wait times. These are the metrics you actually want when an event-loop thread starts spending too long on a single task or when a database pool is saturated.
Vert.x 4 emits these metrics natively via the Micrometer MetricsOptions SPI. The agent installs a MicrometerMetricsOptions configured to publish to a MeterRegistry that bridges into the OTel SDK. From your backend's perspective, they look like ordinary OTel metrics — same OTLP endpoint, same authentication, same labels. You don't run a Micrometer-aware backend and an OTel-aware backend side by side; everything funnels through OTLP.
This is one of the cleanest examples of OTel and Micrometer not being in opposition. Vert.x already produces Micrometer metrics. We don't replace that — we route the output through the OTel SDK so it lands wherever you ship OTel data.
Log-trace correlation without touching logback.xml
The agent auto-installs a Logback turbo filter that injects trace_id and span_id into every log line's MDC. No %X{trace_id} placeholder edits. No logback.xml changes. No application code changes.
If your backend correlates logs to traces by these fields (Last9 does, as do most modern logging backends), every log line emitted during a request handler — from any library, including third-party libraries the application doesn't control — gets the matching trace context attached automatically. You click a span in the trace UI and see the exact log lines emitted during that span. That flow is what people imagine when they hear "observability"; the disappointment most teams hit is that the correlation isn't actually configured. The agent makes it the default.
For applications that use a different logging framework (Log4j 2, SLF4J Simple, java.util.logging), the agent currently focuses on Logback because that's what the vast majority of Vert.x apps use. The architecture is general — there's no fundamental reason it can't extend — but Logback is the priority.
Why no Maven dependency
Most observability libraries ship as a Maven artifact you add to your pom.xml. We deliberately don't.
Adding io.last9:vertx3-rxjava2-otel-autoconfigure:2.3.4 as a runtime dependency would put unshaded versions of OTel SDK, ByteBuddy, and OkHttp on your application classpath. The agent itself ships those same libraries shaded under io.last9.internal.*. If both versions are present, the application's classloader will load the unshaded ones first — and the agent's instrumented code, which expects the shaded versions, gets a no-op tracer because the shaded class names don't match anything. You see no spans, no log correlation, no metrics. The agent is loaded, but silently disabled.
The fix is structural: don't ship a Maven artifact at all for the javaagent path. Distribute only the JAR, and tell users to drop it in via -javaagent:. The agent loads on the system classloader, all its internals are shaded, and there's no dependency surface for the application to accidentally collide with.
A library mode does still exist — useful for environments where you can't pass a JVM flag (some packaged appliances, certain serverless runtimes) — but it's a fallback. The headline path is the agent. For distributed tracing setup more broadly, the lesson is that runtime instrumentation belongs at the runtime layer, not the build layer.
Quick start
```shell
# Set OTel environment variables
export OTEL_SERVICE_NAME=my-vertx-service
export OTEL_EXPORTER_OTLP_ENDPOINT=https://otlp.last9.io
export OTEL_EXPORTER_OTLP_HEADERS="Authorization=Basic <token>"

# Download the agent for your Vert.x version
curl -L -o vertx3-otel-agent.jar \
  https://github.com/last9/vertx-opentelemetry/releases/download/v2.3.4/vertx3-otel-agent-2.3.4.jar

# Run with the agent attached
java -javaagent:vertx3-otel-agent.jar -jar my-app.jar
```

For Vert.x 4, swap vertx3-otel-agent → vertx4-otel-agent. That's the entire setup.
In Docker:
```dockerfile
FROM eclipse-temurin:11-jre-alpine
COPY target/my-app.jar /app/my-app.jar
COPY vertx3-otel-agent.jar /app/vertx3-otel-agent.jar
CMD ["java", "-javaagent:/app/vertx3-otel-agent.jar", "-jar", "/app/my-app.jar"]
```

The agent supports the standard OpenTelemetry SDK environment variables — OTEL_SERVICE_NAME, OTEL_EXPORTER_OTLP_ENDPOINT, OTEL_EXPORTER_OTLP_HEADERS, OTEL_RESOURCE_ATTRIBUTES, OTEL_TRACES_SAMPLER, and so on. If you've configured an OTel-instrumented JVM before, the same env vars work here.
Troubleshooting common failure modes
Disconnected traces (outgoing calls show as separate root spans)
Look for bytecode instrumentation installed in the agent's startup log — confirms the agent loaded and ByteBuddy transformers are active. Then look for RxJava context propagation installed — confirms the assembly hooks are wired. If both are present and downstream calls still don't link, the downstream service likely isn't reading traceparent. Verify with curl -v on the downstream and look for the inbound header.
No spans exported at all
Three culprits, in order of likelihood: OTEL_EXPORTER_OTLP_ENDPOINT is wrong (check for Connection refused in stderr), authentication is wrong (look for HTTP 401/403), or the agent failed to initialize the OTel SDK (look for Tracer is NO-OP in stderr — this means the SDK never set up). Set OTEL_LOG_LEVEL=debug for verbose export logs.
CLIENT spans present but no SERVER spans
The most common cause is a Maven dependency on the unshaded library colliding with the agent's shaded internals. Look for WARNING: HttpServerAdviceHelper is missing version marker in the startup log. If it's there, find and remove the io.last9:vertx*-otel-autoconfigure dependency from your pom.xml. The fix is in the dependency tree, not the agent.
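To check for the offending dependency directly, Maven's dependency tree can be filtered by group id (pattern syntax is groupId:artifactId; the group id here is the one from the post):

```shell
# List any io.last9 artifacts that leaked onto the application classpath.
# Anything this prints should be removed from the pom -- the agent JAR
# supplies its own shaded copies.
mvn dependency:tree -Dincludes='io.last9:*'
```

An empty result means the classpath is clean and the collision lies elsewhere.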
Send your Vert.x telemetry to Last9
The agent is OTLP-native and works against any OTel-compatible backend. Last9 happens to be one such backend, with cardinality-tolerant ingest that handles the kind of label fan-out reactive services tend to produce — http.route patterns, db.statement summaries, Kafka topic dimensions, and per-instance tags from cloud resource detection.
If your reactive Java services are blind today because the upstream OTel Java Agent didn't fit, this is the path. One JAR, no Maven, no code changes.
Download vertx-opentelemetry v2.3.4 → Start sending Vert.x telemetry to Last9 →
References
- last9/vertx-opentelemetry — repo and releases
- A Practical Guide to the OpenTelemetry Java Agent — Last9
- OpenTelemetry Agents: A Production Guide for Zero-Code Instrumentation — Last9
- How OpenTelemetry Auto-Instrumentation Works — Last9
- Auto Instrumentation: An In-Depth Guide — Last9
- Instrumenting Java Apps with OpenTelemetry — Last9
- Getting Started with OpenTelemetry Java SDK — Last9
- OpenTelemetry Context Propagation for Better Tracing — Last9
- Traceparent: How OpenTelemetry Connects Your Microservices — Last9
- Implement Distributed Tracing with OpenTelemetry — Last9
- OpenTelemetry Spans Explained — Last9
- Kafka with OpenTelemetry: Distributed Tracing Guide — Last9
- OpenTelemetry vs Micrometer: How to Decide — Last9
- opentelemetry-java-instrumentation #11860 — broken Vert.x async HTTP context propagation
- opentelemetry-java-instrumentation #10526 — virtual thread propagation
