You're sampling 1% of traces in production. A payment request fails at 3 AM. Logs show an error in order-service, but the full picture isn't there because different services made different sampling decisions. order-service kept the trace; payment-service didn't. So you end up checking logs and timestamps across a few services to piece things together.
This happens because the usual probability sampling approach makes a separate choice at each service boundary. Each service decides on its own whether to keep or drop the trace, so the end-to-end view can vary across a multi-service flow.
OpenTelemetry's Consistent Probability Sampling updates this model, fixing a long-standing TODO from the original tracing spec. The first service makes the sampling choice, and that choice is passed through the rest of the trace. This keeps the decision steady across services and helps retain the full path when you need it.
The Challenge With Independent Sampling
If you've worked with distributed tracing for a while, you've probably seen this happen. You set a 10% sampling rate, but when something goes wrong, the full trace isn't always there. Not because of a misconfiguration — it's simply how independent sampling works.
Most setups use TraceIdRatioBased sampling. Each service looks at the trace ID and makes its own call about keeping or dropping the span. No shared state. No coordination. Just separate decisions at every hop.
Here's what that means in practice with three services at 10% each:
frontend-service → 10% chance kept
├─ order-service → 10% chance kept
└─ payment-service → 10% chance kept
The chance of seeing the entire trace becomes:
0.1 × 0.1 × 0.1 = 0.001 (0.1%)

So even though you set 10%, the number of complete traces ends up much lower. It's just a side effect of each service acting on its own view of the request.
This is exactly what OpenTelemetry's Consistent Probability Sampling tries to improve. Instead of every service making a separate choice, the first decision travels with the trace, so the outcome stays the same across the entire path.
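To see why the numbers collapse like that, here's a small standalone simulation (illustrative only, not SDK code). It compares three independent 10% decisions per trace against a single shared 10% decision:

package main

import (
	"fmt"
	"math/rand"
)

func main() {
	const trials = 1_000_000
	const rate = 0.1 // 10% sampling

	independent, shared := 0, 0
	for i := 0; i < trials; i++ {
		// Independent sampling: each of three services flips its own coin,
		// so the full trace survives only if all three say "keep".
		if rand.Float64() < rate && rand.Float64() < rate && rand.Float64() < rate {
			independent++
		}
		// Consistent sampling: one decision, shared by every hop.
		if rand.Float64() < rate {
			shared++
		}
	}
	fmt.Printf("complete traces (independent): %.3f%%\n", 100*float64(independent)/trials)
	fmt.Printf("complete traces (consistent):  %.3f%%\n", 100*float64(shared)/trials)
}

Running it prints roughly 0.1% for the independent case and 10% for the shared one, matching the math above.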
What Consistent Probability Sampling Does
Consistent Probability Sampling gives every service in a trace the same information to make the same decision. Instead of each hop guessing on its own, the decision is shared in a predictable way.
It relies on two values stored in the W3C tracestate:
Randomness Value (R): A 56-bit number tied to the trace. It's either set directly or derived from the trace ID. Because it's part of the trace, every service sees the same value.
Rejection Threshold (T): This is calculated from your sampling rate and carried through the request as ot=th:value. Since it travels with the trace, each service receives the same threshold.
The decision is straightforward:
if T <= R: keep the span
else: drop it

Because R is stable and T moves with the trace, every service ends up making the same choice. If you configure 10% sampling, you get 10% of full traces rather than different fragments from different services.
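As a rough sketch of the rule (not the SDK's internals), the comparison is plain unsigned integer math on the two 56-bit values carried with the trace:

// keep is a hypothetical helper showing the consistent decision rule.
// threshold (T) comes from the sampling rate; randomness (R) comes from the trace.
func keep(threshold, randomness uint64) bool {
	// T == 0 means 100% sampling, so every randomness value clears the bar.
	return randomness >= threshold
}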
How the Threshold Calculation Works
The threshold is simply a numeric representation of your sampling rate. It's calculated like this:
T = (1 - sampling_probability) × 2^56

Here's how it looks for a few common rates:
# 100% sampling
T = 0
TraceState: ot=th:0
# 50% sampling
T = 36,028,797,018,963,968
TraceState: ot=th:8
# 10% sampling
T ≈ 64,851,834,634,135,142
TraceState: ot=th:e666666666666
# 1% sampling
T ≈ 71,337,018,097,548,657
TraceState: ot=th:fd70a4

OpenTelemetry stores this threshold in hexadecimal, rounds it to a practical number of digits, and drops trailing zeros to keep TraceState small (except for 0 itself, which represents 100% sampling). In practice, four hex digits are usually enough for production setups.
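If you want to reproduce these thresholds yourself, a minimal sketch looks like this. The SDKs also round the threshold to a handful of hex digits of precision, which is why 1% shows up above as fd70a4 rather than a full 14-digit value, so the trailing digits here can differ slightly:

package main

import (
	"fmt"
	"strings"
)

// thresholdHex converts a sampling probability into the hex form used in ot=th:...
// Illustrative only; the SDK applies its own rounding and precision rules.
func thresholdHex(samplingProbability float64) string {
	const scale = 1 << 56
	t := uint64((1 - samplingProbability) * float64(scale))
	if t == 0 {
		return "0" // 100% sampling keeps an explicit zero
	}
	// 14 hex digits cover 56 bits; trailing zeros are dropped to keep TraceState small.
	return strings.TrimRight(fmt.Sprintf("%014x", t), "0")
}

func main() {
	for _, p := range []float64{1, 0.5, 0.25, 0.1, 0.01} {
		fmt.Printf("p=%.2f -> ot=th:%s\n", p, thresholdHex(p))
	}
}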
When a span arrives with something like ot=th:fd70a4 in its tracestate, every service knows the exact threshold for that trace. Each one simply compares its shared R value against T and arrives at the same decision.
TraceState Carries the Decision
W3C Trace Context defines a tracestate header that lets systems carry additional information alongside traceparent. OpenTelemetry uses the ot key with a th sub-key to hold the threshold:
traceparent: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01
tracestate: ot=th:c

The th:c value represents a 25% sampling rate. Any service that sees this header follows the same 25% decision for the entire trace.
If a trace doesn't include an explicit randomness value, OpenTelemetry derives R from the last 7 bytes of the trace ID, following W3C Trace Context Level 2 rules. Level 2 introduces the Random Trace Flag, which requires the least-significant 56 bits of the trace ID to be random; that's the last 14 hexadecimal digits, or 7 bytes.
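As a rough illustration of that derivation (a hypothetical helper, not the SDK's code), R is simply the low 56 bits of the trace ID:

import (
	"fmt"
	"strconv"
)

// randomnessFromTraceID extracts the implicit 56-bit randomness value R
// from the last 14 hex digits (7 bytes) of a 32-character trace ID.
func randomnessFromTraceID(traceIDHex string) (uint64, error) {
	if len(traceIDHex) != 32 {
		return 0, fmt.Errorf("expected 32 hex characters, got %d", len(traceIDHex))
	}
	return strconv.ParseUint(traceIDHex[18:], 16, 64)
}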
For traces with non-random TraceIDs, you can use explicit randomness instead:
tracestate: ot=rv:abcdef01234567

This rv field lets you provide your own 56-bit randomness value. It's useful for:
- Achieving consistent sampling across multiple independent traces by using the same randomness
- Translating external sampling decisions (like hash-based ones) into OpenTelemetry consistent sampling
tracestate moves automatically through standard HTTP headers and gRPC metadata, so the information travels across the whole request path without extra configuration.
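The only thing to verify on your side is that the W3C propagator is registered, which most OpenTelemetry setups already do. In Go that looks like:

import (
	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/propagation"
)

func init() {
	// TraceContext propagates both traceparent and tracestate,
	// so the th and rv values survive every hop.
	otel.SetTextMapPropagator(propagation.TraceContext{})
}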
Steps to Implement Consistent Sampling in Go
If you're setting this up in Go, the good news is the SDK already has what you need. You just pick the sampler for the service that starts traces, and another for services that mostly continue traces.
1. For services that start traces
You make the first sampling decision here, so this is where the consistent sampler runs:
package main
import (
"go.opentelemetry.io/contrib/samplers/probability/consistent"
"go.opentelemetry.io/otel"
"go.opentelemetry.io/otel/sdk/trace"
)
func initTracer() {
// Make the initial sampling decision (example: 1%).
root := consistent.ProbabilityBased(0.01)
// Ensure every child span follows that decision.
sampler := consistent.ParentProbabilityBased(root)
tp := trace.NewTracerProvider(
trace.WithSampler(sampler),
)
otel.SetTracerProvider(tp)
}

This setup makes sure the first hop decides once, and every downstream span sticks to that choice.
2. For services that don't usually start traces
These services almost always receive a traceparent, so they just follow whatever decision has already been made:
func initDownstreamTracer() {
sampler := consistent.ParentProbabilityBased(
trace.NeverSample(), // Only used if a trace starts here unexpectedly.
)
tp := trace.NewTracerProvider(
trace.WithSampler(sampler),
)
otel.SetTracerProvider(tp)
}

That's it. Most of the work comes from forwarding headers, not from the sampler itself.
What to keep an eye on
- If you look at incoming headers, you should see tracestate: ot=th:... on sampled traces (see the middleware sketch after this list).
- Requests that share the same traceparent should all be kept or all be dropped, across every hop.
- Your backend should show roughly the sampling rate you configured, but now with complete traces instead of fragments.
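A quick way to check the first two points is a tiny piece of logging middleware (hypothetical, for debugging only):

import (
	"log"
	"net/http"
)

// logTraceHeaders prints the incoming trace headers before passing the request on.
func logTraceHeaders(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		log.Printf("traceparent=%q tracestate=%q",
			r.Header.Get("traceparent"), r.Header.Get("tracestate"))
		next.ServeHTTP(w, r)
	})
}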
Steps to Implement in Java
And if you're doing the same in Java, here's the equivalent setup.
The pattern is the same: one sampler for the first hop, and parent-based for everything else.
1. Fixed sampling rate
import io.opentelemetry.contrib.sampler.consistent56.*;
import io.opentelemetry.sdk.trace.samplers.ParentBased;
import io.opentelemetry.sdk.trace.samplers.Sampler;
// ~1% of full traces
Sampler sampler =
ParentBased.builder(
ConsistentFixedThresholdSampler.create(0.01)
).build();

2. With simple rules
Here's an example where you skip health checks but sample everything else:
import io.opentelemetry.contrib.sampler.consistent56.*;
Sampler ruleBased =
ConsistentRuleBasedSampler.builder()
.addRule(
key -> key.equals("http.target"),
value -> value.startsWith("/health"),
ConsistentAlwaysOffSampler.getInstance()
)
.setFallbackSampler(ConsistentFixedThresholdSampler.create(0.01))
.build();
Sampler sampler =
ParentBased.builder(ruleBased).build();

How to confirm it's working
- Turn on header logging for one service and look for ot=th:...
- Trigger a request that touches multiple services and confirm you get either the full trace or no trace at all
- If you're mixing Go and Java, both will understand the same tracestate fields
Migrate From TraceIdRatioBased
If you're moving to Consistent Probability Sampling, do it in small steps. The idea is simple: keep your current behavior steady while you roll out consistency.
Phase 1: Add Parent-Based Wrappers
Wrap your existing sampler so child spans follow the parent's choice.
// Before
sampler := trace.TraceIDRatioBased(0.01)
// After
sampler := trace.ParentBased(
trace.TraceIDRatioBased(0.01),
)

Roll this out to all services. The root decision still comes from TraceIdRatioBased, but every child now respects it.
Phase 2: Upgrade Entry (Root) Services
Find services that start traces and switch them to the consistent sampler.
import "go.opentelemetry.io/contrib/samplers/probability/consistent"
sampler := consistent.ParentProbabilityBased(
consistent.ProbabilityBased(0.01), // 1%
)

Ship this to your entry points first. These will start emitting tracestate with the threshold, so the decision is shared across hops.
Phase 3: Upgrade Downstream Services
Now switch internal services to the same consistent setup:
sampler := consistent.ParentProbabilityBased(
consistent.ProbabilityBased(0.01),
)

This staggered rollout keeps behavior predictable while you move everything over.
Use Declarative Configuration
OpenTelemetry's declarative format lets you express the same plan without changing code.
Basic parent-based with 1% root sampling:
file_format: '1.0-rc.1'
tracer_provider:
sampler:
parent_based:
root:
trace_id_ratio_based:
ratio: 0.01

Rule-based example (skip health checks):
file_format: '1.0-rc.1'
tracer_provider:
sampler:
rule_based_routing:
fallback_sampler:
trace_id_ratio_based:
ratio: 0.01
span_kind: SERVER
rules:
- action: DROP
attribute: url.path
pattern: /health
- action: RECORD_AND_SAMPLE
attribute: http.status_code
pattern: 5..

This setup skips /health, always keeps 5xx responses, and samples everything else at 1%.
Language support (today):
- Java: Fully supported (experimental, agent 2.21+)
- JavaScript: In progress via opentelemetry-configuration
- Go: Partial support via go.opentelemetry.io/contrib/otelconf (see the loading sketch after this list)
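For Go, loading such a file might look like the sketch below. This assumes the otelconf package exposes ParseYAML, NewSDK, and WithOpenTelemetryConfiguration as in recent contrib releases; check the module's docs for the exact API in your version:

import (
	"context"
	"os"

	"go.opentelemetry.io/contrib/otelconf"
	"go.opentelemetry.io/otel"
)

// setupFromYAML builds the SDK from a declarative config file and
// returns a shutdown function to call on exit.
func setupFromYAML(path string) (func(context.Context) error, error) {
	raw, err := os.ReadFile(path)
	if err != nil {
		return nil, err
	}
	cfg, err := otelconf.ParseYAML(raw)
	if err != nil {
		return nil, err
	}
	sdk, err := otelconf.NewSDK(otelconf.WithOpenTelemetryConfiguration(*cfg))
	if err != nil {
		return nil, err
	}
	otel.SetTracerProvider(sdk.TracerProvider())
	return sdk.Shutdown, nil
}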
Integrate With the Collector
You can also tune sampling in the OpenTelemetry Collector with the probabilistic sampler processor. The upgraded processor now keeps the original sampling decision and encodes probability in the OpenTelemetry TraceState.
Proportional mode — scales incoming decisions:
processors:
probabilistic_sampler:
mode: proportional
sampling_percentage: 10

If services already send traces sampled at 10%, the end-to-end rate works out to about 1%.
Equalizing mode — normalizes everything to a target:
processors:
probabilistic_sampler:
mode: equalizing
sampling_percentage: 1

Brings mixed inputs to ~1% end-to-end.
Hash seed mode — uses an attribute when the trace ID isn't available:
processors:
probabilistic_sampler:
mode: hash_seed
sampling_percentage: 10
attribute_source: record
      from_attribute: service.instance.id

Head Sampling vs. Tail Sampling
Consistent Probability Sampling is a head sampling method. The decision to keep or drop a trace is made as soon as the trace starts, and every service follows that same decision. This works well when your traffic is steady, high-volume, and you want predictable resource use.
Head sampling is a good fit when your services are generating thousands of traces per second, and most of that traffic looks routine. Since the choice happens early, you avoid buffering overhead and keep CPU, memory, and network usage tied directly to the sampling rate.
Where head sampling shines:
- High volumes of repetitive, healthy requests
- Cost-sensitive environments
- Traces where the decision doesn't depend on the outcome
Tail sampling, on the other hand, waits until the full trace is known. That gives you more control in situations where the decision depends on the result of the request, but it comes with memory and buffering costs.
Tail sampling is useful when:
- You need every error trace
- The decision requires full trace data
- Traffic volume is low enough to buffer everything
Performance Shape
With head sampling, the reduction is straightforward.
If you process 10,000 traces/sec and sample at 1%, you export about 100 traces/sec.
This translates into:
- ~99% less CPU for span processing
- ~99% lower memory pressure
- ~99% less network usage
Tail sampling isn't as light.
At 10,000 TPS with a 2-second average trace duration, you may hold roughly 20,000 in-flight traces' worth of spans in memory: around 200 MB raw, often 500 MB–1 GB after overhead.
When To Use Consistent Sampling
Consistent sampling fits most high-scale systems where full retention isn't practical, and where you want a clean, complete view of sampled traces. It works particularly well in microservice setups with many hops, since a single decision carries across the entire path.
It's not ideal when every error trace must be captured or when compliance requires complete retention. In those cases, tail sampling for specific paths or transactions may be the better tool.
For most large systems, 1–5% at the trace root gives you stable sampling, complete traces, and manageable backend costs — without fragmenting data across services.
How To Start Using Consistent Probability Sampling
You can roll out Consistent Probability Sampling in a few clear steps. The idea is simple: make one sampling decision at the start of a trace and have every service follow it.
1. Upgrade your SDK dependencies
Go:
go get go.opentelemetry.io/contrib/samplers/probability/consistent

Java (Gradle):
implementation 'io.opentelemetry.contrib:opentelemetry-samplers-consistent-probability:1.33.0-alpha'

2. Configure root samplers on services that start traces
These services create the first sampling decision.
import "go.opentelemetry.io/contrib/samplers/probability/consistent"
sampler := consistent.ParentProbabilityBased(
consistent.ProbabilityBased(0.01), // 1%
)

3. Use parent-based samplers everywhere else
Downstream services rarely start traces, so they should simply follow whatever decision the parent has already made.
sampler := consistent.ParentProbabilityBased(
trace.NeverSample(), // used only if a trace starts here unexpectedly
)

Reliable End-to-End Traces With Consistent Sampling
Consistent Probability Sampling keeps the sampling decision inside your services. Last9 simply works with whatever your OpenTelemetry setup sends — no backend sampling, no overrides, no hidden filtering.

When traces arrive, we read the TraceState metadata and store the trace exactly as your instrumentation generated it. This ensures you see complete traces for the requests your system decided to sample, at a volume you already control.
What our platform supports:
- Full compatibility with W3C Trace Context (Level 2)
- Automatic reading of sampling information from TraceState
- Correct grouping and representation of sampled traces across services
- Integration with OpenTelemetry SDKs, Collectors, and declarative configs without extra setup
If you want to see how this looks with real traffic, you can start a free trial or book some time with us for a detailed walkthrough.