Processors and Transforms
Reduce telemetry volume and shape data before it reaches Last9 using the filter, transform, batch, and memory_limiter processors in the OpenTelemetry Collector.
Processors run on telemetry data between the receiver and the exporter. They are the primary tool for reducing volume, dropping noise, and enriching data before it reaches Last9.
This guide covers the four processors most commonly used in production deployments:
| Processor | What it does |
|---|---|
| `filter` | Drop spans, logs, or metrics that match a condition |
| `transform` | Rename, enrich, or modify telemetry using OTTL expressions |
| `memory_limiter` | Prevent the collector from OOMing under traffic spikes |
| `batch` | Buffer and flush data in efficient batches |
All processors must be listed in the `service.pipelines` section to take effect.
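For example, a minimal trace pipeline wiring two of the processors covered below (a sketch; the receiver and exporter names mirror the full example at the end of this guide):

```yaml
service:
  pipelines:
    traces:
      receivers: [otlp]
      # Each name here must match a processor defined under the top-level processors: section
      processors: [memory_limiter, batch]
      exporters: [otlp/last9]
```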
filter processor
The filter processor drops telemetry that matches one or more OTTL conditions. Use it to eliminate spans, logs, or metrics that add volume without adding value.
Drop internal spans
Internal spans from frameworks like GraphQL routers, ORM layers, and service meshes can account for the majority of your trace volume. Drop them to reduce ingestion by up to 60–70%:
```yaml
processors:
  filter/drop_internal_spans:
    error_mode: ignore
    traces:
      span:
        - 'kind == SPAN_KIND_INTERNAL and status.code != STATUS_CODE_ERROR'
```

Drop internal spans for specific services
When you need finer control — for example, dropping internal spans only from known high-volume services:
```yaml
processors:
  filter/drop_internal_spans:
    error_mode: ignore
    traces:
      span:
        - 'resource.attributes["service.name"] == "gql-router" and kind == SPAN_KIND_INTERNAL'
        - 'resource.attributes["service.name"] == "kong" and kind == SPAN_KIND_INTERNAL'
        - 'resource.attributes["service.name"] == "api-gateway" and kind == SPAN_KIND_INTERNAL and status.code != STATUS_CODE_ERROR'
```

Drop database noise spans
Transaction bookkeeping spans (BEGIN, COMMIT, ROLLBACK) inflate trace counts without providing actionable information:
```yaml
processors:
  filter/drop_db_noise:
    error_mode: ignore
    traces:
      span:
        - 'attributes["db.system"] != "" and IsMatch(name, "^(BEGIN|COMMIT|ROLLBACK)$")'
```

Drop logs by severity
Drop DEBUG and TRACE logs to reduce log volume in production. Logs at INFO and above pass through:
```yaml
processors:
  filter/drop_debug_logs:
    error_mode: ignore
    logs:
      log_record:
        - 'severity_number < SEVERITY_NUMBER_INFO'
```

Drop redundant metric buckets
For Prometheus histograms, you often only need one of _count, _sum, or _bucket depending on your use case. Drop the ones you don’t query:
```yaml
processors:
  filter/drop_histogram_sum:
    error_mode: ignore
    metrics:
      datapoint:
        - 'IsMatch(metric.name, ".*_sum$")'
```

transform processor
The transform processor modifies telemetry in place using OTTL statements. Use it to rename spans, normalize operation names, add missing attributes, or fix instrumentation gaps.
Remove unique IDs from span names
Auto-instrumented frameworks sometimes embed request-specific values (UUIDs, user IDs, numeric IDs) in span names, creating unbounded cardinality in APM:
```yaml
processors:
  transform/normalize_span_names:
    error_mode: ignore
    trace_statements:
      - context: span
        statements:
          # Remove query params: "GET /users?id=abc123" → "GET /users"
          - replace_pattern(name, "\\?.*$", "")
          # Remove numeric path segments: "/api/users/12345/orders" → "/api/users/{id}/orders"
          - replace_pattern(name, "/[0-9]+", "/{id}")
          # Remove UUIDs: "/sessions/550e8400-e29b-41d4-a716-446655440000" → "/sessions/{uuid}"
          - replace_pattern(name, "/[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}", "/{uuid}")
```

Fix GraphQL span visibility in APM
Apollo Server and other GraphQL frameworks emit both the HTTP transport span and the named operation span as SERVER kind. This inflates throughput 2× in APM and breaks the Operations tab. The fix demotes the HTTP layer to INTERNAL so the named operation becomes the single source of truth:
```yaml
processors:
  transform/fix_graphql_spans:
    error_mode: ignore
    trace_statements:
      # Step 1: Give the HTTP span a meaningful name before demotion
      - context: span
        conditions:
          - instrumentation_scope.name == "@opentelemetry/instrumentation-http" and kind == SPAN_KIND_SERVER
        statements:
          - set(name, Concat([attributes["http.method"], " ", attributes["http.target"]], "")) where IsString(attributes["http.target"])

      # Step 2: Demote HTTP SERVER → INTERNAL when a named GraphQL operation span exists
      - context: span
        conditions:
          - instrumentation_scope.name == "@opentelemetry/instrumentation-http" and kind == SPAN_KIND_SERVER and IsMatch(name, ".*/graphql.*")
        statements:
          - set(kind, SPAN_KIND_INTERNAL)

      # Step 3: Add http.method to GraphQL operation spans so the APM Operations tab shows them
      - context: span
        conditions:
          - IsString(attributes["graphql.operation.type"]) and kind == SPAN_KIND_SERVER and attributes["http.method"] == nil
        statements:
          - set(attributes["http.method"], "POST")
          - set(attributes["http.status_code"], "500") where attributes["http.status_code"] == nil and status.code == STATUS_CODE_ERROR
          - set(attributes["http.status_code"], "200") where attributes["http.status_code"] == nil
```

Add static labels to CloudWatch metrics
CloudWatch metrics arrive without service_name or environment labels, making service-level alerting impossible. Use a transform processor to add them:
```yaml
processors:
  transform/enrich_cloudwatch:
    error_mode: ignore
    metric_statements:
      - context: datapoint
        statements:
          - set(attributes["service_name"], "payments-service") where resource.attributes["aws.cloudwatch.namespace"] == "AWS/ApplicationELB"
          - set(attributes["deployment_environment"], "production")
```

Propagate resource attributes to span attributes
Some backends and dashboards expect certain attributes on the span rather than the resource. Copy them across:
```yaml
processors:
  transform/promote_resource_attrs:
    error_mode: ignore
    trace_statements:
      - context: span
        statements:
          - set(attributes["k8s.namespace"], resource.attributes["k8s.namespace.name"]) where attributes["k8s.namespace"] == nil
          - set(attributes["host"], resource.attributes["host.name"]) where attributes["host"] == nil
```

memory_limiter processor
The memory_limiter processor protects the collector from OOMing under sudden traffic spikes. It checks memory usage on an interval and begins dropping data when usage crosses a threshold.
Always place memory_limiter first in every pipeline.
```yaml
processors:
  memory_limiter:
    check_interval: 1s
    limit_percentage: 80
    spike_limit_percentage: 25
```

| Field | Value | Meaning |
|---|---|---|
| `check_interval` | 1s | How often to check memory usage |
| `limit_percentage` | 80 | Hard limit as a percentage of available memory; above it the processor forces garbage collection and drops data |
| `spike_limit_percentage` | 25 | Spike headroom subtracted from `limit_percentage`; the resulting soft limit is where backpressure begins |

With these values, the processor starts refusing new data at the soft limit of 55% of available memory (80 minus 25), and the hard limit kicks in at 80%.
Setting the memory limit for the collector process
Set an explicit memory limit for the collector process via the `GOMEMLIMIT` environment variable (the older `--mem-ballast-size-mib` flag and memory ballast approach are deprecated in recent Collector versions). If the collector runs in Kubernetes, set a memory limit on the pod and configure `memory_limiter` against 80% of that limit:
```yaml
# Kubernetes pod spec
resources:
  limits:
    memory: 4Gi
  requests:
    memory: 2Gi
```

```yaml
# Corresponding memory_limiter config
processors:
  memory_limiter:
    check_interval: 1s
    limit_percentage: 80   # 80% of 4Gi = ~3.2Gi
    spike_limit_percentage: 25
```
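If you take the `GOMEMLIMIT` route instead, it can be set as an environment variable on the collector container. A sketch, assuming the 4Gi pod limit above (the value is simply 80% of 4096 MiB):

```yaml
# Collector container env in the same pod spec (sketch)
env:
  - name: GOMEMLIMIT
    value: "3276MiB"   # ~80% of the 4Gi pod memory limit
```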
batch processor
The batch processor buffers spans, logs, and metrics before sending them to the exporter. Batching reduces the number of outbound connections and improves compression ratios.
```yaml
processors:
  batch:
    timeout: 5s
    send_batch_size: 1000
    send_batch_max_size: 2000
```

| Field | Value | Meaning |
|---|---|---|
| `timeout` | 5s | Send the current batch after this interval even if `send_batch_size` is not reached |
| `send_batch_size` | 1000 | Target number of items per batch |
| `send_batch_max_size` | 2000 | Maximum batch size; 0 means no limit |
Putting it all together
A complete pipeline configuration using all four processors. Order matters: memory_limiter first, then filtering, then transforms, then batching.
```yaml
processors:
  memory_limiter:
    check_interval: 1s
    limit_percentage: 80
    spike_limit_percentage: 25

  filter/drop_internal_spans:
    error_mode: ignore
    traces:
      span:
        - 'kind == SPAN_KIND_INTERNAL and status.code != STATUS_CODE_ERROR'

  filter/drop_debug_logs:
    error_mode: ignore
    logs:
      log_record:
        - 'severity_number < SEVERITY_NUMBER_INFO'

  transform/normalize_span_names:
    error_mode: ignore
    trace_statements:
      - context: span
        statements:
          - replace_pattern(name, "\\?.*$", "")
          - replace_pattern(name, "/[0-9]+", "/{id}")

  batch:
    timeout: 5s
    send_batch_size: 1000
    send_batch_max_size: 2000

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors:
        - memory_limiter
        - filter/drop_internal_spans
        - transform/normalize_span_names
        - batch
      exporters: [otlp/last9]

    logs:
      receivers: [otlp]
      processors:
        - memory_limiter
        - filter/drop_debug_logs
        - batch
      exporters: [otlp/last9]

    metrics:
      receivers: [otlp, prometheus]
      processors:
        - memory_limiter
        - batch
      exporters: [otlp/last9]
```
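Before deploying, the collector binary can validate the assembled configuration without starting any pipelines. The exact binary name depends on your distribution (e.g. `otelcol` or `otelcol-contrib`), and the config path here is illustrative:

```bash
otelcol validate --config=/etc/otelcol/config.yaml
```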
Troubleshooting
- **Spans still appearing after filter**: check that the processor name exactly matches what's listed in `service.pipelines`. A processor defined but not listed under `processors:` in the pipeline is silently ignored.
- **`error_mode: ignore` vs `error_mode: propagate`**: `ignore` skips items that cause evaluation errors (e.g. missing attributes) and continues processing. Use `propagate` during development to surface OTTL syntax errors, then switch to `ignore` in production (see the first sketch after this list).
- **High CPU from the transform processor**: OTTL expressions with regex (`IsMatch`, `replace_pattern`) are evaluated per span. For very high-throughput services, move static attribute assignments to the `resource` or `resourcedetection` processors, which run once per batch rather than per item (see the second sketch after this list).
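A minimal sketch of flipping `error_mode` during development, using the same `transform/normalize_span_names` processor from above with only the `error_mode` field changed:

```yaml
processors:
  transform/normalize_span_names:
    error_mode: propagate   # surface OTTL errors instead of skipping items; revert to ignore in production
    trace_statements:
      - context: span
        statements:
          - replace_pattern(name, "/[0-9]+", "/{id}")
```

And a sketch of moving a static attribute assignment out of a per-span `transform` statement and into the `resource` processor (the attribute key and value here are illustrative):

```yaml
processors:
  resource/static_attrs:
    attributes:
      - key: deployment_environment
        value: production
        action: upsert   # applied at the resource level, not per span
```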
Please get in touch with us on Discord or Email if you have any questions.